[Perlfect-search] RE: Http start url

david groeling pokerup2001 at yahoo.com
Wed Dec 15 13:56:03 GMT 2004
I can get it to scan, but it scans 1 file and indexes
0 files, 0 content, as stated in my previous message.
I have tried various options and changed all the
settings over and over again by trial and error.
I got a message from one of the other subscribers to
Perlfect Search suggesting that I set only the
"limit the URL" setting. OK, I've done that and left
the start URL blank. I can get it to scan like that,
but it does not really seem to work: it scans as if it
were on the local file system, not via the web.
  Let me try to give some more information while
explaining what I understand from the install and
setup instructions.

So I read the instructions, and they suggest that if
you use dynamic pages such as ASP or PHP files, you
will not get only the content of those pages; you will
also get the code behind them.
So the instructions say that if you want to index
active files like ASP or PHP, you must use the HTTP
start URL setting.
Now I can confirm that if I use the local file system
scan, I do indeed get the page code indexed.
That is not a good option at all, since others would
be looking at code rather than content. So it all
comes down to how to index the PHP and ASP pages
without getting the code indexed.
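
To make the difference between the two scan modes concrete, here is a minimal Python sketch. It is not Perlfect Search code, and everything in it is made up for illustration: the page source, the `FakeRenderer` handler, the `flowers.php` name, and the "Red roses" output. A real PHP or ASP server would execute the code; here a simple string replacement stands in for that step. The point is only that reading the file from disk returns the code, while fetching it over HTTP returns what the server produced.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical dynamic page: the part between <?php and ?> is server-side code.
PAGE_SOURCE = "<html><body><?php echo $flower; ?></body></html>"


def read_from_disk():
    """What a local file system scan sees: the raw source, code included."""
    return PAGE_SOURCE


class FakeRenderer(BaseHTTPRequestHandler):
    """Stand-in for a web server with PHP: replaces the code with its output."""

    def do_GET(self):
        rendered = PAGE_SOURCE.replace("<?php echo $flower; ?>", "Red roses")
        body = rendered.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def fetch_over_http():
    """What an HTTP scan sees: only the output the server sends back."""
    server = HTTPServer(("127.0.0.1", 0), FakeRenderer)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/flowers.php" % server.server_address[1]
        # An empty ProxyHandler makes sure the request goes straight to localhost.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(url) as response:
            return response.read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print("disk:", read_from_disk())
    print("http:", fetch_over_http())
```

An indexer working from the first string would store the PHP code as searchable text; one working from the second would store only the rendered content.
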
Like I said, I have read and reread the instructions at
least 100 times and tried every available option in
the settings to get active pages to index correctly.
After looking through the archives for quite some time
over the last 5 days, I see that others have similar
problems with the start URL.
I think, but am not sure, that a true example of a
working setup is needed, one where people can see
exactly how the settings are written. I know that there
are generic examples on one of the instruction pages,
but they are laid out in a way that is not like the
conf.pl file: they are separated with white space, and
the generic examples are not working examples.
  On to the second part of this. If PHP or ASP pages
are not truly indexed, meaning that only the HTML
content or the page code is pulled from them, then
I'm not sure the instructions go into that in enough
detail, as many PHP or ASP pages pull their content
from databases, XML, or other sources.
 Is it really possible to index a PHP or ASP page
correctly? A side note, which may relate to my possible
misinterpretation of how this is supposed to work:
say you index files that have images on them, and you
search for, say, "flowers" on the search page, knowing
that there are flowers in your indexed pages. The
search page will come up with several pages of results
to view. These pages with flowers in them also have
images either embedded in them or linked from them.
If you view the pages normally, you will see the
flower pictures. If you look at the pages through the
highlight text option for "flowers", no images show
up. So just as a PHP or ASP page pulls its data from
other sources, so do regular HTML pages that have
images or other types of content on them, but that
content does not show up. I have tried indexing image
files (jpg and gif) to see if that would correct the
problem, but it does not.
   So that is a fairly good description of what I'm
trying to do. Am I wrong in my interpretation of the
instructions that it will actually do these things correctly?


                