[Perlfect-search] Dynamic Page (PHP pages) Indexing Problems
Michael Borck perlfect-search@perlfect.com
Thu, 26 Feb 2004 12:14:33 +0800
Hi Daniel,
Thanks for the quick response. I have tried your suggestions but still
have problems with search.cgi opening the data files. I have provided
more detail below.
> that page does not only use https (of which I'm not sure if the module we
> use supports it), but also asks for a password here. So it can't be
> indexed automatically. About the permission error when indexing on the
I forgot to mention that machines within the domain are not prompted for
a password, but I guess the protocol is sufficiently different that it is
causing the error. I tried to install Crypt::SSLeay (which is
apparently the SSL glue that LWP loads automatically), but it didn't
work for me (and I have reached the limit of my knowledge). So I am back
to solving the file system "cannot open" problem.
> command line: 755 is okay, but who is the owner of the files? Probably not
> you, because the owner was set to someone else when you indexed via
> index_form.html. So I suggest to delete all files in the "data" directory
> on the command line and try again (from the command line).
Prior to indexing via the filesystem I deleted all the files in data. I
then ran the indexer and manually ran chmod 755 on all the files. I have
even tried chmod 777. Still no luck: I get the "Cannot open" error. I
should also mention that when I index via the file system it does index
the site (a huge 293 files).
The funny thing is that when I index via index_form.html, the file
permissions are 755 and the files are owned by "nobody". The search.cgi
will then proceed and find nothing, because it indexed 0 pages (because
we use https:).
As mentioned in previous posts, after indexing via the filesystem, I
manually chown the files to nobody, and chmod them all to 755 (even
tried 777) but still no luck.
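
One thing the steps above do not cover is the permissions on the
directories leading to the data files. On Unix, opening a file requires
search (execute) permission on every directory in its path, so a file
with mode 777 can still fail to open if a parent directory is closed to
the web server user. The following is a minimal Python sketch of that
check, not part of Perlfect Search. The function names are illustrative,
and it assumes the web server user ("nobody") is neither the owner nor in
the group of those directories, so only the world bits are inspected.
Whether this is the cause of the error here is not established.

```python
import os
import stat


def parent_dirs(path):
    """Return every directory above path, from the filesystem root down."""
    current = os.path.dirname(os.path.abspath(path))
    dirs = []
    while True:
        dirs.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return list(reversed(dirs))


def world_unsearchable(path):
    """List parent directories that lack the world search (execute) bit.

    Assumption: the process opening the file is neither owner nor group
    member of these directories, so only the "other" bit matters.
    """
    blocked = []
    for directory in parent_dirs(path):
        try:
            mode = os.stat(directory).st_mode
        except OSError:
            # If we cannot even stat it, treat it as blocked.
            blocked.append(directory)
            continue
        if not mode & stat.S_IXOTH:
            blocked.append(directory)
    return blocked
```

Calling world_unsearchable() with the full path to data/inv_index returns
the directories, if any, that would stop another user from reaching the
file regardless of the file's own mode.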
> BTW, depending on how you use PHP it might make sense to index the
> filesystem and "comment out" the PHP code with the snippets configured in
> $IGNORE_TEXT_START and $IGNORE_TEXT_END in conf.pl.
I would prefer to index via the file system. The dynamic PHP content is
not critical to search. I have "commented out" potentially sensitive
information and added entries to conf/no_index.txt. I was only looking
at doing this via cgi because of the problems I was having with opening
the data/inv_index file.
I have spoken to the techos, who have assured me that the path
exists on the server. I have provided them with the details of the script
and the error, and they have promised to investigate.
I can't help but feel that it is a configuration error on my part, as I
had Perlfect working out of the box and then moved most of my pages over
to PHP. I cannot see how changing to PHP would cause any errors. But to
be safe/cautious I deleted the Perlfect install and did a clean
install. Still no luck.
If you can think of anything that might help, or if you need more info,
please let me know. I will continue to investigate at my end.
Thanks again for your help.
Michael.
--