From perlfect-search@perlfect.com Thu Aug 7 19:59:27 2003
From: perlfect-search@perlfect.com (gape)
Date: Thu, 7 Aug 2003 21:59:27 +0200
Subject: [Perlfect-search] too long indexing process
Message-ID: <002c01c35d1e$6bba4f40$5a0064c0@gapelt>

i have a little problem

i have perlfect 3.30 installed for a year now ... at first i was using it
for indexing thru disk (document_root & stuff)
then i changed the conf to index thru web (base_url & stuff), because of a
forum ...
i was indexing thru web
as i remember it took perlfect about 30 mins to index the site
of course the site has grown ... it got a php script geeklog where i run a
'news paper'.
it has about 1000 articles, some with comments.
forum has 50000 posts

when the index got thru (1 year ago) i gave the directive to
TheMasterOfTheWebMachine, to put perlfect in cron.
he didn't ...
so ... now i wanted to finally, really put together a search engine, that
will tell a searcher the answer to the question.

so i gave another directive to TheMasterOfTheWebMachine to index my site
thru ssh and to look for processor times and so on ...

so he did

processor 100% all the time, the sites on machine were working, but ...
100% ???
sites were slower of course ... but that is all ... the machine is not
experimental, it runs a lot of commercial sites.
after a few hours TheMasterOfTheWebMachine called me, to tell me that he
will stop the script, because it ate 250+ of ram and all of the
processor ...

i have
$HTTP_MAX_PAGES = 10000;

the first q is
if i have $HTTP_START_URL enabled (uncommented), should i comment out
$DOCUMENT_ROOT - now it wasn't.

the second ... is it normal for indexer to run so long?

i mean ... site has well over 10.000 pages (html, php, pl ...) ... the
script should stop when it reached that limit ???

or what ...

tnx ...

With Love and Light
gape
www.gape.org

From perlfect-search@perlfect.com Thu Aug 7 22:36:29 2003
From: perlfect-search@perlfect.com (Daniel Naber)
Date: Fri, 8 Aug 2003 00:36:29 +0200
Subject: [Perlfect-search] too long indexing process
In-Reply-To: <002c01c35d1e$6bba4f40$5a0064c0@gapelt>
References: <002c01c35d1e$6bba4f40$5a0064c0@gapelt>
Message-ID: <200308080036.29784@danielnaber.de>

On Thursday 07 August 2003 21:59, gape wrote:

> if i have $HTTP_START_URL enabled (uncommented), should i comment out
> $DOCUMENT_ROOT - now it wasn't.

No, that shouldn't matter.

> the second ... is it normal for indexer to run so long?

Yes, with many pages it can take very long.

> i mean ... site has well over 10.000 pages (html, php, pl ...) ... the
> script should stop when it reached that limit ???

It will stop after having fetched $HTTP_MAX_PAGES pages from the web.

Regards
 Daniel

-- 
http://www.danielnaber.de

From perlfect-search@perlfect.com Fri Aug 8 10:42:31 2003
From: perlfect-search@perlfect.com (gape)
Date: Fri, 8 Aug 2003 12:42:31 +0200
Subject: [Perlfect-search] too long indexing process
References: <002c01c35d1e$6bba4f40$5a0064c0@gapelt>
 <200308080036.29784@danielnaber.de>
Message-ID: <003a01c35d99$c64dc1f0$5a0064c0@gapelt>

> > the second ... is it normal for indexer to run so long?
>
> Yes, with many pages it can take very long.
>

ok ... how long does it take to index 10.000 pages then ... approximately ?

tnx

gape

From perlfect-search@perlfect.com Fri Aug 8 10:56:00 2003
From: perlfect-search@perlfect.com (Daniel Naber)
Date: Fri, 8 Aug 2003 12:56:00 +0200
Subject: [Perlfect-search] too long indexing process
In-Reply-To: <003a01c35d99$c64dc1f0$5a0064c0@gapelt>
References: <002c01c35d1e$6bba4f40$5a0064c0@gapelt>
 <200308080036.29784@danielnaber.de>
 <003a01c35d99$c64dc1f0$5a0064c0@gapelt>
Message-ID: <200308081256.02415@danielnaber.de>

On Friday 08 August 2003 12:42, gape wrote:

> ok ... how long does it take to index 10.000 pages then ... approximately
> ?

This obviously depends on the files' size, server speed, server load,
server memory and your settings -> somewhere between 5 minutes and a few
hours.

-- 
http://www.danielnaber.de

From perlfect-search@perlfect.com Wed Aug 13 01:03:22 2003
From: perlfect-search@perlfect.com (Lucas Young)
Date: Wed, 13 Aug 2003 13:03:22 +1200
Subject: [Perlfect-search] Perlfect Search with pdftohtml?
Message-ID:

Hi

My ISP won't allow the installation of pdftotext but will allow me to
install pdftohtml, which is essentially the same except it outputs a
well-formed html version of a pdf document. Is there a way to reconfigure
your script to treat the output from pdftohtml (which with the -stdout
switch outputs to stdout) as html instead of text and parse it
appropriately?

many thanks

Lucas Young

From perlfect-search@perlfect.com Wed Aug 13 07:47:30 2003
From: perlfect-search@perlfect.com (Daniel Naber)
Date: Wed, 13 Aug 2003 09:47:30 +0200
Subject: [Perlfect-search] Perlfect Search with pdftohtml?
In-Reply-To:
References:
Message-ID: <200308130947.32038@danielnaber.de>

On Wednesday 13 August 2003 03:03, Lucas Young wrote:

> Is there a way to reconfigure your script to
> treat the output from pdftohtml (which with the -stdout switch outputs
> to stdout) as html instead of text and parse it appropriately?

It should be possible to use it directly, without modifications. Just set
pdftohtml instead of pdftotext in conf.pl.

Regards
 Daniel

-- 
http://www.danielnaber.de

From perlfect-search@perlfect.com Tue Aug 19 09:58:42 2003
From: perlfect-search@perlfect.com (Rochus Wolff)
Date: Tue, 19 Aug 2003 11:58:42 +0200
Subject: [Perlfect-search] inv_index: No such file or directory
Message-ID: <3F41F4D2.7010403@zedat.fu-berlin.de>

Dear all,

I have installed Perlfect Search by the book (or, rather, readme);
indexer.pl runs fine and I can call search.cgi (has to be cgi on my server)
via telnet on my server. Works well, gives results.

When I try to call it via http, however, I get an error message:

Software error:
Cannot open /home/q/querelle/public_html/suche/data/inv_index: No such file or directory at /server/www-adm/doc/www.querelles-net.de/suche/search.cgi line 76.

I know this kind of message has shown up for some people already, but
their solutions don't seem to work in my case. It seems weird that there is
no problem calling it on the server while it cannot be called from the
browser. Other cgi-scripts in the same directory work fine, though.

All the files in the data/ directory have 755 reading/writing permissions,
so it can't be that. What else could it be?

Thanks for any help.

Yours,
Rochus

From perlfect-search@perlfect.com Tue Aug 19 12:05:38 2003
From: perlfect-search@perlfect.com (Daniel Naber)
Date: Tue, 19 Aug 2003 14:05:38 +0200
Subject: [Perlfect-search] inv_index: No such file or directory
In-Reply-To: <3F41F4D2.7010403@zedat.fu-berlin.de>
References: <3F41F4D2.7010403@zedat.fu-berlin.de>
Message-ID: <200308191405.39864@danielnaber.de>

On Tuesday 19 August 2003 11:58, Rochus Wolff wrote:

> Cannot open /home/q/querelle/public_html/suche/data/inv_index:
> No such file or directory at
> /server/www-adm/doc/www.querelles-net.de/suche/search.cgi line 76.

You should probably use this "/server/..." path for the $INSTALL_DIR
setting, not "/home/...". If that doesn't help, you should ask your web
hosting support. Sometimes servers are configured so that there are
different paths for shell and CGI.

Regards
 Daniel

-- 
http://www.danielnaber.de

From perlfect-search@perlfect.com Tue Aug 19 20:19:53 2003
From: perlfect-search@perlfect.com (Barker, Robert R.)
Date: Tue, 19 Aug 2003 15:19:53 -0500
Subject: [Perlfect-search] Indexing question
Message-ID: <78E79323135DAA4ABD7FBAC1F493A718A5E0B4@fw1-ex01.c-b.net>

I've just installed perlfect on my home machine for testing prior to
installing it at work. The install went fine, as did the indexing (done
locally - not via http), but I must have set it up wrong or not installed a
module or something. When I search, I've found that I've managed to index
all the html and asp code on the pages instead of just the content. I've
looked at the perlfect site and some of the other example sites and they
don't seem to have that problem, so I'm sure I've set something up wrong.

I tried searching the list archives but didn't see anything that fit
either.

BTW, the home box is a Win2kServer, IIS5, Active Perl, latest Perlfect
build, DB_File-1.806 & the HTML Parser module. I couldn't get the tagset
module installed as I kept getting an error.

Thanks for your help in advance.

Bob
barkerjr@c-b.com

From perlfect-search@perlfect.com Tue Aug 19 20:58:20 2003
From: perlfect-search@perlfect.com (Daniel Naber)
Date: Tue, 19 Aug 2003 22:58:20 +0200
Subject: [Perlfect-search] Indexing question
In-Reply-To: <78E79323135DAA4ABD7FBAC1F493A718A5E0B4@fw1-ex01.c-b.net>
References: <78E79323135DAA4ABD7FBAC1F493A718A5E0B4@fw1-ex01.c-b.net>
Message-ID: <200308192258.20506@danielnaber.de>

On Tuesday 19 August 2003 22:19, Barker, Robert R. wrote:

> When I search, I've found that I've managed to index all the html and
> asp code on the pages instead of just the content.

When you make use of ASP, you have to index via http.

Regards
 Daniel

-- 
http://www.danielnaber.de

From perlfect-search@perlfect.com Fri Aug 22 18:40:21 2003
From: perlfect-search@perlfect.com (Steve Lawrence)
Date: Fri, 22 Aug 2003 12:40:21 -0600
Subject: [Perlfect-search] Perlfect and PDF files
Message-ID: <001f01c368dc$d7955dd0$6401a8c0@DAD>

I can index pdf files, but when I get search results, the abstracts are
whacky. I can run pdftotext manually and dump the results to a text file,
and that's good, but when using the search/results, it's bad. Here is an
example from the results page: 1 and 3 are screwed, but number two, which
is a plain pdf with only text, is good. Help Please!

1. 5465.PDF
%PDF-1.3 %âãÏÓ 115 0 obj endobj xref 115 15 0000000016 00000 n 0000000651 00000 n 0000001503 00000 n 0000001661 00000 n 0000001861 00000 n 0000002038...
URL: http://ca.nexiaweb.com/pdf/5465.PDF
Score: 100% Date: 2003-08-22 Size: 606 kB

2. adobe.pdf
This is an Adobe PDF document. If you search for the word Adobe, it should come up in the search results.
URL: http://ca.nexiaweb.com/pdf/adobe.pdf
Score: 12% Date: 2003-08-22 Size: 28 kB

3. 5444.PDF
%PDF-1.2 %âãÏÓ 10 0 obj endobj xref 10 17 0000000016 00000 n 0000000686 00000 n 0000001076 00000 n 0000001229 00000 n 0000001436 00000 n 0000001616...
URL: http://ca.nexiaweb.com/pdf/5444.PDF
Score: 12% Date: 2003-08-22 Size: 14 kB

From perlfect-search@perlfect.com Fri Aug 22 19:24:29 2003
From: perlfect-search@perlfect.com (Daniel Naber)
Date: Fri, 22 Aug 2003 21:24:29 +0200
Subject: [Perlfect-search] Perlfect and PDF files
In-Reply-To: <001f01c368dc$d7955dd0$6401a8c0@DAD>
References: <001f01c368dc$d7955dd0$6401a8c0@DAD>
Message-ID: <200308222124.30325@danielnaber.de>

On Friday 22 August 2003 20:40, Steve Lawrence wrote:

> URL: http://ca.nexiaweb.com/pdf/5465.PDF Score: 100% Date:
> 2003-08-22 Size: 606 kB

The problem is that the file ends in "PDF", not "pdf". Either rename the
file or add an extra line to %EXT_FILTER in conf.pl (just like the one for
"pdf", only with "PDF").

Regards
 Daniel

-- 
http://www.danielnaber.de

From perlfect-search@perlfect.com Fri Aug 22 20:32:42 2003
From: perlfect-search@perlfect.com (Steve Lawrence)
Date: Fri, 22 Aug 2003 14:32:42 -0600
Subject: [Perlfect-search] Perlfect and PDF files
In-Reply-To: <200308222124.30325@danielnaber.de>
Message-ID: <002301c368ec$89cd75a0$6401a8c0@DAD>

I have already done that. If you think about it, the file would not show
up in a search if it was not indexed. The problem is not that the file was
not indexed, but that the abstract is messed up. I included the output of
the search results, including an example of the 1 good abstract and the 2
bad abstracts.

-----Original Message-----
From: perlfect-search-admin@perlfect.com
[mailto:perlfect-search-admin@perlfect.com] On Behalf Of Daniel Naber
Sent: Friday, August 22, 2003 1:24 PM
To: perlfect-search@perlfect.com
Subject: Re: [Perlfect-search] Perlfect and PDF files

On Friday 22 August 2003 20:40, Steve Lawrence wrote:

> URL: http://ca.nexiaweb.com/pdf/5465.PDF Score: 100% Date:
> 2003-08-22 Size: 606 kB

The problem is that the file ends in "PDF", not "pdf". Either rename the
file or add an extra line to %EXT_FILTER in conf.pl (just like the one for
"pdf", only with "PDF").

Regards
 Daniel

-- 
http://www.danielnaber.de
_______________________________________________
perlfect-search mailing list
perlfect-search@perlfect.com
To unsubscribe, set other personal options or view the list archives please visit:
http://perlfect.com/mailman/listinfo/perlfect-search

From perlfect-search@perlfect.com Fri Aug 22 20:42:36 2003
From: perlfect-search@perlfect.com (Daniel Naber)
Date: Fri, 22 Aug 2003 22:42:36 +0200
Subject: [Perlfect-search] Perlfect and PDF files
In-Reply-To: <002301c368ec$89cd75a0$6401a8c0@DAD>
References: <002301c368ec$89cd75a0$6401a8c0@DAD>
Message-ID: <200308222242.37228@danielnaber.de>

On Friday 22 August 2003 22:32, Steve Lawrence wrote:

> I have already done that. If you think about it, the file would not show
> up in a search if it was not indexed.

As I said: add "PDF" to %EXT_FILTER (not only to @EXT) and re-index. That
fixes the problem.

Regards
 Daniel

-- 
http://www.danielnaber.de

From perlfect-search@perlfect.com Fri Aug 22 23:24:17 2003
From: perlfect-search@perlfect.com (Steve Lawrence)
Date: Fri, 22 Aug 2003 17:24:17 -0600
Subject: [Perlfect-search] Perlfect and PDF files
In-Reply-To: <200308222242.37228@danielnaber.de>
Message-ID: <002e01c36904$82238f20$6401a8c0@DAD>

Ok, I feel like an idiot. I caught the case sensitive thing on the @EXT
but missed it on %EXT_FILTER. Sorry for sounding like a dumbass.

Thanks for your help!
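
The case-sensitivity issue resolved in this thread can be sketched in a few lines of Perl. This is only an illustration, not the indexer's actual code: filter_for() is a hypothetical helper, and the filter command string is a placeholder rather than the value shipped in conf.pl. It shows that a hash such as %EXT_FILTER is looked up by exact key, so an entry for "pdf" does not cover a file ending in "PDF".

```perl
use strict;
use warnings;

# Hypothetical helper: return the filter registered for a file's
# extension, or nothing if the extension has no entry.
# Perl hash keys are case-sensitive, so "PDF" does not match "pdf".
sub filter_for {
    my ($filters, $filename) = @_;
    my ($ext) = $filename =~ /\.([^.\/]+)$/;
    return unless defined $ext;
    return $filters->{$ext};
}

# Placeholder command string, NOT the real conf.pl default.
my %EXT_FILTER = ( 'pdf' => 'pdftotext-placeholder' );

for my $file ('adobe.pdf', '5465.PDF') {
    my $filter = filter_for(\%EXT_FILTER, $file);
    printf "%-10s -> %s\n", $file, defined $filter ? $filter : 'no filter';
}

# Adding an upper-case key, as suggested in the thread, closes the gap.
$EXT_FILTER{'PDF'} = $EXT_FILTER{'pdf'};
printf "%-10s -> %s\n", '5465.PDF', filter_for(\%EXT_FILTER, '5465.PDF');
```

A file with no matching filter entry is what one would expect to end up with raw "%PDF-1.3 ..." bytes in its abstract, which matches the search results posted above.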

From perlfect-search@perlfect.com Mon Aug 25 21:36:23 2003
From: perlfect-search@perlfect.com (Katie Hague)
Date: Mon, 25 Aug 2003 22:36:23 +0100
Subject: [Perlfect-search] Premature end of script headers
Message-ID: <012401c36b50$f2be0140$0f00a8c0@mshome.net>

I have installed perlfect search using SSH and successfully (I believe)
indexed the site. The readme file then suggests running the search.pl to
test it (should return with no matches). I just get '500 internal server
error'.

When I run search.pl in SSH I get the result No Matches, as I was
expecting.

My error log says:

[Mon Aug 25 22:32:57 2003] [error] (2)No such file or directory: exec of /home/4563/phones/www.lovemyphone.com/cgi-bin/perlfect/search/search.pl failed
[Mon Aug 25 22:32:57 2003] [error] [client 81.96.231.1] Premature end of script headers: /home/4563/phones/www.lovemyphone.com/cgi-bin/perlfect/search/search.pl

Can anyone help?

Many thanks!

Katie

From perlfect-search@perlfect.com Mon Aug 25 22:01:21 2003
From: perlfect-search@perlfect.com (Katie Hague)
Date: Mon, 25 Aug 2003 23:01:21 +0100
Subject: [Perlfect-search] Premature end of script headers
References: <012401c36b50$f2be0140$0f00a8c0@mshome.net>
Message-ID: <014001c36b54$6b5281a0$0f00a8c0@mshome.net>

By the way.. all of my permissions are set to 755 apart from indexer.pl
which is set to 700.

I have tried searching for various keywords via SSH and it WORKS really
well!

Still nothing when trying via the web though...

I would really appreciate any help!

From perlfect-search@perlfect.com Mon Aug 25 22:43:48 2003
From: perlfect-search@perlfect.com (Daniel Naber)
Date: Tue, 26 Aug 2003 00:43:48 +0200
Subject: [Perlfect-search] Premature end of script headers
In-Reply-To: <012401c36b50$f2be0140$0f00a8c0@mshome.net>
References: <012401c36b50$f2be0140$0f00a8c0@mshome.net>
Message-ID: <200308260043.49396@danielnaber.de>

On Monday 25 August 2003 23:36, Katie Hague wrote:

> [Mon Aug 25 22:32:57 2003] [error] (2)No such file or directory: exec of
> /home/4563/phones/www.lovemyphone.com/cgi-bin/perlfect/search/search.pl
> failed

Is that path correct, i.e. does that file exist? If so, maybe the path to
Perl in the very first line of the script is incorrect. You can find the
path to Perl with "which perl" via ssh.

Regards
 Daniel

-- 
http://www.danielnaber.de

From perlfect-search@perlfect.com Tue Aug 26 09:25:59 2003
From: perlfect-search@perlfect.com (Katie Hague)
Date: Tue, 26 Aug 2003 10:25:59 +0100
Subject: [Perlfect-search] Premature end of script headers
References: <012401c36b50$f2be0140$0f00a8c0@mshome.net>
 <200308260043.49396@danielnaber.de>
Message-ID: <017101c36bb4$18a8b220$0f00a8c0@mshome.net>

Daniel,

Thank you for your help - everything appears to be working now!

I typed "which perl" through SSH and found that the path to Perl was
different from what my webhost had told me! (No wonder things weren't
working!) I also found I had to remove the square brackets from around the
path to Perl in search.pl.

Thanks again,

Katie

From perlfect-search@perlfect.com Tue Aug 26 12:55:36 2003
From: perlfect-search@perlfect.com (Katie Hague)
Date: Tue, 26 Aug 2003 13:55:36 +0100
Subject: [Perlfect-search] Indexing by HTTP - lots of pages!
Message-ID: <01df01c36bd1$58334dc0$0f00a8c0@mshome.net>

I have literally 1000s of dynamically created PHP pages which I need to
index via PHP. Is there a max number it will index, e.g. before it times
out?

Or can I set the max number of pages to 10,000 and just let it run and run?!

Thanks,

Katie

From perlfect-search@perlfect.com Tue Aug 26 13:17:26 2003
From: perlfect-search@perlfect.com (Daniel Naber)
Date: Tue, 26 Aug 2003 15:17:26 +0200
Subject: [Perlfect-search] Indexing by HTTP - lots of pages!
In-Reply-To: <01df01c36bd1$58334dc0$0f00a8c0@mshome.net>
References: <01df01c36bd1$58334dc0$0f00a8c0@mshome.net>
Message-ID: <200308261517.27510@danielnaber.de>

On Tuesday 26 August 2003 14:55, Katie Hague wrote:

> I have literally 1000s of dynamically created PHP pages which I need to
> index via PHP. Is there a max number it will index, eg before it times
> out?

The only relevant number is $HTTP_MAX_PAGES, which gives the maximum
number of HTTP requests that Perlfect Search will make. This can be more
than the number of pages, so set it to a number that's higher than the
number of pages to index.

Regards
 Daniel

-- 
http://www.danielnaber.de

From perlfect-search@perlfect.com Tue Aug 26 13:28:29 2003
From: perlfect-search@perlfect.com (Katie Hague)
Date: Tue, 26 Aug 2003 14:28:29 +0100
Subject: [Perlfect-search] Indexing by HTTP - lots of pages!
References: <01df01c36bd1$58334dc0$0f00a8c0@mshome.net>
 <200308261517.27510@danielnaber.de>
Message-ID: <01f401c36bd5$efd49b80$0f00a8c0@mshome.net>

thanks!
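
To make the "requests, not pages" point from Daniel's explanation concrete, here is a rough Perl sketch of a crawl loop with a request budget. It is not Perlfect Search's crawler: the %site table, the crawl() function and the idea that a redirect costs one request without yielding a page are all assumptions made up for illustration, and nothing here touches the network.

```perl
use strict;
use warnings;

# Simulated site (made up): URL => { page => 0|1, links => [...] }.
# page => 0 stands for a request that yields no indexable page,
# for example a redirect.
my %site = (
    '/'      => { page => 1, links => ['/old', '/a.php'] },
    '/old'   => { page => 0, links => ['/new'] },
    '/new'   => { page => 1, links => [] },
    '/a.php' => { page => 1, links => ['/'] },
);

# Breadth-first crawl that stops once $max_requests fetches were made.
# Returns (requests made, pages indexed).
sub crawl {
    my ($start, $max_requests) = @_;
    my @queue = ($start);
    my %seen  = ($start => 1);
    my ($requests, $pages) = (0, 0);
    while (@queue && $requests < $max_requests) {
        my $url = shift @queue;
        $requests++;                       # every fetch uses up budget
        my $res = $site{$url} or next;
        $pages++ if $res->{page};
        for my $link (@{ $res->{links} }) {
            push @queue, $link unless $seen{$link}++;
        }
    }
    return ($requests, $pages);
}

for my $limit (3, 100) {
    my ($requests, $pages) = crawl('/', $limit);
    print "limit $limit: $requests requests, $pages pages indexed\n";
}
```

In this toy site there are three pages, but a limit of 3 indexes only two of them because one request is spent on the redirect. Under that assumption, a limit comfortably above the page count is the safer setting, as the reply suggests.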

From perlfect-search@perlfect.com Fri Aug 29 08:09:12 2003
From: perlfect-search@perlfect.com (Maninder, Singh)
Date: Fri, 29 Aug 2003 13:39:12 +0530
Subject: [Perlfect-search] RE: perlfect-search digest, Vol 1 #481 - 4 msgs
Message-ID:

Hi Daniel,

I have a question about the search engine -

Using the exclude hidden field, is it possible to exclude certain files
distributed over 2-3 directories?

For example, I would like to exclude a.html (which is in /a/), b.html
(which is in /b/) and c.html (which is in /c/).

From the description below it looks like all of them should be in one
common directory to get excluded.

I don't want to place them in no_index.txt because they need to get
indexed for another template, for which I don't want to create another
search module.

Could you please help me with this.

Thanks.

exclude: If you want to exclude the files in certain paths, use this
option. Example: /old_stuff/. This is evaluated after include, so you can
restrict the set of files with include and then further restrict it with
this option. You can also set a regular expression. Setting this to ""
will not exclude any files. Note: Do not use this to protect private
files, as anybody can change this option. To protect private/secret files,
use conf/no_index.txt instead and re-index your files.
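
Since the description quoted above says that exclude can also be a regular expression, one pattern with alternation could in principle cover files in several directories. The following Perl sketch only tests such a pattern against sample paths. It assumes the value is applied as a Perl regex to the URL path, which the description does not spell out, and both the pattern and is_excluded() are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical pattern: matches exactly /a/a.html, /b/b.html, /c/c.html.
my $exclude = qr{^/(?:a/a|b/b|c/c)\.html$};

# Hypothetical helper: true if the path matches the exclude pattern.
sub is_excluded {
    my ($path) = @_;
    return $path =~ $exclude ? 1 : 0;
}

for my $path ('/a/a.html', '/b/b.html', '/c/c.html', '/a/other.html') {
    printf "%-15s %s\n", $path, is_excluded($path) ? 'excluded' : 'kept';
}
```

Whether the real script anchors the pattern or matches it anywhere in the path would need to be checked against search.pl before relying on this.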

From perlfect-search@perlfect.com Fri Aug 29 18:16:40 2003
From: perlfect-search@perlfect.com (Daniel Naber)
Date: Fri, 29 Aug 2003 20:16:40 +0200
Subject: [Perlfect-search] RE: perlfect-search digest, Vol 1 #481 - 4 msgs
In-Reply-To:
References:
Message-ID: <200308292016.40119@danielnaber.de>

On Friday 29 August 2003 10:09, Maninder, Singh wrote:

>

I think this should work:

Regards
 Daniel

-- 
http://www.danielnaber.de