From perlfect-search@perlfect.com Tue Jul 1 10:18:24 2003
From: perlfect-search@perlfect.com (Peter Vereshagin)
Date: Tue, 1 Jul 2003 15:18:24 +0500 (SAMST)
Subject: [Perlfect-search] Russian patch
Message-ID: <20030701101824.AF3681A78@least.beast>

Hello,

I have adapted Perlfect for Russian (or at least it seems so). The changes affect several regexps used in the security checks, and the URL encoding too. I did it on version 3.30 since I saw no "fuzzy" patch for the latest 3.31.

I am ready to send my changed version to anybody, but I won't publish it on the web since I am not quite sure of its reliability and security. I tested it on ~1 GB of documents (doc, xls, pdf, html, ppt, etc.) and was satisfied.

Thanks folks.

PS. I did not need the LWP "crawler" part, so the patch may be lacking there.

From perlfect-search@perlfect.com Tue Jul 8 21:26:16 2003
From: perlfect-search@perlfect.com (Rusty Wilson)
Date: Tue, 8 Jul 2003 14:26:16 -0700 (PDT)
Subject: [Perlfect-search] setting $STEMCHARS for most accurate search results
Message-ID: <20030708212616.58652.qmail@web11808.mail.yahoo.com>

I have read the faq and the comments within conf.pl, but I'm still a little confused by what $STEMCHARS is/does.

Would someone be willing to explain in a little more detail exactly what the $STEMCHARS variable does, and what I should set that variable to if search accuracy is my primary concern?

Thank you!
Rusty

From perlfect-search@perlfect.com Wed Jul 9 00:06:36 2003
From: perlfect-search@perlfect.com (Daniel Naber)
Date: Wed, 9 Jul 2003 02:06:36 +0200
Subject: [Perlfect-search] setting $STEMCHARS for most accurate search results
In-Reply-To: <20030708212616.58652.qmail@web11808.mail.yahoo.com>
References: <20030708212616.58652.qmail@web11808.mail.yahoo.com>
Message-ID: <200307090206.36963@danielnaber.de>

On Tuesday 08 July 2003 23:26, Rusty Wilson wrote:

> I have read the faq and the comments within conf.pl, but I'm still a
> little confused by what $STEMCHARS is/does.

Each word is cut off after $STEMCHARS characters.
So if $STEMCHARS = 3, "house" will be indexed and searched as "hou" (that doesn't make much sense, so 3 is much too low). $STEMCHARS = 0 is the special case that doesn't cut off words at all, so leave it at 0 for best accuracy.

Regards
 Daniel

-- 
http://www.danielnaber.de

From perlfect-search@perlfect.com Mon Jul 28 13:49:37 2003
From: perlfect-search@perlfect.com (Marc Borbely)
Date: Mon, 28 Jul 2003 09:49:37 -0400
Subject: [Perlfect-search] Can Search Results Be Sorted by File Name?
Message-ID: <001001c3550f$17995120$d4a1fea9@pavilion>

Hi.

I use Perlfect Search to index my weekly newsletter (http://www.thecornerforum.org).

I'm wondering if there's any way to sort the results of searches by filename, instead of by the regular ranking system.

For my purposes, I'd rather display the results in chronological order (the filenames are ordered chronologically, e.g. http://www.thecornerforum.org/data/0010010.htm precedes http://www.thecornerforum.org/data/0033000.htm) than by # of times the terms appear in the files, or where in the file they appear.

Is this possible?

(I don't display the "dates" of the files on the search page because I sometimes have to reload the files -- and then the files' dates are no longer the original dates of the newsletter's publication.)

Thank you very much.

- Marc Borbely
Editor, The Corner Forum.

From perlfect-search@perlfect.com Mon Jul 28 14:25:16 2003
From: perlfect-search@perlfect.com (Daniel Naber)
Date: Mon, 28 Jul 2003 16:25:16 +0200
Subject: [Perlfect-search] Can Search Results Be Sorted by File Name?
In-Reply-To: <001001c3550f$17995120$d4a1fea9@pavilion>
References: <001001c3550f$17995120$d4a1fea9@pavilion>
Message-ID: <200307281625.16710@danielnaber.de>

On Monday 28 July 2003 15:49, Marc Borbely wrote:

> I'm wondering if there's any way to sort the results of searches by
> filename, instead of by the regular ranking system.

If you make a small change to the search.pl script, yes. Around line 447, you need to insert this line:

@keys = sort {uc($docs_db{$a}) cmp uc($docs_db{$b})} (keys %answer);

and remove these lines:

if( defined($query->param('sort')) && $query->param('sort') eq 'title' ) {
  @keys = sort {uc($titles_db{$a}) cmp uc($titles_db{$b})} (keys %answer);
} else {
  @keys = sort {$answer{$b} <=> $answer{$a}} (keys %answer);
}

Alternatively you could sort by title (your titles then need issue numbers, but that might be a good idea anyway). Just set a hidden field in your search form so that the "sort" parameter is "title".

Regards
 Daniel

-- 
http://www.danielnaber.de

From perlfect-search@perlfect.com Mon Jul 28 15:14:52 2003
From: perlfect-search@perlfect.com (Marc Borbely)
Date: Mon, 28 Jul 2003 11:14:52 -0400
Subject: [Perlfect-search] Can Search Results Be Sorted by File Name?
References: <001001c3550f$17995120$d4a1fea9@pavilion> <200307281625.16710@danielnaber.de>
Message-ID: <009201c3551b$00fd8b00$d4a1fea9@pavilion>

Thank you so much.

I first tried sorting by title (with the hidden field, as you suggested), but then realized that even though I have the dates in the titles, I have them spelled out, and therefore the April titles pop up first -- so that didn't help.

Next, I deleted and added the lines as you suggested (and switched $a and $b, since it's even better if I can get them in reverse chronological order), and it works perfectly.

Again, thanks for your help.

- Marc Borbely

----- Original Message -----
From: "Daniel Naber"
To: ; "Marc Borbely"
Sent: Monday, July 28, 2003 10:25 AM
Subject: Re: [Perlfect-search] Can Search Results Be Sorted by File Name?

> On Monday 28 July 2003 15:49, Marc Borbely wrote:
>
> > I'm wondering if there's any way to sort the results of searches by
> > filename, instead of by the regular ranking system.
>
> If you make a small change to the search.pl script, yes. Around line 447, you need
> to insert this line:
>
> @keys = sort {uc($docs_db{$a}) cmp uc($docs_db{$b})} (keys %answer);
>
> and remove these lines:
>
> if( defined($query->param('sort')) && $query->param('sort') eq 'title' ) {
>   @keys = sort {uc($titles_db{$a}) cmp uc($titles_db{$b})} (keys %answer);
> } else {
>   @keys = sort {$answer{$b} <=> $answer{$a}} (keys %answer);
> }
>
> Alternatively you could sort by title (your titles then need issue numbers, but
> that might be a good idea anyway). Just set a hidden field:
>
> Regards
> Daniel
>
> --
> http://www.danielnaber.de
> _______________________________________________
> perlfect-search mailing list
> perlfect-search@perlfect.com
> To unsubscribe, set other personal options or view the list archives please visit:
> http://perlfect.com/mailman/listinfo/perlfect-search
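
For anyone following this thread later, the filename sort Daniel describes, and the reversed variant Marc settled on, can be tried in isolation. This is only a minimal sketch and not the actual search.pl code: the contents of %docs_db and %answer below are made-up sample data, the document IDs are hypothetical, and the assumption that %docs_db maps a document ID to its file path is inferred from the snippet in the thread.

```perl
use strict;
use warnings;

# Hypothetical sample data: document ID => file path.
# (Assumption: in search.pl this hash is filled from the index.)
my %docs_db = (
    1 => 'data/0033000.htm',
    2 => 'data/0010010.htm',
    3 => 'data/0021500.htm',
);

# Hypothetical relevance scores for one query: document ID => score.
my %answer = (1 => 4.1, 2 => 2.5, 3 => 7.0);

# Regular ranking: highest score first.
my @by_score = sort { $answer{$b} <=> $answer{$a} } keys %answer;

# Daniel's change: case-insensitive ascending sort by filename.
my @by_name = sort { uc($docs_db{$a}) cmp uc($docs_db{$b}) } keys %answer;

# Marc's variant: $a and $b swapped, so the newest issue comes first.
my @by_name_desc = sort { uc($docs_db{$b}) cmp uc($docs_db{$a}) } keys %answer;

print "by score:       @by_score\n";
print "by name:        @by_name\n";
print "by name (desc): @by_name_desc\n";
```

Note that cmp compares strings, not numbers. That gives chronological order here only because the sample filenames are zero-padded to the same width, as in the URLs Marc quotes.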