From perlfect-search@perlfect.com Tue Jul 1 10:18:24 2003
From: perlfect-search@perlfect.com (Peter Vereshagin)
Date: Tue, 1 Jul 2003 15:18:24 +0500 (SAMST)
Subject: [Perlfect-search] Russian patch
Message-ID: <20030701101824.AF3681A78@least.beast>
Hello,
I have adapted Perlfect for Russian (or so it seems). The changes affect several regexps in the security checks, and the URL encoding as well.
I did it on version 3.30, since I saw no "fuzzy" patch for the latest 3.31.
I am happy to send my modified version to anybody, but I won't publish it on the web since I am not quite sure of its reliability and security.
I tested it on about 1 GB of documents (doc, xls, pdf, html, ppt, etc.) and was satisfied. Thanks, folks.
P.S. I did not need the LWP "crawler" part, so that part may be lacking.
From perlfect-search@perlfect.com Tue Jul 8 21:26:16 2003
From: perlfect-search@perlfect.com (Rusty Wilson)
Date: Tue, 8 Jul 2003 14:26:16 -0700 (PDT)
Subject: [Perlfect-search] setting $STEMCHARS for most accurate search results
Message-ID: <20030708212616.58652.qmail@web11808.mail.yahoo.com>
I have read the faq and the comments within conf.pl, but I'm still a little
confused by what $STEMCHARS is/does.
Would someone be willing to explain in a little more detail exactly what the
$STEMCHARS variable does, and what I should set that variable to if search
accuracy is my primary concern?
Thank you!
Rusty
From perlfect-search@perlfect.com Wed Jul 9 00:06:36 2003
From: perlfect-search@perlfect.com (Daniel Naber)
Date: Wed, 9 Jul 2003 02:06:36 +0200
Subject: [Perlfect-search] setting $STEMCHARS for most accurate search results
In-Reply-To: <20030708212616.58652.qmail@web11808.mail.yahoo.com>
References: <20030708212616.58652.qmail@web11808.mail.yahoo.com>
Message-ID: <200307090206.36963@danielnaber.de>
On Tuesday 08 July 2003 23:26, Rusty Wilson wrote:
> I have read the faq and the comments within conf.pl, but I'm still a
> little confused by what $STEMCHARS is/does.
Each word is cut off after $STEMCHARS characters. So if $STEMCHARS = 3,
"house" will be indexed and searched as "hou" (which doesn't make much
sense, so 3 is much too low).
$STEMCHARS = 0 is the special case that doesn't cut off words at all, so
leave it at 0 for best accuracy.
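As a rough illustration of that rule (a sketch only, not code taken from
Perlfect Search; the helper name stem_word is made up for this example):

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Perlfect Search:
# cut a word off after $stemchars characters; 0 means "do not cut".
sub stem_word {
    my ($word, $stemchars) = @_;
    return $word if $stemchars == 0;
    return substr($word, 0, $stemchars);
}

print stem_word("house", 3), "\n";   # hou
print stem_word("hour",  3), "\n";   # hou  (same index entry as "house")
print stem_word("house", 0), "\n";   # house
```

With a low value, unrelated words such as "house" and "hour" collapse to
the same entry, which is why 0 gives the most exact matches.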
Regards
Daniel
--
http://www.danielnaber.de
From perlfect-search@perlfect.com Mon Jul 28 13:49:37 2003
From: perlfect-search@perlfect.com (Marc Borbely)
Date: Mon, 28 Jul 2003 09:49:37 -0400
Subject: [Perlfect-search] Can Search Results Be Sorted by File Name?
Message-ID: <001001c3550f$17995120$d4a1fea9@pavilion>
Hi.
I use Perlfect Search to index my weekly newsletter
(http://www.thecornerforum.org).
I'm wondering if there's any way to sort the results of searches by
filename, instead of by the regular ranking system.
For my purposes, I'd rather display the results in chronological order
(the filenames are ordered chronologically, e.g.
http://www.thecornerforum.org/data/0010010.htm precedes
http://www.thecornerforum.org/data/0033000.htm) than by # of times the
terms appear in the files, or where in the file they appear.
Is this possible?
(I don't display the "dates" of the files on the search page because I
sometimes have to reload the files -- and then the files' dates are no
longer the original dates of the newsletter's publication.)
Thank you very much.
- Marc Borbely
Editor, The Corner Forum.
From perlfect-search@perlfect.com Mon Jul 28 14:25:16 2003
From: perlfect-search@perlfect.com (Daniel Naber)
Date: Mon, 28 Jul 2003 16:25:16 +0200
Subject: [Perlfect-search] Can Search Results Be Sorted by File Name?
In-Reply-To: <001001c3550f$17995120$d4a1fea9@pavilion>
References: <001001c3550f$17995120$d4a1fea9@pavilion>
Message-ID: <200307281625.16710@danielnaber.de>
On Monday 28 July 2003 15:49, Marc Borbely wrote:
> I'm wondering if there's any way to sort the results of searches by
> filename, instead of by the regular ranking system.
If you make a small change to the search.pl script, yes. Around line 447, you need
to insert this line:
@keys = sort {uc($docs_db{$a}) cmp uc($docs_db{$b})} (keys %answer);
and remove these lines:
if( defined($query->param('sort')) && $query->param('sort') eq 'title' ) {
    @keys = sort {uc($titles_db{$a}) cmp uc($titles_db{$b})} (keys %answer);
} else {
    @keys = sort {$answer{$b} <=> $answer{$a}} (keys %answer);
}
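To see the effect of that change outside search.pl, here is a small
standalone sketch. It assumes %docs_db maps a document id to its file
name and %answer maps a document id to its score, as the lines above
suggest; the ids, scores and the middle file name are made up.

```perl
use strict;
use warnings;

# Made-up sample data; in search.pl these hashes come from the index.
my %docs_db = (
    1 => 'data/0033000.htm',
    2 => 'data/0010010.htm',
    3 => 'data/0021000.htm',   # invented file name for illustration
);
my %answer = (1 => 0.9, 2 => 0.4, 3 => 0.7);   # document id => score

# Original behaviour: highest score first.
my @by_score = sort { $answer{$b} <=> $answer{$a} } keys %answer;

# Replacement line: order by file name, ignoring case.
my @by_name = sort { uc($docs_db{$a}) cmp uc($docs_db{$b}) } keys %answer;

print "by score: @by_score\n";   # by score: 1 3 2
print "by name:  @by_name\n";    # by name:  2 3 1
```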
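Alternatively you could sort by title (your titles would then need issue
numbers, but that might be a good idea anyway). Just add a hidden field
named "sort" with the value "title" to your search form; that is the
parameter the code above checks.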
Regards
Daniel
--
http://www.danielnaber.de
From perlfect-search@perlfect.com Mon Jul 28 15:14:52 2003
From: perlfect-search@perlfect.com (Marc Borbely)
Date: Mon, 28 Jul 2003 11:14:52 -0400
Subject: [Perlfect-search] Can Search Results Be Sorted by File Name?
References: <001001c3550f$17995120$d4a1fea9@pavilion> <200307281625.16710@danielnaber.de>
Message-ID: <009201c3551b$00fd8b00$d4a1fea9@pavilion>
Thank you so much.
I first tried sorting by title (with the hidden field, as you suggested), but then realized that even though I have
the dates in the titles, I have them spelled out, and therefore the April
titles pop up first -- so that didn't help.
Next, I deleted and added the lines as you suggested (and switched $a and
$b, since it's even better if I can get them in reverse chronological
order), and it works perfectly.
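
Both observations can be reproduced with a short standalone sketch. The
titles and the middle file name below are made up for illustration; only
the sort expressions mirror the ones discussed in this thread.

```perl
use strict;
use warnings;

# Made-up titles with spelled-out dates: a string sort is alphabetical,
# so April comes first regardless of the calendar order.
my @titles = ('March 3, 2003', 'April 7, 2003', 'January 6, 2003');
my @sorted_titles = sort { uc($a) cmp uc($b) } @titles;

# Zero-padded file names sort chronologically; swapping $a and $b in the
# comparison reverses the order, so the newest issue comes first.
my @files = ('0010010.htm', '0033000.htm', '0021000.htm');
my @newest_first = sort { uc($b) cmp uc($a) } @files;

print join(' | ', @sorted_titles), "\n";
# April 7, 2003 | January 6, 2003 | March 3, 2003
print join(' ', @newest_first), "\n";
# 0033000.htm 0021000.htm 0010010.htm
```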
Again, thanks for your help.
- Marc Borbely
_______________________________________________
perlfect-search mailing list
perlfect-search@perlfect.com
To unsubscribe, set other personal options or view the list archives
please visit:
http://perlfect.com/mailman/listinfo/perlfect-search