[Perlfect-search] Can Perlfect support Chinese/Big5 ??
Daniel Naber dnaber@mini.gt.owl.de
Thu, 14 Sep 2000 23:37:15 +0200
On Thu, 14 Sep 2000, you wrote:
> Yes, the Chinese/Big5 character set uses two bytes (16 bits) per character.
> Its encoding range is from A140 to F9FE and 8180 to FEA0, a total of 23,940
> words.
You can try removing this line from both indexer.pl and search.pl:
$buffer =~ tr/a-zA-Z0-9_/ /cs;
Even if that works, it's not a good solution. I have no idea how to test
pages with these characters: my browser doesn't even display them (although
it correctly displays pages with charset=gb2312).
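
To illustrate what that line does to two-byte text, here is a minimal sketch.
The sub names normalize_ascii and normalize_keep_high_bytes are hypothetical
and do not appear in indexer.pl or search.pl; the second one is only an
untested idea for keeping bytes 0x80-0xFF instead of deleting the line
outright, not something Perlfect is known to support.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the same transliteration as the line above.
sub normalize_ascii {
    my ($buffer) = @_;
    # /c complements the set, so every character that is NOT a-zA-Z0-9_
    # becomes a space; /s squeezes each run of such spaces into one.
    $buffer =~ tr/a-zA-Z0-9_/ /cs;
    return $buffer;
}

# Hypothetical variant (an assumption, not part of Perlfect): also keeps
# bytes 0x80-0xFF so that high-bit bytes of two-byte characters survive.
sub normalize_keep_high_bytes {
    my ($buffer) = @_;
    $buffer =~ tr/a-zA-Z0-9_\x80-\xFF/ /cs;
    return $buffer;
}

# Four bytes inside the Big5 range (two double-byte characters),
# followed by plain ASCII text.
my $sample = "\xA4\xA4\xA4\xE5 test-page";

# Print the results as hex so the raw bytes are visible on any terminal.
print unpack("H*", normalize_ascii($sample)), "\n";
print unpack("H*", normalize_keep_high_bytes($sample)), "\n";
```

With the original line, the four high bytes collapse into a single space, so
nothing of the Chinese text reaches the index. The variant keeps them, but it
is still not a real fix: the second byte of a Big5 character can fall in the
range 0x40-0x7E, which overlaps ASCII punctuation, so some characters would
still be split apart.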
Regards
Daniel