Perlfect Solutions
 

AW: [Perlfect-search] Maximum number of files that can be indexed

John jotov@start.no
Fri, 12 Apr 2002 18:20:28 +0200 (CEST)
Quoting Mayer Richard <Richard.Mayer@micronas.com>:

> hmm, sorry, but I think I don't really get this. I don't understand
> any PERL, so could you please point me a little bit more into the
> right direction.

I am sorry. I didn't want to get too much into detail. Besides, I'm not the author of the scripts, and no Perl expert, so I might be wrong.

Anyway, when searching, the point is that you want to get the list of documents that contain a certain term.

Term => List of documents

A convenient way to do this is to look up the list in a 'hash'. But you need to store the list of docs as one 'chunk' of data, so you use Perl's 'pack' function to achieve this. Using 'unpack', you get the list back. These functions need a 'template' to know how the list is packed and how much space to use for each item: is it a list of bytes, long values, floats, etc.? For now, 16 bits are used for each document, which gives you the maximum of 65536. By modifying the template, you can allocate more space to each item in the list and thus index a larger number of documents.
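
A minimal sketch of the idea, not taken from the Perlfect Search code. The hash name %index, the sample terms and document numbers, and the template letters are all assumptions for illustration: 'n' is Perl's template for an unsigned 16-bit value and 'N' for an unsigned 32-bit value, but the scripts may well use different ones.

```perl
use strict;
use warnings;

# Hypothetical index: term => packed list of document numbers.
my %index;

# 'n*' packs every value into 16 bits, i.e. 2 bytes per document.
my @docs16 = (3, 42, 65535);
$index{search} = pack('n*', @docs16);
my $len16  = length $index{search};        # 3 documents * 2 bytes
my @back16 = unpack('n*', $index{search}); # the same three numbers again

# A document number above 65535 does not fit into 16 bits,
# so it does not survive the round trip.
my @wrapped = unpack('n*', pack('n*', 70000));

# 'N*' packs every value into 32 bits, i.e. 4 bytes per document,
# which leaves room for far larger document numbers.
my @docs32 = (3, 42, 70000);
$index{perl} = pack('N*', @docs32);
my $len32  = length $index{perl};          # 3 documents * 4 bytes
my @back32 = unpack('N*', $index{perl});

print "16-bit: $len16 bytes, docs @back16\n";
print "32-bit: $len32 bytes, docs @back32\n";
```

The price of the wider template is that every stored list takes twice the space, and an index built with the old template would have to be rebuilt, because the same template must be used for both 'pack' and 'unpack'.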

For reference: http://www.perldoc.com/perl5.6.1/pod/func/pack.html

Give me some time, and I'll see if I can figure out which changes are needed. (But I can't guarantee anything :)

John



