Perlfect Solutions

[Perlfect-search] Maximum number of files that can be indexed

Daniel Naber
Mon, 15 Apr 2002 20:20:17 +0200
On Monday 15 April 2002 18:33, John wrote:

> Daniel, what kind of performance can one expect when indexing 75000
> documents and using floats in the inverse index? The index must grow
> really huge?!

I don't understand what you mean by "using floats". Do you mean the patch
that lifts the 65,000 files limit? That patch doesn't use floats, it uses
32(?)-bit integers instead of 16-bit integers.
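
To illustrate why the integer width matters, here is a minimal sketch. It is not the Perlfect Search code: the function name pack_doc_id and the assumption that document IDs are stored as fixed-width unsigned integers are made up for the example. An unsigned 16-bit field tops out at 65535, which is presumably where a limit of roughly 65,000 files would come from. A 32-bit field removes that limit at the cost of 2 extra bytes per stored ID.

```python
import struct

# Largest value that fits in an unsigned integer of each width.
MAX_16BIT = 2**16 - 1   # 65535
MAX_32BIT = 2**32 - 1   # 4294967295


def pack_doc_id(doc_id, wide=False):
    """Pack a (hypothetical) document ID as a big-endian unsigned integer.

    wide=False uses 16 bits (2 bytes), wide=True uses 32 bits (4 bytes).
    struct.pack raises struct.error if the value does not fit.
    """
    fmt = ">I" if wide else ">H"
    return struct.pack(fmt, doc_id)


if __name__ == "__main__":
    print(len(pack_doc_id(65535)))             # 2 bytes per ID
    print(len(pack_doc_id(75000, wide=True)))  # 4 bytes per ID
    try:
        pack_doc_id(75000)                     # does not fit in 16 bits
    except struct.error as err:
        print("16-bit overflow:", err)
```

Under that assumption only the ID fields double in size, so the index should grow by less than a factor of two from this change alone.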

I just did some tests with version 3.30:

file system indexing, $LOW_MEMORY_INDEX = 0:
Crawler finished: indexed 3014 files, 1555177 terms (54503 different terms)
Time: 3min 16sec

the same with $LOW_MEMORY_INDEX = 1:
Time: 7min 13sec

Index (=size of all files in the data directory): 7MB

My system: 900 MHz Athlon, 256 MB RAM, Perl 5.6
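
For a very rough idea of what 75000 documents might mean, the numbers above can be scaled up. This is only a back-of-the-envelope sketch: it assumes indexing time and index size grow linearly with the number of files, which I have not tested, and the function name extrapolate is made up for the example. Real numbers will depend on document size and on whether everything still fits in memory.

```python
def to_seconds(minutes, seconds):
    """Convert a 'Xmin Ysec' timing into seconds."""
    return minutes * 60 + seconds


# Measurements from the test run above (3014 files, version 3.30).
FILES = 3014
T_NORMAL = to_seconds(3, 16)   # $LOW_MEMORY_INDEX = 0
T_LOWMEM = to_seconds(7, 13)   # $LOW_MEMORY_INDEX = 1
INDEX_MB = 7

# How much slower the low-memory mode was in this run.
SLOWDOWN = T_LOWMEM / T_NORMAL


def extrapolate(target_files):
    """Naive linear estimate for a larger collection (an assumption)."""
    scale = target_files / FILES
    return {
        "minutes_normal": T_NORMAL * scale / 60,
        "minutes_lowmem": T_LOWMEM * scale / 60,
        "index_mb": INDEX_MB * scale,
    }


if __name__ == "__main__":
    print("files per second:", round(FILES / T_NORMAL, 1))
    print("low-memory slowdown:", round(SLOWDOWN, 2))
    for key, value in extrapolate(75000).items():
        print(key, round(value))
```

Treat the result as an order of magnitude only, not as a measurement.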