[Perlfect-search] PDF indexing quits during Perlfect Search

Daniel Naber
Thu, 13 Dec 2001 02:18:47 +0100
> We have been indexing PDFs with success up to this point, at which
> point we put out about 4 GB of PDFs. After about 135 very large
> PDFs the indexing quits, leaving only a content_tmp file and creating
> none of the other data files.

Big files are not handled efficiently because of the regular expressions
the indexer runs over their content.
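
To illustrate the general point, here is a minimal Python sketch. It is not
Perlfect Search's actual (Perl) code, and the function name count_terms and
the token pattern are made up for the example. It assumes the PDF text has
already been extracted to plain text with line breaks, and it runs the
regular expression on one line at a time, so the pattern never has to scan
a multi-megabyte string in one go.

```python
import io
import re
from collections import Counter

# Hypothetical token pattern: runs of ASCII letters and digits.
WORD_RE = re.compile(r"[A-Za-z0-9]+")

def count_terms(stream):
    """Count lowercased word tokens, reading the stream line by line."""
    counts = Counter()
    for line in stream:
        # The regex only ever sees a single line, never the whole file.
        counts.update(word.lower() for word in WORD_RE.findall(line))
    return counts

# Usage with an in-memory stand-in for an extracted text file.
sample = io.StringIO("Big files, big PDFs\nfiles")
terms = count_terms(sample)
```

With a real file you would pass an open file handle instead of the
io.StringIO object. Memory use then depends on the longest line and the
number of distinct terms, not on the total file size.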

> I read in the FAQ that there is a limit of
> 65,535 files, but there is a workaround.

That's a different (unrelated) limit.

> First, I was wondering if
> each PDF page is treated as a file. Second, has anyone had this problem
> and does anyone know a workaround?

There was a patch, but it was mainly for HTML files. If you are willing to 
integrate it, I will search for it and send it to you.