Perlfect Solutions
 

[Perlfect-search] indexing PDF files by page?

Giorgos Zervas perlfect-search@perlfect.com
17 Dec 2003 19:09:42 +0200
Well, if you have an index page with links to the smaller articles, these
will be indexed by the crawler regardless of their format (PDF or HTML),
as long as you have configured PDF indexing correctly.

Whether you use PDF or HTML is up to you, but generally speaking
HTML is more suitable for online viewing and PDF for printing.

Regards,
Giorgos

On Wed, 2003-12-17 at 19:06, Mike Scarborough wrote:
> Hello,
> 
> I'm sorry.  The larger problem I am trying to solve is this: I have several large pages (such as http://uther.dlib.vt.edu/~mscarbor/MxAmWar/RW/RW24i2-16.htm) that contain many smaller articles.  I would like the user to be able to search and jump to individual stories.  So I planned to convert them to PDF (which, surprisingly, has for some reason shed KBs), roughly one article per page, then have the individual pages indexed and provide an Acrobat open command to jump to that page.
> 
> Either that, or just break the pages down into hundreds of smaller pages, one article per page.  Could anyone recommend a method to do that?
> 
> --mike
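
One possible way to break a large page into one fragment per article is sketched below. It assumes each article starts at a heading tag (h2 here, which is only a guess about the markup of the pages in question) and it uses regular expressions rather than a real HTML parser, so treat it as a starting point. The function name split_articles and its parameters are made up for illustration.

```python
import re

def split_articles(page, heading_tag="h2"):
    """Split one large HTML page into (title, fragment) pairs.

    Assumption: every article begins with a <heading_tag> element.
    Anything before the first heading is dropped, and the last
    fragment keeps whatever trails it (e.g. </body></html>).
    """
    tag = re.escape(heading_tag)
    heading = re.compile(r"<%s\b[^>]*>(.*?)</%s\s*>" % (tag, tag),
                         re.IGNORECASE | re.DOTALL)
    matches = list(heading.finditer(page))
    articles = []
    for i, m in enumerate(matches):
        # An article runs from its heading to the start of the next one.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(page)
        # Strip inline markup from the heading to get a plain title.
        title = re.sub(r"<[^>]+>", "", m.group(1)).strip()
        articles.append((title, page[m.start():end]))
    return articles
```

Each fragment could then be written to its own small HTML file and linked from an index page, so the crawler picks the articles up individually.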
> 
> 
> 
> 
> Message: 2
> To: perlfect-search@perlfect.com
> Subject: Re: [Perlfect-search] indexing PDF files by page?
> Date: Tue, 16 Dec 2003 22:22:32 -0500
> From: Jerrad Pierce <belg4mit@MIT.EDU>
> Reply-To: perlfect-search@perlfect.com
> 
> What problem are you trying to solve here?
> The PDFs are likely to be larger than the HTML files...
> IIRC Perlfect has a mechanism for highlighting the keywords on the page.
> If not, this would not be too difficult to add (just create a CGI to
> filter the page). In the process you could also add an anchor to the first
> occurrence of the term and jump to that.
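
The filtering idea above might look roughly like the following sketch, which wraps the first occurrence of a search term in a named anchor so a result link can end in #hit. It is not Perlfect's own highlighting code; the function name anchor_first_hit and the anchor name "hit" are invented here, and the tag handling is deliberately naive (it does not treat script or style blocks, comments, or entities specially).

```python
import re

def anchor_first_hit(page, term, anchor="hit"):
    """Wrap the first occurrence of term in the page text with <a name=...>.

    Only text between tags is searched, so a term that appears inside
    a tag (an attribute value, say) is left alone.
    """
    if not term:
        return page
    # With one capturing group, re.split alternates text, tag, text, ...
    pieces = re.split(r"(<[^>]*>)", page)
    term_re = re.compile(re.escape(term), re.IGNORECASE)
    for i, piece in enumerate(pieces):
        if i % 2 == 1:  # odd indexes hold the tags themselves
            continue
        m = term_re.search(piece)
        if m:
            pieces[i] = (piece[:m.start()]
                         + '<a name="%s">%s</a>' % (anchor, m.group(0))
                         + piece[m.end():])
            break
    return "".join(pieces)
```

A CGI wrapper would read the page and the query term, pass them through this function, and print the result; the search result would then link to that CGI with #hit appended.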
> 
> 
> 
> 
> http://partners.adobe.com/asn/acrobat/sdk/public/docs/PDFOpenParams.pdf
> 
> documents the PDF equivalent of an anchor; it seems to work in Acrobat Reader 5.
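
As a small illustration of that anchor equivalent, the sketch below builds a link using the page=N open parameter described in the Adobe document. The helper name pdf_page_link and the example URL are hypothetical, and whether the jump works depends on the viewer or browser plug-in in use.

```python
def pdf_page_link(pdf_url, page):
    """Return pdf_url with a '#page=N' open parameter appended.

    Any fragment already on the URL is replaced. Page numbers start at 1.
    """
    if page < 1:
        raise ValueError("page numbers start at 1")
    base = pdf_url.split("#", 1)[0]
    return "%s#page=%d" % (base, page)

# Hypothetical example URL, for illustration only.
link = pdf_page_link("http://example.com/RW24i2-16.pdf", 3)
```

The page number for each article would still have to come from somewhere, for example a table kept alongside the PDF that maps article titles to pages.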
> 
> 