Perlfect Solutions
 

[Perlfect-search] PDF's: Content Identical

Charles Chilcote perlfect-search@perlfect.com
Thu, 22 Jan 2004 11:45:20 -0500
Hello,

I've just started using Perlfect Search and have found it very helpful.
Indexing and searching work great. However, when I try to include PDFs
in the indexing process, I don't get the expected results. The first PDF
file is fetched and processed, but the remaining PDFs are all ignored
(log lines included below). I also cannot find the first (supposedly
indexed) PDF in the search results. I most likely have some setting
wrong, but I can't find it. Any help would be appreciated.

Thanks,
Charles
http://www.nitterhouse.com


The first pdf file encountered:
Fetched  'http://www.nitterhouse.com/DrawingSpecs/DrawingSpecsSub/PDFs/Panel%20Connections/PC_AngleTieBackBearing02.pdf', 186964 bytes
    13: http://www.nitterhouse.com/DrawingSpecs/DrawingSpecsSub/PDFs/Panel%20Connections/PC_AngleTieBackBearing02.pdf (182.58 KB)

All remaining pdf files:
Ignoring 'http://www.nitterhouse.com/DrawingSpecs/DrawingSpecsSub/PDFs/Panel%20Connections/PC_AngleTieBackBearing01.pdf': content identical to 'http://www.nitterhouse.com/DrawingSpecs/DrawingSpecsSub/PDFs/Panel%20Connections/PC_AngleTieBackBearing02.pdf'
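
One way this pattern could arise, offered only as an assumption that the log above does not confirm: if the indexer decides "content identical" by comparing a digest of the text extracted from each file, and the PDF-to-text step yields no text for any of these PDFs, then every PDF after the first would hash to the same value and be skipped. The first one would also contribute no searchable words. The sketch below illustrates that idea in Python; the function names and the use of MD5 are hypothetical and are not taken from the Perlfect Search source.

```python
import hashlib


def content_key(extracted_text):
    """Digest of the text an indexer would see after PDF-to-text conversion."""
    return hashlib.md5(extracted_text.encode("utf-8")).hexdigest()


def find_ignored(docs):
    """Map each skipped URL to the earlier URL with the same content digest.

    `docs` maps URL -> extracted text, in crawl order.
    """
    first_seen = {}  # digest -> first URL that produced it
    ignored = {}     # skipped URL -> URL it was judged identical to
    for url, text in docs.items():
        key = content_key(text)
        if key in first_seen:
            ignored[url] = first_seen[key]
        else:
            first_seen[key] = url
    return ignored


# Hypothetical input: the conversion step returned an empty string for
# both files, so their digests collide even though the PDFs differ.
docs = {
    "PC_AngleTieBackBearing02.pdf": "",
    "PC_AngleTieBackBearing01.pdf": "",
}
ignored = find_ignored(docs)
for url, original in ignored.items():
    print("Ignoring '%s': content identical to '%s'" % (url, original))
```

If that assumption holds, running the configured PDF converter by hand on one of the files and checking whether it prints any text would be a quick way to confirm it.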
