Perlfect Solutions
 

[Perlfect-search] Re: Indexing Microsoft Documents

Davone Vang davone.vang@gen21.com
Mon, 29 Oct 2001 08:17:29 -0700

Cameron,

I downloaded the precompiled Windows version of antiword and applied the
patches that Yuriy suggested.  But when I try to index my files, I get the
following errors: the indexer cannot execute antiword.exe.



Using DB_File...
Checking for old temp files...
Building string of special characters...
Loading 'no index' regular expressions:
        - d:/inetpub/wwwroot/cgi-bin
        - d:/inetpub/wwwroot/PSS/images
Loading stopwords...Done.
Starting crawler...
d:/inetpub/wwwroot/PSS/
Cannot execute
'c:/antiword/antiword.exe
mpfile -':  at indexer.pl line 424.
Cannot execute
'c:/antiword/antiword.exe
mpfile -':  at indexer.pl line 424.
d:/inetpub/wwwroot/PSS/070600
d:/inetpub/wwwroot/PSS/BBF136
d:/inetpub/wwwroot/PSS/PSS_A2-C-202a
d:/inetpub/wwwroot/PSS/PSS_PPImedia
d:/inetpub/wwwroot/PSS/_vti_cnf
Cannot execute
'c:/antiword/antiword.exe
mpfile -':  at indexer.pl line 424.
Cannot execute
'c:/antiword/antiword.exe
mpfile -':  at indexer.pl line 424.
Crawler finished(15 files, 0 terms)

Calculating weight vectors:
0%  10%  20%  30%  40%  50%  60%  70%  80%  90%  100%
|----|----|----|----|----|----|----|----|----|----|
>
Copying hash values to database files...
        d:/inetpub/wwwroot/cgi-bin/perlfect/search/data/inv_index_tmp
        d:/inetpub/wwwroot/cgi-bin/perlfect/search/data/docs_tmp
        d:/inetpub/wwwroot/cgi-bin/perlfect/search/data/sizes_tmp
        d:/inetpub/wwwroot/cgi-bin/perlfect/search/data/desc_tmp
        d:/inetpub/wwwroot/cgi-bin/perlfect/search/data/titles_tmp
        d:/inetpub/wwwroot/cgi-bin/perlfect/search/data/terms_tmp
Renaming newly created db files...
         d:/inetpub/wwwroot/cgi-bin/perlfect/search/data/terms_tmp to
d:/inetpub/wwwroot/cgi-bin/perlfect/search/data/terms
         d:/inetpub/wwwroot/cgi-bin/perlfect/search/data/docs_tmp to
d:/inetpub/wwwroot/cgi-bin/perlfect/search/data/docs
         d:/inetpub/wwwroot/cgi-bin/perlfect/search/data/sizes_tmp to
d:/inetpub/wwwroot/cgi-bin/perlfect/search/data/sizes
         d:/inetpub/wwwroot/cgi-bin/perlfect/search/data/titles_tmp to
d:/inetpub/wwwroot/cgi-bin/perlfect/search/data/titles
         d:/inetpub/wwwroot/cgi-bin/perlfect/search/data/content_tmp to
d:/inetpub/wwwroot/cgi-bin/perlfect/search/data/content
         d:/inetpub/wwwroot/cgi-bin/perlfect/search/data/desc_tmp to
d:/inetpub/wwwroot/cgi-bin/perlfect/search/data/desc
         d:/inetpub/wwwroot/cgi-bin/perlfect/search/data/inv_index_tmp to
d:/inetpub/wwwroot/cgi-bin/perlfect/search/data/inv_index
Indexer finished.



Any help you or anyone else can give me would be greatly appreciated.
Thanks a lot.

Davone
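
The "Cannot execute" lines in the log show the command string with a line
break between "antiword.exe" and "mpfile -".  One possible reading, and it
is only an assumption, is that an escape sequence inside a double-quoted
string was interpreted as a control character.  The following is a minimal
Python sketch (not taken from indexer.pl; check_converter is a hypothetical
helper name) of the two checks that seem worth making before running a
converter: that no part of the command contains a control character, and
that the program itself can be found and executed.

```python
import shutil


def check_converter(command):
    """Return a list of problems found in a command given as [program, arg, ...]."""
    problems = []
    # A newline or tab inside a command part often means an escape
    # sequence such as "\t" or "\n" was interpolated by accident.
    for part in command:
        if any(ord(ch) < 32 for ch in part):
            problems.append("control character in %r" % part)
    # shutil.which returns None when the program is missing or not executable.
    if shutil.which(command[0]) is None:
        problems.append("program not found or not executable: %r" % command[0])
    return problems


if __name__ == "__main__":
    # Paths below are illustrative only; substitute your own.
    for problem in check_converter(["c:/antiword/antiword.exe", "c:/temp/tmpfile"]):
        print(problem)
```

Keeping the command as a list of separate parts, rather than one long
string, also avoids quoting problems with paths that contain spaces.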




-----Original Message-----
From: Cameron Moore [mailto:lists@toad.bitstreet.net]
Sent: Friday, October 26, 2001 11:26 AM
To: perlfect-search@perlfect.com
Subject: Re: [Perlfect-search] Re: Indexing Microsoft Documents


* yuriy@mint-tech.com [2001.10.26 10:50]:
>    I'm using UNIX, Sun Solaris.  I don't know how to do it under Windows,
>    but I suggest you start looking for "to text" converters.  Good luck.
>    
>    Yuriy
>    
>    -----Original Message-----
>    From: perlfect-search-admin@perlfect.com
>    [mailto:perlfect-search-admin@perlfect.com]On Behalf Of Davone Vang
>    Sent: Friday, October 26, 2001 11:28 AM
>    To: yuriy@mint-tech.com; perlfect-search@perlfect.com
>    Subject: [Perlfect-search] Re: Indexing Microsoft Documents
>    
>      Yuriy,
>      It's been a while since I got back to my question about indexing
>      Microsoft documents and searching them using Perlfect.  I remember
>      you responding to my question that it worked for you by installing
>      "catdoc" and "xls2csv" then adding some code to the config.pl,
>      indexer_filesystem.pl, and indexer_web.pl files.  One more question
>      I have for you is did you do this on a Windows system or a UNIX
>      system?  Because I'm using a Windows system.  I hope you still
>      remember what I'm talking about.  If you don't, our emails are in
>      the October Perlfect mailing archive.  Hope to hear from you.
>      Thanks for your help.
>      
>      Davone

Davone,
Forget catdoc.  Get antiword.  It runs on many different platforms.  Use
the same patches that Yuriy listed before, and it will work fine.
Antiword is at http://www.winfield.demon.nl/index.html.  Enjoy.
-- 
Cameron Moore
[ Whatever happened to preparations A through G? ]
_______________________________________________
perlfect-search mailing list
perlfect-search@perlfect.com
To unsubscribe, set other personal options or view the list archives please
visit: http://perlfect.com/mailman/listinfo/perlfect-search


