Perlfect Solutions
 

[Perlfect-search] Problem with indexing remote server

Hertenstein, Robert rhertenstein@mail.drms.dla.mil
Fri, 26 Apr 2002 12:22:56 -0400
I have a few web servers that I am working with. One of them is on an
intranet while the other one is viewable by the world. I am trying to use
Perlfect to index the WWW server from my development server. Unfortunately,
every time I run the script, it fails with the following output:

Checking for old temp files...
Building string of special characters...
Loading 'no index' regular expressions:
        - /cern/cgi-bin/*
Loading stopwords...371 stopwords loaded.
Starting crawler...
Note: I will not visit more than $HTTP_MAX_PAGES=100 pages.
DEBUG: url = http://www.xxx.xxx.xxx/
DEBUG: Response = HTTP::Response=HASH(0x4078abfc)
Error: Couldn't get 'http://www.xxx.xxx.xxx/': response code 500
Crawler finished: indexed 0 files, 0 terms (0 different terms),
ignored 0 files because of conf/no_index.txt

I added some debug messages to confirm that the URL is being passed in
correctly, and it appears to be. I have checked the error and access logs
on the remote system, and they show no connection attempt on port 80. If I
remove the remote URL, the error goes away, but then only the local
filesystem is indexed. Any help with this would be greatly appreciated.
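
One way to narrow this down, sketched under the assumption that the crawler
fetches pages with LWP::UserAgent (the HTTP::Response object in the debug
output suggests it does): LWP reports client-side failures such as a refused
connection or a failed DNS lookup as a 500 response that it generates itself,
marked with a "Client-Warning: Internal response" header. Printing the full
status line and checking that header would show whether the 500 ever came
from the remote server. The helper name describe_response, the 15-second
timeout, and the call to env_proxy (useful only if the development server
must go through a proxy) are illustrative and not part of Perlfect.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper (not part of Perlfect): summarise an HTTP::Response
# and report whether the error was produced locally by LWP or was actually
# sent by the remote web server.
sub describe_response {
    my ($res) = @_;
    my $origin  = 'remote server';
    my $warning = $res->header('Client-Warning');
    if (defined $warning && $warning eq 'Internal response') {
        # LWP sets this header on responses it fabricates itself,
        # e.g. when the TCP connection or DNS lookup fails.
        $origin = 'local LWP client';
    }
    return sprintf('%s (generated by %s)', $res->status_line, $origin);
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my $ua = LWP::UserAgent->new(timeout => 15);  # timeout is an assumption
    $ua->env_proxy;  # assumption: honour http_proxy/no_proxy if they are set
    print describe_response($ua->get($ARGV[0])), "\n";
}
```

Run from the development server with the remote URL as its only argument.
A result marked "generated by local LWP client" would be consistent with the
remote logs showing no connection attempt on port 80.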


Thanks

Rob H