Perlfect Solutions
 

[Perlfect-search] Indexing with http

Cesareo, Craig perlfect-search@perlfect.com
Tue, 22 Oct 2002 13:57:37 -0500
One follow-up, though: except for the directory that contains my web pages
with the links to the dynamic pages, and the cgi-bin, I have disabled ALL
folders in my web structure using the no_index.txt file.

However, the indexer is somehow crawling out and indexing some of the
regular htm files in my site. It is indexing files that are in directories
disabled in the no_index.txt file.

Can I enter URLs (http://...) into the no_index.txt file so I can better
exclude files and sub-directories within my site? Right now I am using
absolute paths to the directories on the server.



-----Original Message-----
From: Daniel Naber [mailto:daniel.naber@t-online.de]
Sent: Tuesday, October 22, 2002 12:26 PM
To: Cesareo, Craig; perlfect-search@perlfect.com
Subject: Re: [Perlfect-search] Indexing with http


On Tuesday 22 October 2002 18:30, Cesareo, Craig wrote:

> This is a link to the html page I wanted the indexer to start with:
>
> http://tss.oceusa.com/kcenter/solutions/WebOption/Html/solutions/solution_urls.htm

First, conf/no_index.txt has a default entry for cgi-bin; you need to
remove that. Second, you need to set @HTTP_LIMIT_URLS to the top URL, e.g.
@HTTP_LIMIT_URLS = ("http://tss.oceusa.com/");
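
As a rough sketch, the two HTTP settings might sit together in conf/conf.pl
like this. The name $HTTP_START_URL is an assumption here, so check your own
conf.pl for the exact variable. The start URL is the page quoted above,
rejoined from its wrapped form.

```perl
# conf/conf.pl (fragment): a sketch only, verify names against your conf.pl

# Assumed setting name: the page the HTTP crawler starts from.
$HTTP_START_URL = 'http://tss.oceusa.com/kcenter/solutions/WebOption/Html/solutions/solution_urls.htm';

# As I understand it, the crawler only follows URLs that begin with one of
# these prefixes; the top URL of the site lets it reach cgi-bin as well.
@HTTP_LIMIT_URLS = ("http://tss.oceusa.com/");
```

With the cgi-bin line still present in conf/no_index.txt, the dynamic pages
would presumably be skipped even when this limit allows them.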

regards
 Daniel

-- 
http://www.danielnaber.de