
[Perlfect-search] Indexing password-protected sections: A Solution

David Cross perlfect-search@perlfect.com
Sat, 30 Nov 2002 17:40:17 +0000 (GMT)
In reply to the thread I started at:

http://perlfect.com/pipermail/perlfect-search/2002-October/001503.html

I have found a way to get any password-protected section of a website
indexed when indexing via HTTP:

1. Open indexer_web.pl

2. Locate these two lines near the top of the file:

my $http_user_agent = LWP::UserAgent->new;
my $host = "";

3. Change them to this:

my $http_user_agent = LWP::UserAgent->new;
$http_user_agent->agent('UserAgent you want to mimic');
$http_user_agent->credentials(
    'www.your.domain:PORT',
    'Realm',
    'username' => 'password'
  );
my $host = "";

** Please remember to save the file as plain ASCII text with UNIX (LF) line
endings.

4. Notes and example:

my $http_user_agent = LWP::UserAgent->new;
$http_user_agent->agent('Perlfect Search/3.30');
$http_user_agent->credentials(
    'www.yourdomain.com:80',
    'Enter your access details',
    'open' => 'sesame'
  );
my $host = "";

* The $http_user_agent->agent string is any User-Agent you want to mimic.

* www.yourdomain.com:80 - the host name of the protected site followed by
its port number, which is usually 80 for HTTP. Include the port even when
it is the default.

* "Enter your access details" - this is the "Realm", i.e. the *exact* text
shown on the username/password dialog above the fields where it asks for
your username and password.

* The username/password pair is the one you normally enter to access the
site.
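
If you are not sure what to enter for the first two arguments of
credentials(), the following is a minimal sketch of how they can be
derived. It assumes the site uses HTTP Basic authentication, where the
server's 401 response carries a header such as
WWW-Authenticate: Basic realm="Enter your access details". The helper names
realm_from_challenge and netloc_from_url are hypothetical and are not part
of Perlfect Search or LWP; the URL and header value are the placeholder
values from the example above.

```perl
use strict;
use warnings;

# Hypothetical helper: extract the realm from a WWW-Authenticate header
# value such as 'Basic realm="Enter your access details"'.
# Returns undef (in scalar context) when no quoted realm is present.
sub realm_from_challenge {
    my ($header) = @_;
    return unless defined $header;
    return $1 if $header =~ /\brealm\s*=\s*"([^"]*)"/i;
    return;
}

# Hypothetical helper: build the 'host:port' string from an http:// or
# https:// URL, filling in the default port (80 or 443) when the URL
# does not name one. IPv6 literals are not handled in this sketch.
sub netloc_from_url {
    my ($url) = @_;
    my ($scheme, $host, $port) =
        $url =~ m{^(https?)://(?:[^/@]*@)?([^/:?#]+)(?::(\d+))?}i
        or return;
    $port = lc($scheme) eq 'https' ? 443 : 80 unless defined $port;
    return lc($host) . ":$port";
}

# Placeholder values taken from the example above.
my $challenge = 'Basic realm="Enter your access details"';
my $netloc    = netloc_from_url('http://www.yourdomain.com/members/');
my $realm     = realm_from_challenge($challenge);

print "netloc: $netloc\n";   # netloc: www.yourdomain.com:80
print "realm:  $realm\n";    # realm:  Enter your access details
```

The two printed values are what go into the first and second arguments of
$http_user_agent->credentials(...). To see the real challenge for your own
site, request a protected URL without credentials and look at the
WWW-Authenticate header of the 401 response, for example with
$response->header('WWW-Authenticate') on an LWP response object.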

5. Running the indexer now will index the protected section.

You can read more about the CPAN module LWP::UserAgent, and the other
settings it lets you change, at:

http://search.cpan.org/author/GAAS/libwww-perl/lib/LWP/UserAgent.pm

I hope you find this useful.