Perlfect Solutions
 

[Perlfect-search] Runtime Error - Windows 2000

Roee Rubin roee@irubin.com
Fri, 3 Aug 2001 11:21:56 -0700
This is a multi-part message in MIME format.

------=_NextPart_000_0000_01C11C0E.8137EB50
Content-Type: text/plain;
        charset="iso-8859-1"
Content-Transfer-Encoding: 8bit

I have attached the configuration file and also attempted to print out the
path from search.pl which turned out to be fine.

You can take a look at the error by searching at www.dittodisk.com

Thanks for your help.

Roee Rubin
Irubin Consulting
roee@irubin.com
Phone (310) 402-0120
http://www.irubin.com/


-----Original Message-----
From: Daniel Naber [mailto:daniel.naber@t-online.de]
Sent: Thursday, August 02, 2001 4:35 PM
To: perlfect-search@perlfect.com
Cc: Roee Rubin
Subject: Re: [Perlfect-search] Runtime Error - Windows 2000


On Thursday 02 August 2001 23:16, you wrote:

> Software error:
> Cannot open :  at C:\projects\proj1\cgi-bin\perlfect\search\search.pl
> line 65.

Somehow the configured value from conf.pl doesn't seem to arrive in
search.pl.

You did run indexer.pl, didn't you?

What are the contents of the data/ directory (including file sizes)?
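
One way to collect that listing is a short Perl script like the sketch below. It assumes the data directory is the one conf.pl builds as $INSTALL_DIR . 'data/'; the helper name list_dir_sizes is made up for illustration and is not part of Perlfect Search.

```perl
use strict;
use warnings;

# Hypothetical helper: return one "name<TAB>N bytes" line per plain file in $dir.
sub list_dir_sizes {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory '$dir': $!";
    my @lines;
    for my $name (sort readdir $dh) {
        my $path = "$dir/$name";
        next unless -f $path;            # skip ".", ".." and subdirectories
        my $size = (-s $path) || 0;      # -s returns false for empty files
        push @lines, "$name\t$size bytes";
    }
    closedir $dh;
    return @lines;
}

# Assumed default location; pass another directory as the first argument.
my $data_dir = @ARGV ? $ARGV[0] : 'C:/projects/ICS/cgi-bin/search/data';
if (-d $data_dir) {
    print "$_\n" for list_dir_sizes($data_dir);
} else {
    print "Directory '$data_dir' not found\n";
}
```

Missing or zero-byte database files in that listing would suggest that indexer.pl did not finish.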

You might want to add the print statement to search.pl, just before the
"tie" call.

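A minimal sketch of such a print statement follows. It assumes the tie on line 65 of search.pl opens one of the database paths set in conf.pl, such as $INV_INDEX_DB_FILE (the thread does not show which variable is actually used), and describe_path is a hypothetical helper, not part of Perlfect Search. Output goes to STDERR, which for a CGI script usually ends up in the web server's error log and not in the response.

```perl
use strict;
use warnings;

# Hypothetical helper: describe a configured path so an empty value is obvious.
sub describe_path {
    my ($label, $path) = @_;
    return "$label is empty or undefined"
        unless defined $path && length $path;
    return "$label = '$path' (" . (-e $path ? "exists" : "not found") . ")";
}

# In search.pl this line would go right before the tie call. The variable is
# deliberately left unset here to show what the failure case looks like.
our $INV_INDEX_DB_FILE;
print STDERR describe_path('$INV_INDEX_DB_FILE', $INV_INDEX_DB_FILE), "\n";
```

An "is empty or undefined" line would match the blank filename in "Cannot open :" above.
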
If that doesn't help I suggest you mail me your conf.pl.

Regards
 Daniel

--
Daniel Naber, Paul-Gerhardt-Str. 2, 33332 Guetersloh, Germany
Tel. 05241-59371, Mobil 0170-4819674

------=_NextPart_000_0000_01C11C0E.8137EB50
Content-Type: application/octet-stream;
        name="conf.pl"
Content-Disposition: attachment;
        filename="conf.pl"
Content-Transfer-Encoding: quoted-printable

# Perlfect Search configuration file
#$rcs = ' $Id: conf.pl,v 1.27 2001/03/03 23:39:03 daniel Exp $ ' ;

# NOTE: Whenever you change one of the options that's marked with [re-index]
# you need to run indexer.pl again to make the change take effect.

###########################################################################
### basic configuration
### You'll have to adapt these values if you didn't use setup.pl

# Where you want the indexer to start. [re-index]
$DOCUMENT_ROOT = 'C:/projects/ICS/httpdocs/';

# The base url of your site.
$BASE_URL = 'http://www.dittodisk.com/';

# The url in which Perlfect Search is located (usually somewhere in cgi-bin/).
$CGIBIN = $BASE_URL . "/cgi-bin/search/";

# The full-path of the directory where Perlfect Search is installed.
$INSTALL_DIR = 'C:/projects/ICS/cgi-bin/search/';

# Only files with these extensions should be indexed. [re-index]
@EXT = ("html", "htm", "shtml");

# If you do not have telnet/ssh access to the server that runs the script, you
# need to execute indexer.pl via CGI. Of course not everybody should be able
# to do that, so set a password with this option.
# ** WARNING ** : Only use this if absolutely necessary! Setting to "" disables
# execution as a CGI, which is much more secure. Note that other people on
# your server can probably read this file and look up your password.
$INDEXER_CGI_PASSWORD = "";


###########################################################################
### http configuration
### You only need this if you want to index your pages via http

# Where you want the indexer to start via http. Leave empty if
# you want to index the files in the filesystem ($DOCUMENT_ROOT).
# ** WARNING **: Do not use for foreign servers! It might use too many
# resources on other people's servers. [re-index]
# example: $HTTP_START_URL = 'http://localhost/';
$HTTP_START_URL = '';

# The indexer might not notice if it runs into an endless loop. To avoid
# that, set this to the maximum number of pages that will be visited
# (this can be bigger than the number of pages indexed). [re-index]
$HTTP_MAX_PAGES = 100;

# The web server's document root. Normally that's the same as $DOCUMENT_ROOT,
# it differs if you're only using Perlfect Search on a subdirectory. [re-index]
$HTTP_SERVER_ROOT = $DOCUMENT_ROOT;

# Only if indexing via http: limit crawling to this URL. This is an
# important setting so the script doesn't run out of control. [re-index]
$HTTP_LIMIT_URL = $HTTP_START_URL;

# Only if indexing via http: the content types to index. [re-index]
@HTTP_CONTENT_TYPES = ('text/html', 'text/plain');

# Set to 1 to get verbose output during indexing. [re-index]
$HTTP_DEBUG = 1;

###########################################################################
### advanced configuration
### You only need this if you want to adapt advanced features

# Program that converts PDF to ascii text. pdftotext is part of xpdf, available
# at http://www.foolabs.com/xpdf/download.html. You also have to add "pdf"
# to @EXT and your PDF files must have a ".pdf" suffix. You can use any program
# that will print ASCII to STDOUT if called this way: "program pdf_filename -".
# WARNING: The PDF filenames must not include special characters for security
# reasons; still, it is recommended to use this option only to index your own
# files, not other people's files whose filenames you cannot control. [re-index]
$PDFTOTEXT = '/usr/bin/pdftotext';

# How many results should be shown per page.
$RESULTS_PER_PAGE = 7;

# Show the ranking in percent, with the first document = 100%.
$PERCENTAGE_RANKING = 1;

# Do you want to index numbers? If so set $INDEX_NUMBERS to 1. [re-index]
$INDEX_NUMBERS = 0;

# If you don't have enough memory, set this to 1. This will slow down
# indexer.pl by a factor of about 2. Searching is not affected.
$LOW_MEMORY_INDEX = 1;

# How much of the document should be put in the index? With this option,
# the context of the match is shown on the results page. This only works
# if the match was in the first $CONTEXT_SIZE bytes of the document.
# Warning: Using this option will generate a very big index file.
# Set to 0 to disable, set to -1 for no limit. [re-index]
$CONTEXT_SIZE = 0;

# If $CONTEXT_SIZE is enabled, how many occurrences of every term should be shown
# on the results page?
$CONTEXT_EXAMPLES = 2;

# If $CONTEXT_SIZE is enabled, how many words should be used to show the context
# of a term?
$CONTEXT_DESC_WORDS = 12;

# How many words should be used from the <BODY> of an html document as a
# description for the document in case there is no <META description> tag
# available and $CONTEXT_SIZE is 0. [re-index]
$DESC_WORDS = 25;

# The minimum length of a word. Any word of smaller size is not indexed.
# [re-index]
$MINLENGTH = 3;

# If you have umlauts or accents etc. in your text, enable this.
# With this option accented characters will be indexed as the characters
# they are based on (e.g. è -> e, ü -> u), without this option they will
# be filtered out completely (you don't want that). [re-index]
$SPECIAL_CHARACTERS = 1;

# The largest acceptable word size. Reducing this saves space but decreases
# result accuracy. Setting the variable to 0 ignores stemming altogether and
# also makes the indexer a bit faster. [re-index]
$STEMCHARS = 0;

# Add URLs to the index, so one can search for them? Note that special
# characters will be ignored, just as in normal text. [re-index]
$INDEX_URLS = 0;

# You can completely ignore certain parts of your documents if you put these
# HTML comments around them. [re-index]
$IGNORE_TEXT_START = '<!--ignore_perlfect_search-->';
$IGNORE_TEXT_END = '<!--/ignore_perlfect_search-->';

# How much more important are words found in the title, in the meta values
# (author, description, keywords), and in the headlines compared to normal
# text in the body? This influences the ranking of the results.
# Use any integer (0 = ignore that text completely) [re-index]
$TITLE_WEIGHT = 5;
$META_WEIGHT = 5;
$H_WEIGHT{'1'} = 5;   # headline <h1>...</h1>
$H_WEIGHT{'2'} = 4;
$H_WEIGHT{'3'} = 3;
$H_WEIGHT{'4'} = 1;
$H_WEIGHT{'5'} = 1;
$H_WEIGHT{'6'} = 1;   # headline <h6>...</h6>

# If you want to log the queries to an extra file, set this to 1.
# Every use of search.pl will then be logged to data/log.txt. That file
# has to exist and must be writable for the webserver. The line format is:
# REMOTE_HOST;date;terms;matches;current page;(time to search in seconds);
# NOTE: if you have many queries, this file will grow quite fast.
$LOG = 0;

# This will increase the score of results that contain more than one of
# the searched terms. Queries with only one term will not be affected.
# The number given here is a factor that multiplies the score (even
# several times, if there are more than two terms). 0 turns it off.
$MULTIPLE_MATCH_BOOST = 0;

# Directory with templates (normally you don't have to modify this).
$TEMPLATE_DIR = $INSTALL_DIR.'templates/';

# What's the default language. This is the language that's used if no lang
# parameter is passed to the script or if the parameter is invalid.
$DEFAULT_LANG = 'en';

# The result template for several languages.
#$SEARCH_TEMPLATE{'en'} = $TEMPLATE_DIR.'search.html';
$SEARCH_TEMPLATE{'en'} = 'C:/projects/ICS/httpdocs/html/searchresults.html';
$SEARCH_TEMPLATE{'de'} = $TEMPLATE_DIR.'search_de.html';
#$NO_MATCH_TEMPLATE{'en'} = $TEMPLATE_DIR.'no_match.html';
$NO_MATCH_TEMPLATE{'en'} = 'C:/projects/ICS/httpdocs/html/searchresults_.html';
$NO_MATCH_TEMPLATE{'de'} = $TEMPLATE_DIR.'no_match_de.html';

# The text for the "Next Page" link in several languages.
$NEXT_PAGE{'en'} = 'Next';
$NEXT_PAGE{'de'} = 'n&auml;chste Seite';

# The text for the "Previous Page" link in several languages.
$PREV_PAGE{'en'} = 'Previous';
$PREV_PAGE{'de'} = 'vorige Seite';

###########################################################################
### You shouldn't have to edit anything below this line.

# Various paths (do NOT use system-wide /tmp for security reasons!)
$TMP_DIR  = $INSTALL_DIR.'temp/';
$DATA_DIR = $INSTALL_DIR.'data/';
$CONF_DIR = $INSTALL_DIR."conf/";
$STOPWORDS_FILE = $CONF_DIR.'stopwords.txt';
$NO_INDEX_FILE = $CONF_DIR.'no_index.txt';
$LOGFILE = $DATA_DIR.'log.txt';
$SEARCH = 'search.pl';
$SEARCH_URL = $CGIBIN.$SEARCH;

# Paths to the database files.
$INV_INDEX_DB_FILE = $DATA_DIR.'inv_index';
$DOCS_DB_FILE      = $DATA_DIR.'docs';
$SIZES_DB_FILE     = $DATA_DIR.'sizes';
$TERMS_DB_FILE     = $DATA_DIR.'terms';
$DF_DB_FILE        = $DATA_DIR.'df';
$TF_DB_FILE        = $DATA_DIR.'tf';
$CONTENT_DB_FILE   = $DATA_DIR.'content';
$DESC_DB_FILE      = $DATA_DIR.'desc';
$TITLES_DB_FILE    = $DATA_DIR.'titles';

# Paths to the temporary database files.
$INV_INDEX_TMP_DB_FILE = $DATA_DIR.'inv_index_tmp';
$DOCS_TMP_DB_FILE      = $DATA_DIR.'docs_tmp';
$SIZES_TMP_DB_FILE     = $DATA_DIR.'sizes_tmp';
$TERMS_TMP_DB_FILE     = $DATA_DIR.'terms_tmp';
$CONTENT_TMP_DB_FILE   = $DATA_DIR.'content_tmp';
$DESC_TMP_DB_FILE      = $DATA_DIR.'desc_tmp';
$TITLES_TMP_DB_FILE    = $DATA_DIR.'titles_tmp';

# Official version number.
$VERSION = "3.20";
1;

------=_NextPart_000_0000_01C11C0E.8137EB50--