[Perlfect-search] data dump script (and xml)
Rob Stevenson email@example.com
Mon, 7 May 2001 08:45:14 -0300
On Sun, May 6, 2001 will said...
>>On Friday 27 April 2001 01:12, you wrote:
>>> the way perlfect works is really very good for indexing xml files in
>>> a simple way: all you have to do is change the parsing for title and
>>> description and body to the relevant xml fields and you get a nice,
>>> if limited, low-overhead free-text search of your xml data. no expat,
>>> no nasty anything. very impressed. is that something that should be
>>> developed? might be able to help, if so.
>>That sounds very useful. It looks like all we need is two configuration
>>options "TITLE" and "BODY" (maybe "DESCRIPTION", too), which default to
>>"title" and "body". If you have a patch already, please send it. If not I
>>will add this to my version 3.21 TODO list.
>i don't have a patch but i'd be happy to (learn how to) make one.
>it's an interesting possibility, i agree: the strength of your
>indexing would combine well with the added structure of xml, and it's
>very simple to teach the indexer to read in data according to xml
>rather than html conventions. the best, and most distinctive, thing
>about it for me is that it allows html and xml (and pdf) to sit side
>by side in the same index and come under the same weighting and
>searching system without ever really caring about the data format.
>there are some issues to consider, though:
I believe what you're aiming for is a search engine which could be
considered to be a part of "The Semantic Web". Have a look at...
No need to reinvent this particular wheel. Just look up Dublin Core and
RDF in the many sites that cover this field. You could start at the site
I maintain, below. (It uses perlfect search of course.)
Rob Stevenson - CIMI web manager
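
The idea quoted earlier in this thread, making the title and body tag names configurable so the same indexer can read HTML and XML, might look roughly like the sketch below. It is only an illustration in Python, not Perlfect Search's actual Perl code: the option names TITLE and BODY come from the thread, while CONFIG, extract_field, parse_document, and the sample documents are hypothetical. It uses plain regular expressions rather than a real XML parser, in the "no expat" spirit of the original suggestion, so it will not cope with CDATA sections, namespaces, or nested elements of the same name.

```python
import re

# Hypothetical configuration, named after the "TITLE" and "BODY" options
# proposed in the thread; the defaults are the usual HTML tag names.
CONFIG = {"TITLE": "title", "BODY": "body"}


def extract_field(text, tag):
    """Return the text inside the first <tag>...</tag> pair, or "" if absent."""
    # Matching is case-insensitive to suit HTML; strict XML is case-sensitive.
    pattern = r"<{0}\b[^>]*>(.*?)</{0}\s*>".format(re.escape(tag))
    match = re.search(pattern, text, re.IGNORECASE | re.DOTALL)
    if match is None:
        return ""
    # Replace nested markup with spaces, then collapse runs of whitespace.
    inner = re.sub(r"<[^>]+>", " ", match.group(1))
    return " ".join(inner.split())


def parse_document(text, config=CONFIG):
    """Pull out the two fields an indexer would weight, using configured tags."""
    return {
        "title": extract_field(text, config["TITLE"]),
        "body": extract_field(text, config["BODY"]),
    }


html_doc = (
    "<html><head><title>Home page</title></head>"
    "<body><p>Hello <b>world</b></p></body></html>"
)
xml_doc = "<record><name>Bronze vase</name><notes>Found in 1921.</notes></record>"

# HTML uses the defaults; the XML record maps its own element names.
print(parse_document(html_doc))
print(parse_document(xml_doc, {"TITLE": "name", "BODY": "notes"}))
```

Because both calls return the same two fields, whatever weighting and searching follows would not need to know which format a document started in, which is the point made in the quoted message.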