The software requires the RDF files as input.
Also, the description states:
by Bryn Dole
This version of search is much smaller, much faster, and much simpler than the original Isearch. The Open Directory Search does not run on Windows. It currently only supports Linux and Solaris. I've included some sample scripts for parsing the Open Directory RDF data so that you too can build your own Internet search engine. Provided you have a few Sun E4500s lying around. ;)
I gave it a pre-1.0 release number because of the newness of the release, not the code. I figure I'll go through a few iterations in the packaging, at which point I'll up the release number to 1.0. If more than a few people start making suggestions, I'll get motivated to put the source up on the mozilla.org CVS server.
This version of Open Directory Search understands the UTF-8 encoding, and interprets individual Chinese (and Japanese) characters as "words" allowing searches to be done on individual Chinese characters and combinations of characters. The initial data that is indexed and the searches have to be in UTF-8 for this to work. It will also work for other non-UTF-8 charsets, but may require spaces or punctuation between words for search to work properly.
It can be found at:
[dmoz.org...]
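To make the UTF-8 behaviour in the description concrete, here is a minimal Python sketch of that kind of tokenizer. It is not the Open Directory Search code. The function names `is_cjk` and `tokenize` and the exact code point ranges are my own assumptions, chosen only to illustrate "each Chinese or Japanese character is a word, everything else is split on spaces and punctuation".

```python
from typing import List


def is_cjk(ch: str) -> bool:
    """Rough test for Chinese/Japanese characters (assumed ranges, not exhaustive)."""
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF      # CJK Unified Ideographs
        or 0x3400 <= cp <= 0x4DBF   # CJK Unified Ideographs Extension A
        or 0x3040 <= cp <= 0x309F   # Hiragana
        or 0x30A0 <= cp <= 0x30FF   # Katakana
    )


def tokenize(data: bytes) -> List[str]:
    """Split UTF-8 input into search terms.

    Each CJK character becomes its own term; other alphanumeric runs are
    lowercased and split on whitespace and punctuation.
    """
    text = data.decode("utf-8")  # raises UnicodeDecodeError if input is not UTF-8
    tokens: List[str] = []
    current: List[str] = []

    def flush() -> None:
        # Emit the pending non-CJK word, if any.
        if current:
            tokens.append("".join(current))
            current.clear()

    for ch in text:
        if is_cjk(ch):
            flush()
            tokens.append(ch)
        elif ch.isalnum():
            current.append(ch.lower())
        else:
            flush()
    flush()
    return tokens
```

With this sketch, a query mixing English and Japanese is split into two ordinary words followed by one term per character, so a search on a single character or on a combination of characters can be matched against the same index.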
Of course that isn't true at all. I run the ODP RDF on my home computer and serve it on a FreeBSD server. I learned a million years ago (back when computers had vacuum tubes) that you process it in chunks, kick the errors off into a log, and merge the chunks. This isn't rocket science or bank accounting, so if a few bad records don't get into the database, it's no big deal.
Everyone who has worked with the DMOZ RDFs knows they contain numerous errors. My first software used to go "kerchunk" frequently, so I just bypass errors now and sleep like a baby at night.
The software shown by Netscape creates massive databases and blows up on every error. Just locating and fixing every error is a task for 1000 monkeys, especially if your input is sabotaged by disgruntled editors.
I would suggest they process the RDFs by category and then merge. They need to create an error log and put a meta to work on the input errors. But what do I know, I'm just a little old fat retired web slave trying to make enough money to buy a castle in the south of France with 1200 hectares surrounding it. (I have $2.3 to go to reach my dream.)
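For anyone who wants to try the chunk, log, and merge approach described in this post, here is a rough Python sketch. It is not the poster's software. The record format (`<page url="..."><title>...</title></page>`) is a simplified, hypothetical stand-in for the real DMOZ RDF entries, and the names `chunks`, `load_chunk`, and `load_all` are made up for illustration. The error log is kept as an in-memory list here; writing it to a file would work the same way.

```python
import xml.etree.ElementTree as ET
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Tuple


def chunks(records: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` records."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def load_chunk(chunk: List[str], error_log: List[str]) -> Dict[str, str]:
    """Parse one chunk; bad records go to the log instead of stopping the run."""
    pages: Dict[str, str] = {}
    for raw in chunk:
        try:
            elem = ET.fromstring(raw)           # malformed XML raises ParseError
            url = elem.attrib["url"]            # missing attribute raises KeyError
            pages[url] = elem.findtext("title", default="")
        except (ET.ParseError, KeyError) as exc:
            error_log.append(f"{exc!r}: {raw[:60]}")
    return pages


def load_all(records: Iterable[str], size: int = 1000) -> Tuple[Dict[str, str], List[str]]:
    """Process records chunk by chunk and merge the per-chunk results."""
    merged: Dict[str, str] = {}
    error_log: List[str] = []
    for chunk in chunks(records, size):
        merged.update(load_chunk(chunk, error_log))
    return merged, error_log
```

A broken record costs you one log line and one missing entry rather than the whole build. Splitting by category instead of by fixed chunk size would only change how `chunks` groups the input; the merge step stays the same.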