- separate indexing script (I mean the indexing takes place in one step, storing the results in, say, a MySQL db, and the searching runs off that db) [added] Hm, I think this should work off a MySQL db, but maybe flat files would be sufficient? The site is nearing 200 pages or so. [/added]
- results ordered by relevance, depending on where the words are found
- highlighting matched words
- support for pdf
- "no index" word skip list
- "no index" page skip list
- "no index" tags for skipping parts of pages
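To make the relevance and highlighting items concrete, here is a rough sketch in Python of what I mean. Nothing here comes from an existing script: the field names, the weights in `FIELD_WEIGHTS`, and the `<b>` markup are all made-up assumptions, and a real version would read from the index instead of a list of dicts.

```python
import re

# Hypothetical weights: a hit in the title counts more than a hit in the body.
FIELD_WEIGHTS = {"title": 5, "heading": 3, "body": 1}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(page, query_words):
    """Sum weighted hit counts over the fields a page happens to have."""
    total = 0
    for field, weight in FIELD_WEIGHTS.items():
        words = tokenize(page.get(field, ""))
        total += weight * sum(words.count(w) for w in query_words)
    return total


def search(pages, query):
    """Return the matching pages, best score first."""
    query_words = tokenize(query)
    scored = [(score(p, query_words), p) for p in pages]
    hits = [(s, p) for s, p in scored if s > 0]
    hits.sort(key=lambda sp: sp[0], reverse=True)
    return [p for _, p in hits]


def highlight(text, query):
    """Wrap whole-word matches of the query words in <b> tags."""
    words = tokenize(query)
    if not words:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(r"<b>\1</b>", text)
```

So a page with the word in its title would outrank a page that only mentions it a couple of times in the body, and `highlight` would be applied to the snippet shown for each result.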
Anyone use anything in this area?
Mind you, I'm not necessarily looking for a complete out-of-the-box solution, but at least one I can work from. The only thing I could find that comes relatively close is Zoom by wrensoft.com. The main thing it doesn't do is store the data in a MySQL db (it works off flat files). The rest looks pretty good.
I'm actually quite surprised not to find more scripts for this. I guess everyone does their own!
Give PHPDig a try. I did, and after investing some time it was worth the effort, because PHPDig's basics are quite good and really easy to modify.
First take a look at the basic tables created in your database. That should give you an easy start, and the search algorithm is fairly clear.
So you need to understand how indexing works. It's not as complicated as one might think, as long as you keep the table structure and its meaning in mind. Then the (okay, _your_) work begins: use regular expressions to locate words, skip passages, etc. Use a wrapper or whatever to read and parse PDFs. Either way, these are the main lines of code you have to extend.
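To give an idea of the kind of indexing step I mean, here is a minimal sketch in Python. It is not PHPDig's code or its table layout: the `word_index` table, the `<!-- noindex -->` comment markers, the skip lists, and the use of SQLite in place of MySQL are all assumptions made just for illustration.

```python
import re
import sqlite3

# Hypothetical skip lists; fill these with your own words and pages.
STOP_WORDS = {"the", "and", "a", "of", "to"}
SKIP_PAGES = {"/private.html"}

# Assumed marker convention for parts of a page that must not be indexed.
NOINDEX_RE = re.compile(
    r"<!--\s*noindex\s*-->.*?<!--\s*/noindex\s*-->",
    re.DOTALL | re.IGNORECASE,
)
TAG_RE = re.compile(r"<[^>]+>")
WORD_RE = re.compile(r"[a-z0-9]+")


def extract_words(html):
    """Drop noindex passages and tags, then return the words worth indexing."""
    text = NOINDEX_RE.sub(" ", html)
    text = TAG_RE.sub(" ", text)
    return [w for w in WORD_RE.findall(text.lower()) if w not in STOP_WORDS]


def create_schema(db):
    """Create a simple word -> page table (an invented layout)."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS word_index ("
        "word TEXT, url TEXT, hits INTEGER, PRIMARY KEY (word, url))"
    )


def index_page(db, url, html):
    """Re-index one page; returns the number of distinct words stored."""
    if url in SKIP_PAGES:
        return 0
    counts = {}
    for word in extract_words(html):
        counts[word] = counts.get(word, 0) + 1
    db.execute("DELETE FROM word_index WHERE url = ?", (url,))
    db.executemany(
        "INSERT INTO word_index (word, url, hits) VALUES (?, ?, ?)",
        [(word, url, n) for word, n in counts.items()],
    )
    return len(counts)


def lookup(db, word):
    """Return (url, hits) rows for a word, most hits first."""
    rows = db.execute(
        "SELECT url, hits FROM word_index WHERE word = ? "
        "ORDER BY hits DESC, url",
        (word.lower(),),
    )
    return rows.fetchall()
```

For PDFs the same `index_page` could be fed the text that your PDF wrapper extracts, since everything after the text extraction is identical.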