Let's back up a minute and make sure we are on the same page. I was referring to flat files for the board here (messages). A forum data structure lends itself best to flat files. They will always be faster than a beefy SQL database.
Why? It's the difference between knowing where your data is and having to search for your data. Look at the URL of this thread. When we need this message, we know the forum (directory) and the file (post number). That lets us know in one step exactly where the data is. In turn, we pass that to the file system and reap all the benefits of caching and years of optimization.
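To make the "one step" point concrete, here is a rough sketch in Python. The directory layout (`forum<N>/<post>.txt`) and the function names are made up for illustration, not how the board here actually stores things. The point is only that the path is computed from the forum and post number, with no lookup in between.

```python
import os
import tempfile

def post_path(root, forum_id, post_id):
    """Build the file path straight from the identifiers (hypothetical layout)."""
    return os.path.join(root, f"forum{int(forum_id)}", f"{int(post_id)}.txt")

def write_post(root, forum_id, post_id, text):
    """Store a post as a single flat file, creating the forum directory if needed."""
    path = post_path(root, forum_id, post_id)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(text)

def read_post(root, forum_id, post_id):
    """Fetch a post in one step: compute the path, then let the file system do the rest."""
    with open(post_path(root, forum_id, post_id), encoding="utf-8") as fh:
        return fh.read()

if __name__ == "__main__":
    root = tempfile.mkdtemp()  # throwaway directory for the demo
    write_post(root, 5, 1234, "flat files, no index needed")
    print(read_post(root, 5, 1234))
```

Nothing is searched here. The forum and post number from the URL are the address.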
When you go to a SQL engine, you are doing something similar, but through another layer of code. Not only does it have to figure out where the data is stored via an index/key file, it then has to call the same file system routines to extract the data from its own database. Then it has to repackage that data and deliver it to the application.
When you know where your data is at (which is half the battle), a flat file system will always be faster than a database - always.
That's the qualifier, though: when you know where your data is.
Searching unknown or huge data sets is a whole different situation, and that's where DBs come into their own. When you don't know what or where the data you are trying to match is, an optimized DB with a key file or index file short-circuits the search and is usually faster than a flat file.
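A toy illustration of what "short-circuits the search" means, again in Python with made-up names and a plain in-memory dict standing in for the posts. The scan has to read every post on every query. The index is built once and then answers a word lookup without touching the posts at all. It is only a sketch of the idea, not how any particular DB engine implements its key files.

```python
from collections import defaultdict

def scan_search(posts, word):
    """Linear scan: look through every post for the word on each query."""
    word = word.lower()
    return sorted(pid for pid, text in posts.items()
                  if word in text.lower().split())

def build_index(posts):
    """Build a word -> set of post ids map once, up front."""
    index = defaultdict(set)
    for pid, text in posts.items():
        for w in text.lower().split():
            index[w].add(pid)
    return index

def index_search(index, word):
    """Indexed lookup: one dictionary access, no pass over the posts."""
    return sorted(index.get(word.lower(), ()))

if __name__ == "__main__":
    posts = {
        1: "flat files are fast",
        2: "SQL index lookup",
        3: "flat index",
    }
    index = build_index(posts)
    print(scan_search(posts, "flat"))
    print(index_search(index, "flat"))
```

Both calls return the same post ids. The difference is how much work each query costs as the number of posts grows.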
That said, I've yet to see a real quality SQL search engine written in Perl. They are very slow and require so much post-processing that, in the right environment, a good flat file search engine can still outperform SQL (size being the drawback).
I've tried all the Perl-driven SQL engines currently available ([url=http://www.searchtools.com]http://www.searchtools.com[/url]), and so far nothing is faster than flat files on either my Windows box or my Linux box when the DBs are under 100 MB.
Take FDSE, for example: it has both a MySQL and a flat file option, with the flat file option being at least 50% faster.
Anyway, that's part of the issue I'm facing here with the site search engine. I'm certainly looking for new options, but nothing so far looks like "this is it". I've looked at swish and the other standalones, and they aren't what we want here; relevance is a big problem. I even wrote my own, but the system load is just too high.
Any recommendations? I'm all ears.