Just a thought.
Richard Lowe
A secondary thought about site search: would it be possible to show a thread date (either the first or last post) beside each search result? I usually look at the size to estimate the number of posts, but if you are looking for an answer on a subject that changes a lot, it would be useful to know how old or new the thread is.
This is a flat file:
jatar_k¦¦no¦¦ip¦1027662426¦if I remember correctly Brett does everything in flat files.
mcguffin¦¦no¦¦ip¦1027663282¦Ok, can I request a translation of "flat files" please? I'm just not following what you mean.<ret>mcg
Flat file just means it is one record per line with the fields separated by a delimiter. Most often the pipe character or a comma.
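To make that concrete, here is a rough Python sketch that splits one of the sample records above on its delimiter. The field meanings are my guesses from the sample only (author first, a Unix timestamp in the sixth field, the message last, and `<ret>` standing in for a line break); the real layout may differ.

```python
from datetime import datetime, timezone

DELIM = "\u00a6"  # the broken-bar character used in the sample records

def parse_record(line):
    # Split into at most 7 fields so a delimiter inside the message body survives.
    fields = line.rstrip("\n").split(DELIM, 6)
    return {
        "author": fields[0],  # assumed: first field is the poster
        "posted": datetime.fromtimestamp(int(fields[5]), tz=timezone.utc),  # assumed: Unix time
        "body": fields[6].replace("<ret>", "\n"),  # assumed: <ret> marks a line break
    }

sample = (
    "jatar_k\u00a6\u00a6no\u00a6\u00a6ip\u00a61027662426\u00a6"
    "if I remember correctly Brett does everything in flat files."
)
record = parse_record(sample)
print(record["author"], record["posted"].date())
```

One record per line means a whole thread can be read with a plain loop over the file, no parser library needed.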
[mysql.com...]
Why? It's the difference between knowing where your data is and having to search for it. Look at the URL of this thread. When we need this message, we know the forum (directory) and the file (post number), so in one step we know exactly where the data is. That lookup is handed to the file system, and we reap all the benefits of its caching and years of optimization.
When you go to a SQL engine, you are doing something similar, but through another layer of code. Not only does it have to figure out where the data is stored via an index/key file, it then has to call the same file system routines to extract the data from its own database. Then it has to repackage that data and deliver it to the application.
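A minimal sketch of that one-step lookup, in Python. The directory layout, the `.txt` extension, and the names `thread_path` and `read_thread` are hypothetical, made up for illustration; this is not how the board is actually coded.

```python
import tempfile
from pathlib import Path

def thread_path(root, forum, thread_id):
    # Hypothetical layout: <root>/<forum>/<thread_id>.txt
    return Path(root) / forum / f"{thread_id}.txt"

def read_thread(root, forum, thread_id):
    # No searching: build the path from what the URL already tells us,
    # then let the file system (and its cache) do the work.
    return thread_path(root, forum, thread_id).read_text(encoding="utf-8").splitlines()

# Demo against a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as root:
    path = thread_path(root, "forum13", 1234)
    path.parent.mkdir(parents=True)
    path.write_text("first post\nsecond post\n", encoding="utf-8")
    posts = read_thread(root, "forum13", 1234)
    print(posts)
```

The point is that the forum and thread number map straight to a file name, so there is nothing to look up before the read.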
When you know where your data is at (which is half the battle), a flat file system will always be faster than a database - always.
That's the qualifier, though: when you know where your data is.
Searching unknown or huge data sets is a whole different situation, and that is where databases come into their own. When you don't know what or where the data you are trying to match is, an optimized db with a key file or index file short-circuits the search and is usually faster than a flat file.
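As a toy illustration of why an index short-circuits a search, here is a Python sketch comparing a full scan with a prebuilt word index. The three sample posts and the function names are invented for the example, and a real database index is far more sophisticated than a dictionary of sets.

```python
from collections import defaultdict

posts = [
    "flat files are fast when you know the path",
    "an index helps when you have to search",
    "search relevance is a big problem",
]

def scan(posts, word):
    # Flat-file style search: read every record, every time.
    return [i for i, text in enumerate(posts) if word in text.split()]

def build_index(posts):
    # Built once up front; maps each word to the records that contain it.
    index = defaultdict(set)
    for i, text in enumerate(posts):
        for word in text.split():
            index[word].add(i)
    return index

index = build_index(posts)
hits = sorted(index.get("search", ()))  # one dictionary lookup, no scan
print(scan(posts, "search"), hits)
```

Both approaches return the same records; the difference is that the scan touches every post on every query, while the index pays that cost once when it is built.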
That said, I've yet to see a really high-quality SQL search engine written in Perl. They are very slow and require so much post-processing that, in the right environment, a good flat file search engine can still outperform SQL (size being the drawback).
I've tried all the Perl-driven SQL engines currently available (http://www.searchtools.com), and right now there isn't anything faster than flat files on either my Windows box or my Linux box when the dbs are under 100 MB.
Take FDSE for example - it has both a MySQL and a flat file option, with the flat file option being at least 50% faster.
Anyway, that's part of the issue I'm facing here with the site search engine. I'm certainly looking for new options, but I haven't found anything that looks like "this is it". I've looked at swish and the other standalones, and they aren't what we want here - relevance is a big problem. I even wrote my own here, but the system load is just too high.
Any recommendations? I'm all ears.