| 6:46 pm on Jul 25, 2002 (gmt 0)|
I'm thinking about something like that Rich. It's a pretty intense routine though. I'll put it on the list and bookmark this thread.
| 6:52 pm on Jul 25, 2002 (gmt 0)|
+1 on this.
Or at least pull up all posts by someone when I plug his/her username into the site search.
| 6:57 pm on Jul 25, 2002 (gmt 0)|
that would be interesting
A secondary thought about site search: would it be possible to show a thread date (either first or last post) beside each search result? I usually look at the size to estimate the number of posts, but if you are looking for an answer on a subject that changes a lot, it would be useful to know how old or new the thread is.
| 6:59 pm on Jul 25, 2002 (gmt 0)|
I agree that post date and user post would be very helpful if added to the site search.
| 5:11 am on Jul 26, 2002 (gmt 0)|
Site search engine is a hurting unit at the moment. Trying to figure out some new options (it's just so huge now and going to get worse).
| 5:39 am on Jul 26, 2002 (gmt 0)|
How is the data stored? Are you using flat files or a real database for the posts?
| 5:47 am on Jul 26, 2002 (gmt 0)|
if I remember correctly Brett does everything in flat files.
| 6:01 am on Jul 26, 2002 (gmt 0)|
Ok, can I request a translation of "flat files" please? I'm just not following what you mean.
| 6:06 am on Jul 26, 2002 (gmt 0)|
Yes, I use flat files here. I've got a MySQL option worked in, but it is much too slow for our load.
This is a flat file:
jatar_k¦¦no¦¦ip¦1027662426¦if I remember correctly Brett does everything in flat files.
mcguffin¦¦no¦¦ip¦1027663282¦Ok, can I request a translation of "flat files" please? I'm just not following what you mean.<ret>mcg
Flat file just means one record per line, with the fields separated by a delimiter, most often the pipe character or a comma.
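As a rough illustration of the format above, here is a minimal Python sketch that splits one of the sample records on its delimiter. The field names are guesses made up for the example (the thread does not name them); the only field interpreted is the sixth one, which is read as a Unix timestamp because it lines up with the post time shown above.

```python
from datetime import datetime, timezone

DELIMITER = "¦"  # the broken-bar character used in the sample records

# Hypothetical field names: the thread does not say what each column means.
FIELD_NAMES = ["user", "field2", "flag", "field4", "ip", "posted_at", "body"]


def parse_record(line):
    """Split one flat-file line into a dict keyed by the assumed field names."""
    # Limit the split so a delimiter inside the message body does not break it.
    parts = line.rstrip("\n").split(DELIMITER, len(FIELD_NAMES) - 1)
    record = dict(zip(FIELD_NAMES, parts))
    # Assumption: the sixth field is seconds since the Unix epoch (UTC).
    record["posted_at"] = datetime.fromtimestamp(
        int(record["posted_at"]), tz=timezone.utc
    )
    return record


sample = "jatar_k¦¦no¦¦ip¦1027662426¦if I remember correctly Brett does everything in flat files."
post = parse_record(sample)
print(post["user"], post["posted_at"].isoformat())
```

Reading a whole thread is then just one call to `parse_record` per line of the file.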
| 6:08 am on Jul 26, 2002 (gmt 0)|
Or did you mean for the site search, Ggrot? I'd love to move that to SQL, but there are very few quality SQL search engines out there.
| 6:12 am on Jul 26, 2002 (gmt 0)|
Sorry. A flat file is generally a basic file that stores all the data. To find the data, one has to search the file. Databases are sophisticated pieces of software that let you search faster and with more powerful options because of the way they are organized.
| 6:20 am on Jul 26, 2002 (gmt 0)|
MySQL is actually slower than flat files? How are you doing the flat file search?! Also, are you using a fulltext index on the fields that you are searching?
| 6:49 am on Jul 26, 2002 (gmt 0)|
Let's back up a minute and make sure we are on the same page. I was referring to flat files for the board here (messages). A forum data structure lends itself best to flat files. They will always be faster than a beefy SQL database.
Why? It's the difference between knowing where your data is and having to search for your data. Look at the URL of this thread. When we need this message, we know the forum (directory) and the file (post number). That lets us know in one step exactly where the data is. In turn, we pass that to the file system and reap all the benefits of caching and years of optimization.
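To make the "one step" lookup concrete, here is a minimal Python sketch. The directory layout (`forum<N>/<thread>.txt`) and the function names are assumptions for illustration only, not the board's actual layout: the point is that the location is computed from the identifiers, so nothing has to be searched.

```python
import os
import tempfile


def thread_path(root, forum_id, thread_id):
    """Map (forum, thread) straight to a file path: forum -> directory, thread -> file."""
    return os.path.join(root, f"forum{forum_id}", f"{thread_id}.txt")


def read_thread(root, forum_id, thread_id):
    """Open the computed path directly; the file system and its cache do the rest."""
    with open(thread_path(root, forum_id, thread_id), encoding="utf-8") as fh:
        return fh.read().splitlines()


# Small demo in a throwaway directory.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "forum23"))
with open(thread_path(root, 23, 1234), "w", encoding="utf-8") as fh:
    fh.write("first post\nsecond post\n")

posts = read_thread(root, 23, 1234)
print(posts)
```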
When you go to a SQL engine, you are doing something similar, but with another layer of code. Not only does it have to figure out where the data is stored via an index/key file, it then has to call the same file system routines to extract the data from its own database. Then it has to repackage that data and deliver it to the application.
When you know where your data is at (which is half the battle), a flat file system will always be faster than a database - always.
That's the qualifier, though: when you know where your data is.
Searching unknown or huge data sets is a whole different situation, and that is where DBs come into their own. When you don't know what or where the data you are trying to match is, an optimized DB with a key file or index file short-circuits the search and is usually faster than a flat file.
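The difference between scanning and using an index can be sketched in a few lines of Python. This is a toy in-memory inverted index with made-up sample posts, not how any particular database or search engine implements it: the scan touches every post on every query, while the index pays that cost once and then answers with a single dictionary lookup.

```python
from collections import defaultdict

# Made-up sample data: post id -> text.
posts = {
    1: "flat files are fast",
    2: "mysql fulltext index",
    3: "flat file search engine",
}


def scan_search(posts, word):
    """Linear scan: read every post and test it, on every query."""
    return sorted(pid for pid, text in posts.items() if word in text.lower().split())


def build_index(posts):
    """Build an inverted index once: word -> set of post ids containing it."""
    index = defaultdict(set)
    for pid, text in posts.items():
        for token in text.lower().split():
            index[token].add(pid)
    return index


def index_search(index, word):
    """Indexed lookup: jump straight to the matching post ids."""
    return sorted(index.get(word, ()))


index = build_index(posts)
print(scan_search(posts, "flat"), index_search(index, "flat"))
```

Both calls return the same ids; only the amount of work per query differs, and that gap grows with the size of the data set.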
That said, I've yet to see a real quality SQL search engine written in Perl. They are very slow and require so much post-processing that, in the right environment, a good flat file search engine can still outperform SQL (size being the drawback).
I've tried all the Perl-driven SQL engines currently available ([url=http://www.searchtools.com]http://www.searchtools.com[/url]), and currently nothing is faster than flat files on either my Windows box or my Linux box when the DBs are under 100 MB.
Take FDSE for example: it has both a MySQL and a flat file option, with the flat file option being at least 50% faster.
Anyway, that's part of the issue I'm facing here with the site search engine. I'm certainly looking for new options, but nothing so far looks like "this is it". I've looked at swish and the other standalones and they aren't what we want here; relevance is a big problem. I even wrote my own here, but the system load is just too huge.
Any recommendations? all ears..
| 7:06 am on Jul 26, 2002 (gmt 0)|
I could tell you the secret to handling large amounts of data in milliseconds but then I would have to kill you. :)
A mixture of databases and flat files. Use the PROs of both systems and avoid the CONs of both systems.
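One possible reading of that mix, sketched in Python under stated assumptions: the posts stay in a flat file, and a small SQLite table only records who wrote what and where each record sits (file path, byte offset, length). The schema, function names, and the two-field record format are all hypothetical, chosen just to show the idea of letting the database answer "where is it?" and the file system deliver the data.

```python
import os
import sqlite3
import tempfile

DELIM = "¦"


def open_index(db_path=":memory:"):
    """Create the lookup table: it holds locations, not message bodies."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS posts "
        "(username TEXT, path TEXT, byte_offset INTEGER, byte_length INTEGER)"
    )
    db.execute("CREATE INDEX IF NOT EXISTS posts_by_user ON posts (username)")
    return db


def append_post(db, path, username, body):
    """Append the record to the flat file and note its position in the index."""
    line = f"{username}{DELIM}{body}\n".encode("utf-8")
    with open(path, "ab") as fh:
        fh.seek(0, os.SEEK_END)  # make the reported position explicit
        offset = fh.tell()
        fh.write(line)
    db.execute("INSERT INTO posts VALUES (?, ?, ?, ?)", (username, path, offset, len(line)))
    db.commit()


def posts_by_user(db, username):
    """Ask the index where a user's posts are, then read only those bytes."""
    rows = db.execute(
        "SELECT path, byte_offset, byte_length FROM posts "
        "WHERE username = ? ORDER BY rowid",
        (username,),
    ).fetchall()
    bodies = []
    for path, offset, length in rows:
        with open(path, "rb") as fh:
            fh.seek(offset)
            line = fh.read(length).decode("utf-8")
        bodies.append(line.rstrip("\n").split(DELIM, 1)[1])
    return bodies


thread_file = os.path.join(tempfile.mkdtemp(), "1234.txt")
db = open_index()
append_post(db, thread_file, "jatar_k", "first reply")
append_post(db, thread_file, "mcguffin", "a question")
append_post(db, thread_file, "jatar_k", "second reply")
print(posts_by_user(db, "jatar_k"))
```

Displaying a thread still goes straight to the file, while a query like "all posts by this username" uses the index instead of scanning every file.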