joined:Oct 4, 2001
I doubt this applies in the case of the OP, just responding for the sake accuracy :-)
When you approach a billion records you might think about sub-tables.
So, say, 750 million records before you start to rethink your schema? I could see that working for data that you only need to lookup occasionally, which is probably what you're referring to.
However, if it's data that's getting accessed frequently, especially anything that visitors or members can cause to be accessed, 100's of millions of records is a bad idea IMO.
With a table that size it doesn't matter which field you're indexing, the index file is going to be huge... and (up to) the entire file has to be read from disk or memory for every search. During which:
A) Memory IO is tied up (provided you've bought the extra RAM to cover that, otherwise disk IO gets tied up and then you've really got a problem).
B) The script calling the database query waits for a response, holding data in memory and burning CPU cycles.
In the end the outcome is that you keep your visitors waiting longer, and buy extra servers you don't really need, when a simpler solution is to break the data apart using some form of simple logic that can be handled by the script before the database is queried.