|Mysql question, need a pointer :)|
primary key; index?
Could someone point me to a reference place where I can once and for all understand the importance of defining a primary key and specifying an index please. It's one of those things I should have learnt a long time ago, but generally keep forgetting.
Also when setting the collation of the table, I usually leave that empty and let the default setting go in, though I think it just assigns a default char set to the data stored on the table, could someone tell me if this is a bad idea, I stress though that I have had NO issues of late, it's just something I have noticed just whilst setting up a new database.
Thanks in advance,
The ultimate reference for MySQL is the MySQL website of course ;). But regarding indexes you should think of the following:
- Indexes improve performance if your data is sorted and your SELECT queries ask for subsets of the data based on one of the sorting ranges. All orders from last week, all male visitors with a specific shoe size, etc. Without indexes the queries will still work, but the MySQL engine has to sift through all the data to find the records which match the criteria. With indexed fields, it can ignore large quantities of the records at once which makes queries often significant faster.
- Indexes are necessary if you want to relate fields from different tables with foreign keys.
- Indexes degrade performance for write-only tables. For example for log tables where you don't run daily queries on.
Hi there lammert.
Thanks for the explanation. I shall look into it further when i get back to my laptop.
|Indexes degrade performance for write-only tables. |
To add to this: a simplified, early explanation an old master gave me, indexes slow down writing and updating, but speed up searching. So don't just index everything, index only the fields that will require some form of seeking/searching. This isn't always "where field="value", it's also the joins mentioned.
Since M is asking about indexing, an important point that is relevant, one I've (shamefully) never really understood or gotten a good answer for, how do you know what is a "good" length for indexing on text fields?
Integer fields need no length,
user_id int(11), index(user_id),
Indexing tiny fields is worthless (or really, overhead, no need)
But text fields benefit greatly by indexing:
email varchar(255), index (email(6))
The closest I've come to an answer is if the index length is the smallest average indicator of most significant difference between other records for that field, it's "probably" an optimized index length. I'm not happy with "probably." :-)
From the docs [dev.mysql.com]:
|CREATE INDEX part_of_name ON customer (name(10)); |
If names in the column usually differ in the first 10 characters, this index should not be much slower than an index created from the entire name column.
Huh? "Not much slower?" Then why index(10), why not index 3, or 6?)
In the email example above, an index of 1 would be of little benefit. As the index length increases, it becomes more beneficial, but increases the size of the index table (right?) So how much is "not enough," how much is "too much?"
I suppose a testing environment and experimentation would give some answers, but "correlation does not always equal causation" and I'm not sure this would lead to a straight answer. There must be some form of formula or guideline to go by, I've never found it.
MySQL documentation on Indexing [dev.mysql.com] doesn't really answer it for me. :-)
Hi there Rocknbil,
Thanks for trying to give clarity there. It seems like I need to sit down and really try to understand the intricacies of setting up a database/table to get the best performance out of it.
Whenever I have set up tables in the past I get a little error message stating that I have defined no index, but I always ignore it, and as yet I have noticed no degradation in data retrieval. Though I have never sat there with a stopwatch either ;-p
I was just wondering if there was any fundamental flaw to being ignorant to the use of indexes, and from what has been posted, it seems as there is no definitive answer, well, at least one succinct enough to comprehend in one sentence.
Cheers for posting guys,
|and from what has been posted, it seems as there is no definitive answer, |
Well, lammert summed it up nicely, there **is** a definitive answer, and the documentation links above explain it in more detail. Indexes allow the engine to ignore large quantities of rowsets. Take the scenario of a million records, it may have to only search say, 100, 200,000 rows of an indexed field being seeked/searched instead of the whole million. That is, really, the root of it.
The expansion above, to which I don't have an answer for, is exactly how long that index should be for any given data set to be fully optimized, and I may have confused you more with it, sorry. I guess I'm saying it would be great if "the optimum index length should be 30% of the length of the average expected data" but there is no such recommendation, that I can find.
|well, at least one succinct enough to comprehend in one sentence. |
No argument there.