Forum Moderators: coopster

Message Too Old, No Replies

Which index to use?

Unique...fulltext...etc

         

grnidone

11:21 pm on Dec 13, 2003 (gmt 0)



Jonknee brought up a really good point in this thread about indexes:
[webmasterworld.com...]

I am actually in the middle of trying to figure out which index I need to use, but I can't find any rules on the MySQL site to figure out whats a good way to go. Heck, it doesn't really explain the differences between them except in cryptic language.

Can someone just tell me what the differences are of each index and under what situations are the best to use them? And how does using an index make queries to the database faster?

daosmith

7:25 am on Dec 14, 2003 (gmt 0)

10+ Year Member



Hi grnidone,

I don't know the answer to your first question, but the reason that an index makes database queries faster is that the index stores references to the table rows sorted on the criteria specified in the index. It's similar conceptually to having a permanent ORDER BY clause.

Indices use data structures like b-trees to store the row references, which means that a search for a particular row involves only a few steps; searching through the (unindexed) table manually involves n steps if there are n rows i.e. linear search.

So it makes sense to create certain indices on table columns if you are going to be doing a lot of searching for various values of that column.

Hope this helps :)

jamesa

10:22 am on Dec 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>> And how does using an index make queries to the database faster?

Best analogy I've heard goes something this. Think of a Rolodex where all the cards are in alphabetical order with tabs separating the letters of the alphabet. When you look up a name, say Smith, you go right for the "S" tab and flip through just the "S" cards until you find Smith. If there are a ton of cards in "S" section then once you flip to the "S" tab you would probably jump to the middle of the S's since Smith wouldn't likely be at the very beginning or the very end and start from there. Since they are in alpha order you could get to the card you want real quick. That's an index. A database without an index would be like the Rolodex cards stacked in no particular order where all you could do is flip through them one by one until you found what you were looking for.

>> under what situations are the best to use them?

In practical terms what you do is look at the types of queries that are typically performed on the database and create the indexes accordingly. A database can have more than one index, which in the Rolodex example would be like having the cards sorted by last name when you need to find someone by name, but magically they are also sorted by city when you're looking by city, etc.

So let's say we have database table called "friends" that looks like:

+---------------------------------+ 
¦ custname ¦ address ¦ phone ¦ birthdate ¦
+---------------------------------+

A query like:

SELECT * FROM friends WHERE name="smith"
would be much faster if there was an index on the custname column. One way to create an index on the custname column is:

CREATE INDEX choose_a_name ON friends (custname)

If you find that you often do queries like

SELECT * FROM friends WHERE phone like '(212)%'
(everyone in the 212 area code) then you should have an index on the phone column also:

CREATE INDEX choose_another_name ON friends (phone)

In both the examples above the WHERE clause specified a condition that occurred in *one* column. If you have a query that searches across two columns, like

SELECT * FROM friends WHERE custname='smith' AND phone like '(212)%'
then you need an index like:

CREATE INDEX yet_another_name ON friends (custname, phone)

The order is important. The order of the index much match the order in the query. By the way the index above will also optimize searches on just the custname column alone, so you wouldn't need the custname index we did earlier.

>> what the differences are of each index

The examples so far are what I'd call, for lack of a better term, regular indexes. Now if you have text fields that contain a larger amount of text - say full product descriptions - then regular indexes aren't that great for finding a word or small phrase that may be contained therein (unless they happen to be at or near the beginning of the field). That's where fulltext indexes come to the rescue. Fulltext indexes also give you the ability to sort the matches by relevence.

Ok, stop me now before I write the book ;). HTH.

And as to why a table called friends would have a column called custname, I don't know, but I'm not going back and editing all that now ;)

dcrombie

11:51 am on Dec 14, 2003 (gmt 0)



I spent most of last week optimising a site that's getting a bit too popular for the database server. The process went something like:

* log all queries and their execution time during a peak period
* start with the most popular queries, then the most time-consuming
* make sure you aren't selecting any unnescessary fields
* make sure you aren't linking any unnescessary tables
* don't ORDER BY if you don't have to (PHP can do it post-SELECT)

Then:

* use EXPLAIN to examine your queries
* create an INDEX as explained above
* compare the EXPLAIN results - if it didn't improve then delete the index
* repeat as nescessary

Note: _every_ time you add an index, you need to re-examine any queries that use that table.
;)

grnidone

10:52 pm on Dec 14, 2003 (gmt 0)



Daosmith, thank you for the quickie explaination, JamesA, thank you for the book.

Why the MySQL help docs don't have more examples in it, I'll never understand. The roladex thing makes it so much clearer.

Dcrombie, you said:

Note: _every_ time you add an index, you need to re-examine any queries that use that table.

Are you saying one needs to examine queries in the process you just outlined to see if the index you just made is actually helping the process or is there another reason one needs to do that?

And I've never heard of the EXPLAIN command. *G looks it up...* It says you can use it to see when you need to add indexes to speed up selects, but it is a little cryptic as to how one knows that from doing the command.

Can you go into that a little more? It seems to run along the same subject of this thread.

dcrombie

9:21 am on Dec 15, 2003 (gmt 0)



I use PostgreSQL more than MySQL - because the command-line interface is a million times better - not to mention the documentation ;)

This URL [postgresql.org] might be useful in explaining indexes. The documentation states that:

... a query or data manipulation command can use at most one index per table.

There is a danger then, that when you create a new index, a query will use it in preference to the index you want it to use. There are also cases where a query will run faster with no index. You can force a query to use a specific index - but I'm not up to that level yet ;)