Mysql, index "size" field - Databases forum at WebmasterWorld - WebmasterWorld

Forum Moderators: open

Message Too Old, No Replies

Mysql, index "size" field

csdude55

11:30 am on Jan 17, 2016 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

I'm using PMA to play around with my indexes in an attempt to improve some of the query times.

I've never actually used the "size" field in an index, and really can't find a lot of info on it (probably because "mysql index size" finds results about the data size of the index, not the size field), and the MySQL docs are pretty vague on this. If I understand correctly, though, if I enter a "size" of, say, 6, then the index will store only the first 6 digits of the column.

I have a table with 2 columns:

id -> mediumint(9)
which -> varchar(12)

Both of the columns are in a UNIQUE index (which is used to prevent duplicates in an INSERT query). Column "which" currently has about 50 different possible values, and there are a little over 160,000 rows.

I found that one of my queries ran significantly faster if I added "which" to its own INDEX. Then for testing, I modified that index to a size of 6 (a random number) and found that it ran even faster.

Based on that, I decided to see which "size" would be best. I created 12 indexes on the "which" column; one with a size of 1, one with a size of 2, and so on. Then I used EXPLAIN to show me which index would be chosen by default, and which might give the fastest result.

The query was:

EXPLAIN
SELECT * FROM tableA WHERE which='whatever';

The first query chose the index with the size of 6 so I thought that, for whatever reason, 6 was the magic number. For that query, "whatever" was 9 characters long.

Then I changed the "WHERE which='whatever'" variable to a value that was 5 characters long, and EXPLAIN chose the index with the size of 3.

Then I changed it to a value that was 6 characters long, it decided that none of the "which" indexes were best and instead used the original UNIQUE index.

There doesn't seem to be any rhyme or reason to that!

Before I start playing with my larger tables and indexes, I'd like to figure out the logic behind how this works. How should I decide the "best" size of an index for the fastest result?

bhukkel

6:04 pm on Jan 17, 2016 (gmt 0)

10+ Year Member

I dont know what you mean by index size but:

mediumint(9) = 9 is the display width of the value. It has nothing to do with size. A mediumint is always 3 bytes.
varchar(12) = 12 is max string length it can store

An index has also an order so an index with order field1,field2 will be used for queries with field1 or "field1 and field2" but never for field2 alone. Thats why you need the second index.

csdude55

1:12 am on Jan 18, 2016 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

By "index size", I'm referring to a field in phpMyAdmin.

In PMA, click "Structure", then "+Indexes". In the list of indexes, click "Edit", and the form shows:


Index Name:
Comment:
Index type:

Column  Size

For testing, I plugged in USE INDEX(which) and compared it to USE INDEX(which_6). Using "which" the column "rows" returned 137,169, and using "which_6" returned 122,459.

jadebox

2:01 pm on Feb 10, 2016 (gmt 0)

10+ Year Member

I think the "size" that you are referring to is the number of characters from the column that will be used to build the index. For example, if you create an index on the 'which' column with a size of six, only the first six characters of each value in that column are indexed.

As to speed, it's a trade-off. Indexing less characters will make the index file smaller and quicker to search. But, it may make the query take longer because more more rows may match the index. And, it can be really bad if, for example, all or many of the values begin with the same characters. Making the number of characters larger will increase the size of the index, which will generally make select queries faster, but may make inserts take a little longer.

In any case, I wouldn't create multiple indexes on one column. That would be a waste of disk space and will slow down inserts and updates.

If you really need to speed up queries (as opposed to just wanting them to be as quick as possible), you may need to look into other ways in addition to indexes. If you are dealing with addresses, you might create a separate table for each state or whatever. Then, knowing the state, you'd only search addresses in that specific state.

Or you might do is to include some sort of "hash" column in your table that is indexed and search by that column. As an example, if you had two values often used for searching, you could insert something like MD5(CONCAT(value1 ,value2) into a column. When querying, you'd search for MD5(CONCAT('actualValue1' ,'actualValue2') in your query. Be careful doing tricky stuff like this because you're basically trying to do what the database software already does with indexes (and it probably does it better).

Sometimes, I'll have multiple tables with the same or similar data which goes against the general notion of normalizing data in databases, but can make some queries much faster because there are tables designed specifically for the queries.

My point is that there are often ways to set up your tables and the values in them to optimize them based on how you use them.

csdude55

3:56 am on Feb 11, 2016 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

Making the number of characters larger will increase the size of the index, which will generally make select queries faster, but may make inserts take a little longer.

Is that a typo, or am I misunderstanding the logic? I'm trying to make SELECT queries faster, and I thought that a smaller index would do this?

I only made several indexes for the sake of testing which one would be faster. It doesn't make sense, though; on paper the first 3 characters are unique, but using an index size of 6 was generally the fastest. If I used the index size of 3, it was slower, even though that seems like the most logical choice.

FWIW, the column I'm testing has 51 different possible values, and the maximum VARCHAR size is 12.

jadebox

6:25 pm on Feb 11, 2016 (gmt 0)

10+ Year Member

An index is used to reduce the number of records that the query has to check. A larger index generally results in faster searches because each entry in the index is mapped to fewer possible matches.

For example, assume you were searching a list of common words. Start with a small index that divides the words into two bins (i.e. ones starting with A-M and ones starting with N-Z). When you search for a word like "Test," you would end up looking through all the words that start with N through Z.

Now, if you create a larger index with 26 bins, a bin for each possible first letter, then you would only look at words starting with "T" when searching for "Test."

The second search using the larger index would be faster.