Welcome to WebmasterWorld Guest from 54.162.226.212

Forum Moderators: open

Message Too Old, No Replies

Why does this query take 80 seconds?

     
8:40 am on Apr 5, 2012 (gmt 0)

10+ Year Member



I have a table with 500,000 rows in it.

- 20 fields
- indexes set up an all the fields in the where

Here's the query

SELECT * FROM table

WHERE (startDate BETWEEN '2012-05-10' AND '2012-05-16')

AND (productCode = 'a1' OR productCode = 'a2' OR productCode = 'a3' OR productCode = 'a4' OR productCode = 'a5' OR productCode = 'a6' OR productCode = 'a7' OR productCode = 'a8' OR productCode = 'a9' OR productCode = 'a10' OR productCode = 'a11' OR productCode = 'a12' OR productCode = 'a13' OR productCode = 'a14') AND value >= '2' AND startDate != '2012-05-13' AND productName != '' AND time = '3' AND hasExtras = '1';

Thanks!
3:13 pm on Apr 5, 2012 (gmt 0)

WebmasterWorld Senior Member whoisgregg is a WebmasterWorld Top Contributor of All Time 10+ Year Member



What do you see when you add an "EXPLAIN [dev.mysql.com]" before that SELECT statement?
10:52 am on Apr 6, 2012 (gmt 0)

5+ Year Member



One of the best ways I've found to speed up a query is creating indexes, which unfortunately you've already done.

I don't know if the "IN" operator will speed things up (although according to the manual, it could), but it does make the query shorter.

Also not sure whether >= and <= are any faster than BETWEEN, but it might be worth an experiment.

Other experiments could be worthwhile, such as moving to the top of the query conditions that will exclude the most records, or conditions that are fastest to evaluate.

Here is a possible rewrite as an example:

SELECT * FROM table
WHERE
productName != ''
AND value >= '2'
AND time = '3'
AND hasExtras = '1'
AND startDate >= '2012-05-10'
AND startDate <= '2012-05-16'
AND startDate != '2012-05-13'
AND productCode IN ('a1','a2','a3','a4','a5','a6','a7','a8','a9','a10','a11','a12','a13','a14')
;
4:09 pm on Apr 6, 2012 (gmt 0)

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



A few questions:
1) Do you really need to select *, or can you get away with selecting fewer fields?
2) Is StartDate a VARCHAR field or DATETIME field?
3) Have you tried rebuilding/reorganizing/refreshing (depending on the terminology of your DBMS) your indexes?
4) What's your DB server configuration like? Do you have sufficient memory and CPU? Are there other users/processes on the DB server that could be slowing things down for you?
10:29 am on Apr 7, 2012 (gmt 0)

5+ Year Member



If this table gets many insertions and deletions, you might also try defragging and compacting it (purging deleted rows) with:

OPTIMIZE TABLE tablename;


I've been assuming MySQL, and only realized from LifeinAsia's comments that that's not necessarily the case. Not sure which of my suggestions might be MySQL-only.
2:53 pm on Apr 7, 2012 (gmt 0)

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



Oh, also noticed the value='2' part. Are all the values in that field numeric? If so, changing the field from a VARCHAR should speed up things a bit.
9:29 am on Apr 10, 2012 (gmt 0)

10+ Year Member



Thank you very much for the replies.

I think a penny might have just dropped.

Using explain, it appears that the ORDER in which I put things in a query has an effect on how many rows are used in a query. If put say a product id first and then the date range, it makes a big impact on the number of rows queried.

Is that correct? If so then I can drastically speed up my queries by re-writing them.
4:16 pm on Apr 10, 2012 (gmt 0)

WebmasterWorld Senior Member whoisgregg is a WebmasterWorld Top Contributor of All Time 10+ Year Member



it appears that the ORDER in which I put things in a query has an effect on how many rows are used in a query

That is correct. In general, you want your more-restrictive, index-using conditions to appear first in your WHERE clause.

That way, whichever portions of the select end up not taking advantage of an index are working on a smaller subset of rows.
12:57 pm on Apr 11, 2012 (gmt 0)

WebmasterWorld Senior Member eelixduppy is a WebmasterWorld Top Contributor of All Time 5+ Year Member



>> That is correct.

Is this true of other enterprise dbms? I would think the query optimizer would pick up on this, doesn't make sense that it wouldn't.

>> One of the best ways I've found to speed up a query is creating indexes

This will work in most cases if the index is properly set up. However, in cases where a large portion of the data is going to be returned from a query, indexing may actually be detrimental to performance.
4:12 pm on Apr 11, 2012 (gmt 0)

WebmasterWorld Senior Member whoisgregg is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I would think the query optimizer would pick up on this, doesn't make sense that it wouldn't.

Most of the time it does. :)

If you are doing just a few simple
field_a='x' AND field_b='y' AND field_c='z'
then you won't notice any difference no matter what order you put them in. I think that's where the query optimizer automatically "fixes" the order.

However, if you are doing very complex queries then the order of the WHERE can make a difference. (I'm talking about nested ORs, subselects, that sort of thing.)
 

Featured Threads

Hot Threads This Week

Hot Threads This Month