Welcome to WebmasterWorld Guest from 54.196.224.166

Forum Moderators: open

Message Too Old, No Replies

Slow Join

     

Frank_Rizzo

3:40 pm on Mar 5, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I can't get a join to process as it takes too long. I don't know how long it will take as I stop it after a few minutes...

Two tables in separate databases

Database food
Table fruit (4,000,000 records)
---------------
date
fruit
location

Database users
Table user (175,567 records)
---------------
date
fruit
first name
last name

What I want to do is extract all names from user from a specific location.

e.g. In the fruit table find all records for 'north' and then display the users.user corresponding info (should be 9975 records)

Here is the explain

EXPLAIN SELECT a.* FROM user AS a JOIN food.fruit AS b ON b.date=a.date AND b.fruit=a.fruit WHERE b.location='north'

id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra
1,SIMPLE,b,ref,location,location,10,const,9975,Using Where
1,SIMPLE,a,ALL,date,fruit,'','','',175567 Using where;Using join buffer

The first bit looks correct in that it has found the required 9975 rows but the second is not. There are indexes on date and fruit and yet it wants to pull all 175567 records from user rather than the 9975 corresponding records.

LifeinAsia

4:30 pm on Mar 5, 2010 (gmt 0)

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



Try:
EXPLAIN SELECT a.*
FROM food.fruit AS b JOIN user AS a ON b.date=a.date AND b.fruit=a.fruit
WHERE b.location='north'

You'll probably need an index on location as well.

Frank_Rizzo

7:03 pm on Mar 5, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That pretty much did the same in that it was not using an index on user where it should have.

I changed the join to LEFT JOIN and it sort of works.

EXPLAIN SELECT a.* FROM user AS a LEFT JOIN food.fruit AS b ON b.date=a.date AND b.fruit=a.fruit WHERE b.location='north'

id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra
1,SIMPLE,b,ref,location,location,10,const,9975,Using Where
1,SIMPLE,a,ref,date,fruit,date,3,food.b.date,175567,''

This time it used the index and the select gets the results in 3 seconds.

Obviously this is not right though as it is a left join and showing records from table a which it should not: 10570 rows are returned instead of 9975.

LifeinAsia

7:55 pm on Mar 5, 2010 (gmt 0)

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



What's in the extraneous records?

Frank_Rizzo

8:37 pm on Mar 5, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This is getting more confusing now. It looks as if there should be 10570 records.

I ran this:

SELECT date,fruit,location FROM fruit WHERE location='north'

and that returns 10570 rows. I don't know why 9975 is returned in the EXPLAIN.

LifeinAsia

12:08 am on Mar 6, 2010 (gmt 0)

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



Maybe there are some fruit rows with no associated rows in the user table?

Frank_Rizzo

3:05 pm on Mar 19, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I still got problems with this slow join.

Just to recap: I am trying to extract full record info from a log file. The fruit table (log) looks up records in the user table and a report is generated on the corresponding user records.

Here is the join statement I now using for testing:

SELECT COUNT(*) from users RIGHT JOIN food.fruit using (fruit,date) where fruit.fruit = 'apples' and fruit.location = 'north'

That returned in 150 seconds

The explain showed

fruit Using Where
users Using Where; Using Join

...

Next, to simplify things I significantly reduced the fruits table in order to eliminate / identify the fruits table as the cause. To do this I deleted all fruit records except apples

Fruits
11,647 records (previously 4,000,000)

Users
175,567 records

SELECT COUNT(*) from users RIGHT JOIN food.fruit using (fruit,date) where fruit.fruit = 'apples'

That returned in 3 seconds

The explain now shows:

fruit Using Where
users key:DATE, ref: food.fruit.DATE

So this is much faster presumably not just because of the index now in use but because of the much smaller fruit table.

I must still be doing something wrong though. Why is it that a count(*) takes 150 seconds. I haven't even started selecting * yet.

Is the right join on (abc,def) not the correct method? All I am trying to do is to pull master records from a lookup table where a logfile is the source.

Frank_Rizzo

5:40 pm on Mar 19, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I got this fixed now.

There was no fruit index on the Users table! I had another table in there which already had an index called fruit. No wonder mysql would not use it.

join now super fast.

LifeinAsia

5:54 pm on Mar 19, 2010 (gmt 0)

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



Ah, glad you got it fixed. I didn't have any other suggestions at this point.
 

Featured Threads

Hot Threads This Week

Hot Threads This Month