homepage Welcome to WebmasterWorld Guest from 23.20.44.136
register, free tools, login, search, subscribe, help, library, announcements, recent posts, open posts,
Subscribe to WebmasterWorld
Visit PubCon.com
Home / Forums Index / Code, Content, and Presentation / Databases
Forum Library, Charter, Moderators: physics

Databases Forum

    
Slow Join
Frank_Rizzo




msg:4092127
 3:40 pm on Mar 5, 2010 (gmt 0)

I can't get a join to process as it takes too long. I don't know how long it will take as I stop it after a few minutes...

Two tables in separate databases

Database food
Table fruit (4,000,000 records)
---------------
date
fruit
location

Database users
Table user (175,567 records)
---------------
date
fruit
first name
last name

What I want to do is extract all names from user from a specific location.

e.g. In the fruit table find all records for 'north' and then display the users.user corresponding info (should be 9975 records)

Here is the explain

EXPLAIN SELECT a.* FROM user AS a JOIN food.fruit AS b ON b.date=a.date AND b.fruit=a.fruit WHERE b.location='north'

id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra
1,SIMPLE,b,ref,location,location,10,const,9975,Using Where
1,SIMPLE,a,ALL,date,fruit,'','','',175567 Using where;Using join buffer

The first bit looks correct in that it has found the required 9975 rows but the second is not. There are indexes on date and fruit and yet it wants to pull all 175567 records from user rather than the 9975 corresponding records.

 

LifeinAsia




msg:4092164
 4:30 pm on Mar 5, 2010 (gmt 0)

Try:
EXPLAIN SELECT a.*
FROM food.fruit AS b JOIN user AS a ON b.date=a.date AND b.fruit=a.fruit
WHERE b.location='north'

You'll probably need an index on location as well.

Frank_Rizzo




msg:4092279
 7:03 pm on Mar 5, 2010 (gmt 0)

That pretty much did the same in that it was not using an index on user where it should have.

I changed the join to LEFT JOIN and it sort of works.

EXPLAIN SELECT a.* FROM user AS a LEFT JOIN food.fruit AS b ON b.date=a.date AND b.fruit=a.fruit WHERE b.location='north'

id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra
1,SIMPLE,b,ref,location,location,10,const,9975,Using Where
1,SIMPLE,a,ref,date,fruit,date,3,food.b.date,175567,''

This time it used the index and the select gets the results in 3 seconds.

Obviously this is not right though as it is a left join and showing records from table a which it should not: 10570 rows are returned instead of 9975.

LifeinAsia




msg:4092321
 7:55 pm on Mar 5, 2010 (gmt 0)

What's in the extraneous records?

Frank_Rizzo




msg:4092341
 8:37 pm on Mar 5, 2010 (gmt 0)

This is getting more confusing now. It looks as if there should be 10570 records.

I ran this:

SELECT date,fruit,location FROM fruit WHERE location='north'

and that returns 10570 rows. I don't know why 9975 is returned in the EXPLAIN.

LifeinAsia




msg:4092415
 12:08 am on Mar 6, 2010 (gmt 0)

Maybe there are some fruit rows with no associated rows in the user table?

Frank_Rizzo




msg:4101012
 3:05 pm on Mar 19, 2010 (gmt 0)

I still got problems with this slow join.

Just to recap: I am trying to extract full record info from a log file. The fruit table (log) looks up records in the user table and a report is generated on the corresponding user records.

Here is the join statement I now using for testing:

SELECT COUNT(*) from users RIGHT JOIN food.fruit using (fruit,date) where fruit.fruit = 'apples' and fruit.location = 'north'

That returned in 150 seconds

The explain showed

fruit Using Where
users Using Where; Using Join

...

Next, to simplify things I significantly reduced the fruits table in order to eliminate / identify the fruits table as the cause. To do this I deleted all fruit records except apples

Fruits
11,647 records (previously 4,000,000)

Users
175,567 records

SELECT COUNT(*) from users RIGHT JOIN food.fruit using (fruit,date) where fruit.fruit = 'apples'

That returned in 3 seconds

The explain now shows:

fruit Using Where
users key:DATE, ref: food.fruit.DATE

So this is much faster presumably not just because of the index now in use but because of the much smaller fruit table.

I must still be doing something wrong though. Why is it that a count(*) takes 150 seconds. I haven't even started selecting * yet.

Is the right join on (abc,def) not the correct method? All I am trying to do is to pull master records from a lookup table where a logfile is the source.

Frank_Rizzo




msg:4101115
 5:40 pm on Mar 19, 2010 (gmt 0)

I got this fixed now.

There was no fruit index on the Users table! I had another table in there which already had an index called fruit. No wonder mysql would not use it.

join now super fast.

LifeinAsia




msg:4101123
 5:54 pm on Mar 19, 2010 (gmt 0)

Ah, glad you got it fixed. I didn't have any other suggestions at this point.

Global Options:
 top home search open messages active posts  
 

Home / Forums Index / Code, Content, and Presentation / Databases
rss feed

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
Home ¦ Free Tools ¦ Terms of Service ¦ Privacy Policy ¦ Report Problem ¦ About ¦ Library ¦ Newsletter
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved