I can't get a join to process as it takes too long. I don't know how long it will take as I stop it after a few minutes...
Two tables in separate databases
Table fruit (4,000,000 records)
Table user (175,567 records)
What I want to do is extract all names from user from a specific location.
e.g. In the fruit table find all records for 'north' and then display the users.user corresponding info (should be 9975 records)
Here is the explain
EXPLAIN SELECT a.* FROM user AS a JOIN food.fruit AS b ON b.date=a.date AND b.fruit=a.fruit WHERE b.location='north'
1,SIMPLE,a,ALL,date,fruit,'','','',175567 Using where;Using join buffer
The first bit looks correct in that it has found the required 9975 rows but the second is not. There are indexes on date and fruit and yet it wants to pull all 175567 records from user rather than the 9975 corresponding records.
EXPLAIN SELECT a.*
FROM food.fruit AS b JOIN user AS a ON b.date=a.date AND b.fruit=a.fruit
You'll probably need an index on location as well.
That pretty much did the same in that it was not using an index on user where it should have.
I changed the join to LEFT JOIN and it sort of works.
EXPLAIN SELECT a.* FROM user AS a LEFT JOIN food.fruit AS b ON b.date=a.date AND b.fruit=a.fruit WHERE b.location='north'
This time it used the index and the select gets the results in 3 seconds.
Obviously this is not right though as it is a left join and showing records from table a which it should not: 10570 rows are returned instead of 9975.
What's in the extraneous records?
This is getting more confusing now. It looks as if there should be 10570 records.
I ran this:
SELECT date,fruit,location FROM fruit WHERE location='north'
and that returns 10570 rows. I don't know why 9975 is returned in the EXPLAIN.
Maybe there are some fruit rows with no associated rows in the user table?
I still got problems with this slow join.
Just to recap: I am trying to extract full record info from a log file. The fruit table (log) looks up records in the user table and a report is generated on the corresponding user records.
Here is the join statement I now using for testing:
SELECT COUNT(*) from users RIGHT JOIN food.fruit using (fruit,date) where fruit.fruit = 'apples' and fruit.location = 'north'
That returned in 150 seconds
The explain showed
fruit Using Where
users Using Where; Using Join
Next, to simplify things I significantly reduced the fruits table in order to eliminate / identify the fruits table as the cause. To do this I deleted all fruit records except apples
11,647 records (previously 4,000,000)
SELECT COUNT(*) from users RIGHT JOIN food.fruit using (fruit,date) where fruit.fruit = 'apples'
That returned in 3 seconds
The explain now shows:
fruit Using Where
users key:DATE, ref: food.fruit.DATE
So this is much faster presumably not just because of the index now in use but because of the much smaller fruit table.
I must still be doing something wrong though. Why is it that a count(*) takes 150 seconds. I haven't even started selecting * yet.
Is the right join on (abc,def) not the correct method? All I am trying to do is to pull master records from a lookup table where a logfile is the source.
I got this fixed now.
There was no fruit index on the Users table! I had another table in there which already had an index called fruit. No wonder mysql would not use it.
join now super fast.
Ah, glad you got it fixed. I didn't have any other suggestions at this point.