homepage Welcome to WebmasterWorld Guest from 54.211.219.178
register, free tools, login, search, pro membership, help, library, announcements, recent posts, open posts,
Become a Pro Member

Home / Forums Index / Code, Content, and Presentation / Databases
Forum Library, Charter, Moderator: open

Databases Forum

    
Index in join query
Index in join query
fahad direct




msg:4217400
 12:02 am on Oct 16, 2010 (gmt 0)

Hi,I am using left join in my 2 tables having around 2 million records which takes around 8 hours to run,

SELECT t1.id,t1.expense FROM t1 LEFT JOIN t2
ON (t1.expense=t2.expense AND t1.code=t2.code AND t1.name=t2.name AND t1.price=t2.price AND t1.charges=t2.charges AND t1.url=t2.url) WHERE t2.id is NULL

What should be the best index for this query as i haven't indexed yet to speed up the process from hours to minutes.

 

DWarp9




msg:4217767
 11:03 pm on Oct 16, 2010 (gmt 0)

The first thing I notice when reading your post, is the large amount of columns you join on, which probably means a great deal of duplicate data between your tables. Have you considered whether it is possible to denormalize your tables? If the multiplicity of the relationship between the tables is small, that might be a good option.

That said, it is usually a good idea to put indexes on the columns you seek on, e.g. those that you join on (expense, code, name, price, charges, url and id).
Specifically, you could try creating a compound index on each table containing the columns you join on (expense, code, name, price, charges, url in that order), and an index on the id column of table t2.

Hope it helps.
-Peter

fahad direct




msg:4217852
 4:53 am on Oct 17, 2010 (gmt 0)

Thanks Peter, I have tried by putting index on field id of both tables and checked by explain query which worked well and time from 8 hours have been decreased to around 40 minutes which is great but if i can do it more tweak to get more fast and optimised results.

DWarp9




msg:4217898
 10:31 am on Oct 17, 2010 (gmt 0)

Glad to hear it! That's 10x improvement... Let's see if we can get another 10x. ;-)

If you haven't tried it yet, I would try the compound index, that is one index covering all the columns you join on in the same order as you use them in your query. In your case, one index covering expense, code, name, price, charges and url in that order.

How many transactions do the tables get?
Also if you could post your table definitions I might be able to provide more help.

fahad direct




msg:4245543
 10:08 am on Dec 23, 2010 (gmt 0)

It works fine and gets back to me with in 30 minutes in case of 1 million records but if it is touching 1.5 million of the records it never finishes.

Below are explain results, it returns me 2 rows as:

id=1, select_type=SIMPLE, table=t1, type=ALL, possible_keys=NULL, key=NULL, key_len=NULL, ref=NULL, rows=1548249, extra=''


id=1, select_type=SIMPLE, table=t2, type=ref, possible_keys=(model,mer) key=mer, key_len=304, ref=t1.id, rows=1, extra=Using where; Not exists

t1 table size= 704MB, t2 table size= 521MB

Any help please.

DWarp9




msg:4245559
 12:06 pm on Dec 23, 2010 (gmt 0)

Well, it seems you are joining on a table with 1.5 million rows without ever using an index. This is bad news.

My best suggestion - which could be better if your provided you database schemata - is to get rid of t2 all together and just add a boolean column to t1. Instead of adding a row to t2 you would just check this field (update it to true). Finally you would add an index to this column and do the query as such:

Select id, expense From t1 Where newColumn = false;

I guarantee you this query will be fast, even with 1.5 mill rows. :)

Alternatively, the big index on all your joined columns are my only suggestion. But let my know how it works.

fahad direct




msg:4245610
 4:00 pm on Dec 23, 2010 (gmt 0)

Thanks Peter, I have to actually compare the records of 2 tables each having approximately 1.5 million records. I am joining both tables and finding updated or new records in t1 table.
I am going with your second option and already using the big index on all my joined columns.

Global Options:
 top home search open messages active posts  
 

Home / Forums Index / Code, Content, and Presentation / Databases
rss feed

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
Home ¦ Free Tools ¦ Terms of Service ¦ Privacy Policy ¦ Report Problem ¦ About ¦ Library ¦ Newsletter
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved