Welcome to WebmasterWorld Guest from 54.196.116.152

Forum Moderators: open

Message Too Old, No Replies

Can someone help me understand this please?

Better to update a temporary table, or do live updates?

     
11:43 am on Jan 9, 2013 (gmt 0)

Junior Member

10+ Year Member

joined:May 7, 2005
posts: 143
votes: 0


Hello

I have a lot of data that has to be downloaded from an API and inserted into mysql.

Here's the current setup:

1) download the data to mytable_temp
2) delete all data from mytable
3) Copy data from mytable_temp to mytable

I thought this would prevent data not showing up in users searches during updates.

However, this is causing problems:

1) If my connection breaks during an update (which can take 3 hours) udates don't complete.

2) mysql slow log is large - some visitor queries being affected during the delete and copy to live table phases.

My question is: would it be better to remove the temporary table and update the actual table real-time?

I'm concerned that if the 'red widget' row is being updated at the exact moment a user searches for a 'red widget' the data won't show.

Thanks!

Tom
6:20 pm on Jan 9, 2013 (gmt 0)

Moderator from US 

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 10, 2005
posts:5618
votes: 44


would it be better to remove the temporary table and update the actual table real-time?

No. If you do this and your connection breaks, you could be left with an empty mytable until you reconnect and complete the update.

which can take 3 hours

Which part is taking the longest? Just how much data are you updating?

I have tackled this issue in 2 different ways, depending on the situation:
A) Rename mytable to mytable_old and mytable_temp to mytable

B) Instead of deleting all the data from mytable and copying from mytable_temp, do it in chunks. (This works only if there is a close relationship between the new/old data.) In other words, let's say there is a ProductID field:
DELETE mytable
WHERE ProductID BETWEEN 1 AND 10000

INSERT mytable
SELECT *
FROM mytable_temp
WHERE ProductID BETWEEN 1 AND 10000
2:07 pm on Jan 12, 2013 (gmt 0)

Full Member

10+ Year Member

joined:July 12, 2002
posts:207
votes: 1


A) - I would try this first.

B) - Isn't it better to use TRUNCATE TABLE?
3:47 pm on Jan 13, 2013 (gmt 0)

Moderator from US 

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 10, 2005
posts:5618
votes: 44


B) - Isn't it better to use TRUNCATE TABLE?

TRUNCATE TABLE removes ALL the data in the table. That negates the whole point of doing the update chunks at a time.
12:00 am on Jan 14, 2013 (gmt 0)

Full Member

10+ Year Member

joined:July 12, 2002
posts:207
votes: 1


Yes, I know that it will remove all the data. I was thinking about DELETE and TRUNCATE and thought that it's better than DELETE.

But you are right, it negates the whole point of doing update in chunks.
10:04 am on Jan 14, 2013 (gmt 0)

Junior Member

10+ Year Member

joined:May 7, 2005
posts: 143
votes: 0


Thanks for the help.

I do have a column called productId and I had been using:

WHERE productID = 'AB1234' OR productID = 'AB2234' OR productID = 'AB3234'

I didn't know that you could use alphanumeric characters in BETWEEN

I've changed to:

WHERE productID BETWEEN 'AB1234' AND 'AB4567'

And this has shaved a great deal off the query time.

Thanks again
5:09 pm on Jan 14, 2013 (gmt 0)

Moderator from US 

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 10, 2005
posts:5618
votes: 44


I didn't know that you could use alphanumeric characters in BETWEEN

Just remember that it's doing a character by character comparison. (So 'AB1234' is actually LESS than 'AB125' even though 125 is less than 1234.)

And I assume you already have an index on productID?
9:33 pm on Jan 15, 2013 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member piatkow is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 5, 2006
posts:3332
votes: 23


I don't know MYSQL as my DBA background was on big boxes but a classic way, if there are no referential integrity issues, is to create the temp table as identical to the original and then rename the pair of them.

If an update takes 3 hours then it sounds as if you need a faster connection or a faster box (or both).