Welcome to WebmasterWorld Guest from 54.196.147.57

Forum Moderators: open

Message Too Old, No Replies

Can someone help me understand this please?

Better to update a temporary table, or do live updates?

     

Elric99

11:43 am on Jan 9, 2013 (gmt 0)

10+ Year Member



Hello

I have a lot of data that has to be downloaded from an API and inserted into mysql.

Here's the current setup:

1) download the data to mytable_temp
2) delete all data from mytable
3) Copy data from mytable_temp to mytable

I thought this would prevent data not showing up in users searches during updates.

However, this is causing problems:

1) If my connection breaks during an update (which can take 3 hours) udates don't complete.

2) mysql slow log is large - some visitor queries being affected during the delete and copy to live table phases.

My question is: would it be better to remove the temporary table and update the actual table real-time?

I'm concerned that if the 'red widget' row is being updated at the exact moment a user searches for a 'red widget' the data won't show.

Thanks!

Tom

LifeinAsia

6:20 pm on Jan 9, 2013 (gmt 0)

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



would it be better to remove the temporary table and update the actual table real-time?

No. If you do this and your connection breaks, you could be left with an empty mytable until you reconnect and complete the update.

which can take 3 hours

Which part is taking the longest? Just how much data are you updating?

I have tackled this issue in 2 different ways, depending on the situation:
A) Rename mytable to mytable_old and mytable_temp to mytable

B) Instead of deleting all the data from mytable and copying from mytable_temp, do it in chunks. (This works only if there is a close relationship between the new/old data.) In other words, let's say there is a ProductID field:
DELETE mytable
WHERE ProductID BETWEEN 1 AND 10000

INSERT mytable
SELECT *
FROM mytable_temp
WHERE ProductID BETWEEN 1 AND 10000

Dinkar

2:07 pm on Jan 12, 2013 (gmt 0)

10+ Year Member



A) - I would try this first.

B) - Isn't it better to use TRUNCATE TABLE?

LifeinAsia

3:47 pm on Jan 13, 2013 (gmt 0)

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



B) - Isn't it better to use TRUNCATE TABLE?

TRUNCATE TABLE removes ALL the data in the table. That negates the whole point of doing the update chunks at a time.

Dinkar

12:00 am on Jan 14, 2013 (gmt 0)

10+ Year Member



Yes, I know that it will remove all the data. I was thinking about DELETE and TRUNCATE and thought that it's better than DELETE.

But you are right, it negates the whole point of doing update in chunks.

Elric99

10:04 am on Jan 14, 2013 (gmt 0)

10+ Year Member



Thanks for the help.

I do have a column called productId and I had been using:

WHERE productID = 'AB1234' OR productID = 'AB2234' OR productID = 'AB3234'

I didn't know that you could use alphanumeric characters in BETWEEN

I've changed to:

WHERE productID BETWEEN 'AB1234' AND 'AB4567'

And this has shaved a great deal off the query time.

Thanks again

LifeinAsia

5:09 pm on Jan 14, 2013 (gmt 0)

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



I didn't know that you could use alphanumeric characters in BETWEEN

Just remember that it's doing a character by character comparison. (So 'AB1234' is actually LESS than 'AB125' even though 125 is less than 1234.)

And I assume you already have an index on productID?

piatkow

9:33 pm on Jan 15, 2013 (gmt 0)

WebmasterWorld Senior Member piatkow is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



I don't know MYSQL as my DBA background was on big boxes but a classic way, if there are no referential integrity issues, is to create the temp table as identical to the original and then rename the pair of them.

If an update takes 3 hours then it sounds as if you need a faster connection or a faster box (or both).
 

Featured Threads

Hot Threads This Week

Hot Threads This Month