Forum Moderators: open
I am using MySQL and php to manage a large site. Every month i get sent through a csv file with over 250,000 records, this is meant to update the online database of 370,000 with points gained, changes of names etc.
In order to do this i have bought some 3rd party software, which updates the database on a primary key from both the text file and the online database. Problem is this takes a long time to search the entire database for each of the 250,000 records (process takes over 7 days)... groan..
I am looking for a quicker way to turn this around, what i have been thinking of is uploading the 250,000 record update file into a new table in the same database and then merging the two from a script that i am writing, only i dont know of any efficient sql to do this.
Here is an outline of what i was thinking.
table1 = original file
table2 = update file
SELECT
table1.field1,
table1.field2,
table2.field1
WHERE
table1.field1 = table2.field1
(UPDATE
table1
SET
table1.field2 = (table1.field2 + table2.field2)
WHERE ID = table2.ID
)
I know its a bit difficult to see when just written like this but for every record in table2 the corresponding row form table1 must be found, when this is found add a value of a column from table1 to a value of a column on table2. And all this has to execute in less than an hour..
Hope i have made myself clear, if not let me know and i will post more info..
Thanks
Ally
I had a similar scanario that was doing about 30,000 updates in a single line with a set based update and taking 15 seconds , but causing locks on the table during that time so that other users were getting errors.
I changed back to a code loop and it takes a couple of minutes now, but I can error handle easier and users are not locked out at all. The error log shows which rows failed, with db transaction/rollback its all or nothing.
I would usually reccomend doing the SET based updates with transactions and error handling on the database not using code, but in practice its not always do-able.