Forum Moderators: coopster


update db from csv php function


Scally_Ally

8:41 am on May 4, 2006 (gmt 0)

10+ Year Member



Hi there,
I have a MySQL database with a fair few records in it, which mirrors my client's database (Oracle). I will be receiving a CSV file from them every month showing the new values of each of the records, and also new records that need to be inserted.
My question is this: can I actually do the update from the CSV file to my database using PHP, and if so, what is the best way?
I am thinking there are two ways I can do it. One is the SQL statement LOAD DATA LOCAL INFILE, but my worry is that I would not be able to update the records that need it. The other is to write a PHP function that splits the text file, loops through each line, checks whether the record exists, and deals with it accordingly.

Has anyone ever tried this or would be able to shed any light on the subject and point me in the right direction? Any help greatly appreciated.

Thanks
Ally

jatar_k

2:11 pm on May 4, 2006 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Your worry about LOAD DATA is right; I would parse the CSV and then act accordingly. You would have to check each record, I am assuming, to know whether to insert or update.

try this thread
[webmasterworld.com...] msg2

Those steps talk about writing the data out to a file; you would just need to swap that for the db work.
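
A minimal sketch of that parse-and-decide loop, assuming a `customers` table with a unique key on `customer_id` (the table, column, and file names here are made up). With a unique key in place, MySQL's INSERT ... ON DUPLICATE KEY UPDATE (available from MySQL 4.1) collapses the exists-check and the insert/update into a single query per row:

```php
<?php
// Hypothetical connection details -- replace with your own.
$db = mysql_connect('localhost', 'user', 'pass');
mysql_select_db('campaign', $db);

// Read the CSV one line at a time so a large file
// never has to fit in memory all at once.
$fp = fopen('monthly.csv', 'r');
while (($row = fgetcsv($fp, 4096)) !== false) {
    list($id, $name, $value) = $row;  // assumed column order

    // One query per row: insert the record, or update it
    // if a row with this customer_id already exists.
    $sql = sprintf(
        "INSERT INTO customers (customer_id, name, value)
         VALUES ('%s', '%s', '%s')
         ON DUPLICATE KEY UPDATE name = VALUES(name), value = VALUES(value)",
        mysql_real_escape_string($id),
        mysql_real_escape_string($name),
        mysql_real_escape_string($value)
    );
    mysql_query($sql, $db);
}
fclose($fp);
?>
```

If you would rather keep an explicit exists-check (for example, to log which rows actually changed), a SELECT on the key followed by an INSERT or UPDATE works too; it just doubles the query count.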

Scally_Ally

2:28 pm on May 4, 2006 (gmt 0)

10+ Year Member



Thanks jatar, that is very useful.

It looks like the best way for me will be to split the file via a PHP function as you suggested and then act accordingly on each individual item.

Would there be any problem with the file size that you know of? My client says there will be somewhere around 2 million entries, making the CSV rather large. Are there any implications for the loading time of the page that does the processing, as I imagine it will take a while to loop through all of those records?

I have set all the php.ini variables that need to be set so that it doesn't time out after a few minutes.
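
For a long-running import, those limits can also be raised at the top of the script itself rather than in php.ini (the values below are illustrative):

```php
<?php
// Remove the script execution time limit entirely
// (equivalent to max_execution_time = 0 in php.ini).
set_time_limit(0);

// Raise the memory ceiling for this script only.
ini_set('memory_limit', '256M');

// Reading the CSV with fgetcsv() line by line, rather than
// slurping the whole file with file() or file_get_contents(),
// keeps memory use flat regardless of file size.
?>
```

Note that if the script is invoked through a browser, the web server may impose its own timeout regardless of these settings; running the import from the command line (php import.php) or via cron sidesteps that and avoids tying up a web process for hours.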

Thanks
Ally

jatar_k

2:37 pm on May 4, 2006 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



hmmmm, that is quite a bit

If the client is using Oracle, was there not an option to use Oracle for the site as well?

I imagine this would take a lot of work to change at this point, but it would make life very simple.

Is this only changes? And it's 2 million, man. You might have to split that up before running it.

2 million lines = a minimum of 4 million queries for the update (a SELECT to check whether each record exists, then an INSERT or UPDATE for each).

You had better make sure you do it in off hours; it may take a lot of time, even up to a couple of hours.

Scally_Ally

2:50 pm on May 4, 2006 (gmt 0)

10+ Year Member



The initial insert into the database of all the records, when they finally come to me, will be 2 million plus (it is a nationwide campaign).

When the campaign starts running it will be only updates of each person's data where necessary, so it could be as little as 10,000 records that need updating, and this is only once a month (admittedly it could be over 1.5 million records that need updating).

My worry now, though, is that this will be too much for MySQL to handle. I think their plan was to add more entries into the database at a later date to cope with other countries involved, taking it to over 11 million entries, although this is a way off and I am not worried about it at the moment.

Just as long as I can handle the 2 million entries and do the updates successfully in the way I have mentioned.

What do you think? Will it be able to run smoothly with that many entries?

Thanks
Ally

jatar_k

4:21 pm on May 4, 2006 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



The initial insertion should be done with LOAD DATA then; that will be much quicker.
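
For reference, the initial bulk load might look something like this (the table, columns, and file path are hypothetical; the field and line terminators must match how the CSV was exported, and LOCAL requires local-infile to be enabled on both the client and the server):

```php
<?php
// Hypothetical connection -- replace with your own details.
$db = mysql_connect('localhost', 'user', 'pass');
mysql_select_db('campaign', $db);

// One statement loads the whole initial file, which is far
// faster than millions of individual INSERT queries.
$sql = "LOAD DATA LOCAL INFILE '/path/to/initial.csv'
        INTO TABLE customers
        FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
        LINES TERMINATED BY '\\n'
        (customer_id, name, value)";
mysql_query($sql, $db) or die(mysql_error());
?>
```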

Hopefully it will not be too many updates each time; even 100K is no big deal, though it would still be a lot of queries.

I know people who tell me they run that many records all the time, so I don't think it will be too much of a problem, though if you need to search a lot you may see some slowness.

11 million, hmm, not sure; you might start experiencing some problems there. You will also need to look at the machines this is running on at some point and see if they have the horsepower to handle the volume, especially if you are doing updates through the site.

You could use a non-transactional install of Oracle and there would be no worries at these volumes. As I said, though, that may not be an option. It is definitely something you should consider for the future, I think.

Thing is, it is really hard to say whether you will have issues or not; there are just too many variables involved.

What kind of traffic are we talking about?
How many concurrent users?
What are the server specs?
How many servers are there?
What jobs are the servers doing?
Are you just reading from the db, or are there large numbers of writes as well (excluding the monthly update)?

Those are just a couple I can think of off the top of my head; there are a ton more.

Scally_Ally

4:50 pm on May 4, 2006 (gmt 0)

10+ Year Member



That is really helpful, thanks jatar.

What I will do is just go for it with MySQL. Once the large amount of data is loaded up there will only be one query on the table per visit (login), then I will pass their variables across the site as needed. Also, even though there may be a mid-level amount of traffic, there will not be any writes to the database, only reads by the users; the only writes are done in the administrator section.

Thanks again.
Ally