| 11:09 am on Dec 16, 2009 (gmt 0)|
When you say insert do you mean via a thousand INSERT statements?
Have you tried LOAD DATA INFILE
If it is the program which is transferring the text files to mysql it could be doing the inserts rather than loading the data. It could also be a problem with bufferring / committing to disk on every insert and /or updating of the indexes.
Need more info.
| 8:01 pm on Dec 17, 2009 (gmt 0)|
You probably already thought of this . . . but are your table columns "over-indexed?" Remember indexes speed searches, but can slow down inserts and updates.
The fact that it starts slow and then begins to bog seems to indicate either the program doing it (if you're using one) is not doing so efficiently or it's a server resource issue.
I have one client that regularly flushes and repopulates up to 600K - 1M records in a table, approx. 15-20 fields, mixed types. I use a perl script to do this, and have it report to STDOUT every 1000 lines. It's pretty consistent start to end, but I **do** remember when I first took over the project it had the same problem. We did two things: rebuilt the tables with leaner indexing . . . then dedicated the data to a data server.
| 8:28 pm on Dec 17, 2009 (gmt 0)|
I've been investigating this and finding little to do with the database itself that would cause problems after 1000 pages. I thought perhaps something was getting trashed, but not that I can see.
I'm now going down the road that it's the insert program. I suspect somehow that for every insert, it's doing lookups of all the prior pages (e.g. ensuring that the page isn't already in the database). So for page 2, it does 1 check. For page 1000, it's doing 999 checks. That's the avenue we're going down (though even that seems like it shouldn't cause this much of a delay).
| 7:54 pm on Dec 18, 2009 (gmt 0)|
Oh man . . . that changes things drastically. I had a project similar to this [webmasterworld.com]:
|Two email lists in plain text files, one over a million records, the other from 1000 to 3000 records. Find all addresses that are in both lists and make a third list. |
Process shouldn't take any more than a couple hours, as client's current method can exceed 48 hours. Got it down to 45 minutes, but it involved a combo of text files AND mysql (see last post.)
This is way harder than it seems. Pure mySQL failed miserably, on a dedicated server.
If this is a regular process that needs doing, maybe the inverse of the above is the case - some fields not indexed properly.
It's going to largely depend on the task and the environment, there may be no good way around it.
| 8:02 pm on Dec 18, 2009 (gmt 0)|
I hope it's fixable. It's either mysql, my insertion program, or wordpress. Mysql can probably be fixed, the program can be, if it's wordpress we got a problem.
But it does have to be fixed. I've got a run coming up of 50,000 pages :).
| 5:34 pm on Dec 21, 2009 (gmt 0)|
LOAD DATA INFILE is the way to go...but it's freakin' complicated.
Check your mysql slow queries log for some insights into what is happening....if you're not logging slow queries (you should be!) you can turn it on by editing your my.cnf file and then rebooting the mysql server.