Forum Moderators: coopster & phranque

Message Too Old, No Replies

Optimize DELETE for speed

Best method for executing?

         

lorax

7:36 pm on Nov 6, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've got a calendar of events in a MySQL db. I want my events page to check for events older than 7 days and delete them from the db. There aren't a lot of events and since the page sees a fair amount of traffic, I don't expect more than a single day's worth of events to be deleted at a time (which equates to less than 12 records).

The question is should I bother with a record count i.e., SELECT count(*) to see if there're any events to delete or should I just go ahead with the delete command. Is there any advantage one over the other or is there a better way to execute this?

jatar_k

8:14 pm on Nov 6, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Think scalable. Put the count in there, you never know how many records there could be somewhere down the road and if there are none there really isn't any reason to run the delete.

lorax

8:18 pm on Nov 6, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



>> Think scalable.

That's a valid point. I'm curious how many records would need to be deleted before I noticed a performance hit. I know count(*) is quick but how quick?

jatar_k

8:19 pm on Nov 6, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



I think you'd have to benchmark it, I have no clue. It is definitely very fast.

lorax

8:27 pm on Nov 6, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I'll do that.

lorax

9:55 pm on Nov 6, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Ok, I did a quick benchmark (not scientifically conducted) for those of you following this thread.

I added 1000 records of average size (all the same). Ran a count - timed the code and then ran a delete - timed that code.

The count took 0.008831 seconds
The delete took 0.421286 seconds

Nearly a half second for 1000 records.

jatar_k

11:26 pm on Nov 6, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



I guess that means, do the count and deleting only if needed.

lorax

2:54 am on Nov 7, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



That would be a safe assumption Jatar_K but the clincher was when I ran the same test with no data to delete. I didn't write down the numbers but I still saved time though it was much less.

amoore

5:51 am on Nov 8, 2002 (gmt 0)

10+ Year Member



Just to be a devil's advocate I will recommend that you don't count the records first. Let's look at a few cases to see why:

- There are no records: In this case, surely you can spare the fraction of a second longer that it may take to delete no records compared to counting zero records. This amount of time will never grow, so you can always depend on being able to deal with it.

- There are a ton of records: Do you really want to run through that table twice? You're going to have to delete them, so why count at all? Plus, this option grows with your data, so you can't really assume that it's a tolerable amount of time to wait.

Therefore, I recommend that you don't count the records first.

That being said, I will recall for you one piece of advice: optimize for the case that is expected to be the most common.

Hope it helps.

lorax

1:40 pm on Nov 8, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Another good point - thanx amoore.

Thinking outloud here:

I haven't changed it yet simply because I was wondering the same thing - actually waffling. I did review my indexes to see if I had them set up right because indexes will significantly affect performance. Since we're talking events - the dates are indexed and that fits the needs of the delete code.

My test was based on 1000 records and I saved roughly 0.4 seconds. We're talking a small community calendar here. There aren't that many events in a year let alone a single day. But if I were to port the code to another website where I use it to, let's say, transfer completed orders from a 'Pending Orders' table to a 'Completed Orders' table, I could conceivably have records in the thousands.

All that being said, though logically I can see your point amoore, I have to wonder two things: did I test the count(*) correctly (is there more of a time savings than I think I found?) and why was count created - what did the good folks at MySQL have it earmarked for?

ukgimp

4:29 pm on Nov 8, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Perhaps this is a stupid question but I have been looking at this delete process and wondering whether there is a difference between the two following scenarios wrt to performance or speed. I can get 1 to work no worries but should I be using 2 in terms of performance.

1.
"SELECT X FROM Y WHERE Z";
$Result=mysql_query( $sql , $db_connection );
$number = mysql_num_rows($Result);
if($number > 0)
{
do delete
}

2.
"SELECT COUNT(*) FROM X WHERE Z";
$Result=mysql_query( $sql , $db_connection );
######
this bit eludes me, how do I use $Result (my stupid bit)
######
if.....statement

Cheers