
SQL Server processes hanging on batch updating...

macrost

2:16 pm on Aug 26, 2004 (gmt 0)

10+ Year Member



I am on a shared SQL server and I need to batch update some 54,000 records. A one-record update works just fine, but when I start the whole script, the processes hang in SQL Server and block one another: specifically the select statement where I grab all of the IDs, and then the update statement.

The select statement always blocks the update statements.
Is there something obvious that I'm missing?
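For reference, sp_who2 is how I can see the blocking - the BlkBy column shows which spid each stuck process is waiting on, and DBCC INPUTBUFFER shows what that spid last ran (the spid number below is made up):

-- list all server processes; BlkBy shows the spid each blocked process waits on
exec sp_who2

-- inspect the last statement run by the blocking process (hypothetical spid 57)
dbcc inputbuffer(57)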

duckhunter

5:17 pm on Aug 26, 2004 (gmt 0)

10+ Year Member



First, add the "with (nolock)" hint to your select. When updating or deleting a large number of rows, it's best to use a cursor to cycle through the records.
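Something like this for the select, with your table and column names swapped in (mine here are made up). Keep in mind nolock reads uncommitted data:

-- nolock skips shared locks, so the IDs returned may include rows mid-update
select id from tableX with (nolock) where UpdateableField = 0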

I posted an example here: [webmasterworld.com ]

Lord Majestic

7:00 pm on Aug 26, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



err, using a cursor for performance SQL?!?!?

If you want to update X records in a live table that is being heavily read all the time, then I'd suggest a few options:

1) Update in batches of Y rows (100-500)

declare @counter int, @step int, @totalrows int, @updatetime datetime

select @counter = 0, @step = 100, @totalrows = 54000, @updatetime = getdate()

-- limit every subsequent update to @step rows at a time
set rowcount @step

while @counter < @totalrows
begin

select @counter = @counter + @step

-- due to the rowcount, only @step rows are updated per pass; each pass
-- stamps LastUpdated = @updatetime, so already-updated rows no longer
-- match the where clause on the next pass
update tableX set UpdateableField = 1, LastUpdated = @updatetime
where LastUpdated < @updatetime

end

-- restore the default behaviour (no row limit)
set rowcount 0

2) Create two identical tables, only one of which is live at any time; the other is free to be updated in any way you want. Once you have updated the second table, you "switch" it to be live. We used to do that with a value stored in a separate options table.
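As a rough sketch of that switch (all names made up here; we drove it from the options table) - the switch itself is a single-row update, and readers look up the live copy before querying:

-- flip the flag: ProductsB becomes the live copy
update SiteOptions set LiveTable = 'ProductsB'
where OptionName = 'ProductsLiveTable'

-- readers resolve the live table name first, then query it
declare @live sysname
select @live = LiveTable from SiteOptions where OptionName = 'ProductsLiveTable'
exec('select count(*) from ' + @live)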

3) Use dirty reads on the select (if your DB supports it) - this reads data from locked rows/pages (and, I think, even locked tables), but it's dangerous because the values you read might change a second later - certainly not recommended for anything related to money!
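If you'd rather set it per session than per query (the per-query route is the with (nolock) hint), something like this - names made up:

-- everything this session reads from now on is a dirty read
set transaction isolation level read uncommitted

select id from tableX where UpdateableField = 0

-- put the default back when done
set transaction isolation level read committed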

digitalv

7:24 pm on Aug 26, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Can you give us a little info about how your database is set up? 54,000 records shouldn't cause a properly indexed database to hang, even if you're updating a ton of fields at once. I need to know the following:

(1) Table setup (field names, field type)
(2) Which field is the primary key?
(3) Which fields are indexed?
(4) Are there any relationships to other tables?

If you'll give me this information and show me your update statement, I'll find a solution :)
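You can pull most of that straight from the server with the standard system procs (swap in your real table name for the made-up one below):

-- columns, types, constraints, and identity info
exec sp_help 'tableX'

-- every index on the table and the columns each one covers
exec sp_helpindex 'tableX'

-- foreign key relationships referencing the table
exec sp_fkeys @pktable_name = 'tableX'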

duckhunter

8:19 pm on Aug 26, 2004 (gmt 0)

10+ Year Member



>> using cursor for performance SQL

Performance didn't seem to be macrost's problem. I'm assuming he was running the update once, not frequently, considering it's 54K rows.

Yes, there are other options that require a little more coding and planning but the cursor update approach is quick to implement and should eliminate the blocking problem.

>> 2 identical tables?!?!?!

Sounds like a recipe for data integrity problems and slower performance than a cursor update. That shouldn't be necessary unless you need ALL the data to appear updated simultaneously.

Lord Majestic

8:37 pm on Aug 26, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>> but the cursor update approach is quick to implement and should eliminate the blocking problem.

Perhaps something has changed, but traditionally on Sybase (Microsoft licensed their code until the full rewrite in MS SQL Server 2000, or a bit earlier? - but they still keep TSQL) the use of cursors was just an invitation for disaster: by their nature they take locks far more often than granular updates in batches like the ones I showed above. The whole concept of cursors is alien to a relational database. Cursors were created to make SQL easier for programmers, because it was easier for them to operate in a "row after row" mindset rather than the "do it with sets" way.

There is no need for cursors, because anything they do can be implemented in TSQL with better efficiency - it might take more time to code (and to get used to), but performance will be better. Our DBA was an excellent guy who had worked for blue-chip banks in the City (the UK's equivalent of Wall Street), and they were real hardcore on performance SQL - cursors were pretty much banned.
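For contrast, here's the row-by-row cursor style I'm arguing against (made-up names) - one statement and one lock acquisition per row instead of per batch:

declare @id int
declare idcursor cursor for
select id from tableX where UpdateableField = 0

open idcursor
fetch next from idcursor into @id
while @@fetch_status = 0
begin
-- one single-row update per iteration, 54,000 times over
update tableX set UpdateableField = 1 where id = @id
fetch next from idcursor into @id
end
close idcursor
deallocate idcursor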

>> Sounds like a recipe for data integrity problems and slower performance than a cursor update.

Nope - this is how many major sites with a LOT of data operate in order to stay fully available 24/7. You only need to do it for the main product lines table, or the search engine table, or whatever is being significantly updated. We had it on our site and it worked a treat despite multi-million-row full reloads that we could do at any time (as opposed to off-peak only). It's overkill for small sites, however, and 54K rows is small.

duckhunter

8:41 pm on Aug 26, 2004 (gmt 0)

10+ Year Member



>> 54K rows is small

Yes, it is. No need to bend over backwards for that small of an update.

Digitalv is on the right track. One question to ask is why 54K rows are taking so long to update. Is your table indexed on the fields in the update statement's WHERE clause?
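If not, something as simple as this can make the difference (made-up names - index whatever column the where clause filters on):

-- let the update locate its rows via an index seek instead of a table scan
create index IX_tableX_LastUpdated on tableX (LastUpdated)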

Lord Majestic

8:50 pm on Aug 26, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yeah - in addition to this, check the following:

1) that the updated columns are fixed-length (otherwise updates might not be done "in place" - the fastest option)

2) no funny triggers that do lots of work on every update/insert

3) no excessive indexing on updated columns

4) transaction log is not too small

5) disk(s) are defragmented
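A couple of these are quick to check from a query window (swap in your table name):

-- (2) list any triggers defined on the table
exec sp_helptrigger 'tableX'

-- (4) current size and percent-used of every database's transaction log
dbcc sqlperf(logspace)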

macrost

9:41 pm on Aug 26, 2004 (gmt 0)

10+ Year Member



Ok, sorry - I was working on the problem all day and got the bug worked out. I believe it was the 54K round trips to the SQL server that caused it to piss on me.

I used adLockBatchOptimistic on the recordset, so the changes are cached client-side and sent to the server in a single UpdateBatch call rather than one round trip per row.