Page is a not externally linkable
- Code, Content, and Presentation
-- Databases
---- Massive duplicate data search


TheMadScientist - 4:56 pm on Mar 24, 2011 (gmt 0)


COUNT(id)

1.) Add a non-unique index as suggested.

2.) MySQL counts are optimized for count(*) if I remember correctly (double check), but I think I remember reading the count of indexed data is optimized for a count(*) over a specific col. (I don't remember all the ins and outs, but I know for certain queries related to counts it is -or was- better to use the * than something more specific because of the way an index is created.)

3.) Chunk if necessary, as coopster suggests.

I've looped through text before and actually had a low timeout to work with, so I set the loop to where it would complete in slightly less than the timeout limit reliably, put it on an HTML page with a meta refresh that fired at slightly less than the timeout, opened it in a browser window and let it run.

I cut down manual refreshes from once every timeout to once every 30 minutes or so.


Thread source:: http://www.webmasterworld.com/databases_sql_mysql/4270978.htm
Brought to you by WebmasterWorld: http://www.webmasterworld.com