incrediBILL - 1:31 am on Nov 9, 2012 (gmt 0)
The problem with only blocking data centers after you see activity is that there may be some very clever stealth activity, particularly from high-profile scrapers and commercial data miners. You don't see anything obvious in your logs, and it's carefully crafted not to set off any alarms, but some of my automated tools have snared many of them for a variety of reasons. By then the cows are out of the barn: your pages are already in the hands of scrapers, and you're left reacting to the mess they make, time-consuming crud like DMCAs and all that nonsense.
The only proactive thing I do to make life simpler is put tracking bugs in the pages, so one simple search query will spit out all the copied pages that were republished, even when the content is scrambled to avoid Copyscape. That way I can see where each copy landed and connect the dots on how it got there.
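The post doesn't describe how the tracking bugs are built, but a minimal sketch of the general idea is to derive a unique, unlikely-to-occur-naturally token per page from a private secret and stamp it into the markup; searching for that token later surfaces republished copies. The `SECRET` value, the token format, and the comment-based embedding below are all my assumptions for illustration, not the author's actual method:

```python
import hashlib

# Hypothetical private salt so outsiders can't predict or forge tokens.
SECRET = "replace-with-private-salt"

def tracking_token(page_url: str) -> str:
    """Derive a deterministic, distinctive token for one page."""
    digest = hashlib.sha256((SECRET + page_url).encode()).hexdigest()[:12]
    # Wrap the hash fragment in rare letter sequences so a search
    # engine query for the token returns almost no false positives.
    return f"zx{digest}q"

def embed_token(html: str, page_url: str) -> str:
    """Stamp the page's token into the HTML just before </body>."""
    bug = f"<!-- {tracking_token(page_url)} -->"
    return html.replace("</body>", bug + "\n</body>")
```

A scraper that strips comments would defeat this particular embedding, so in practice the token might instead be hidden in visible but innocuous text; the search-query detection step works the same either way.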