incrediBILL - 11:30 am on Oct 13, 2010 (gmt 0)
Timely, as PubCon has a session on this very hot topic again, but it's on the last day, so I hope people stick around and pay attention this time.
Funny thing is most people say they don't care about scrapers until it hits the fan and by then the damage is done.
The emerging business of web scraping provides some of the raw material for a rapidly expanding data economy. Marketers spent $7.8 billion on online and offline data in 2009, according to the New York management consulting firm Winterberry Group LLC. Spending on data from online sources is set to more than double, to $840 million in 2012 from $410 million in 2009.
That's why people scrape - money.
If you stop them from scraping they *may* have to share the wealth to get what they want.
That's the very simple reason my sites have legit bots whitelisted, so everything else, like the "80legs" in their article, gets the bounce. NOARCHIVE is used to prevent scraping from the SE cache, and internet archives are blocked. Plus a whole lot more.
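For anyone wondering what "whitelist the legit bots, bounce the rest" looks like, here is a rough sketch in Python. It is not the actual setup on my sites, only an illustration: the token lists and the names `ALLOWED_BOT_TOKENS`, `BOT_MARKERS`, `is_allowed` and `response_for` are made up for the example. User-agent strings are easily faked, so a real whitelist would also want to verify the claimed bot some other way, for instance by reverse DNS.

```python
# Hypothetical whitelist: user-agent substrings of crawlers we choose to allow.
ALLOWED_BOT_TOKENS = ("googlebot", "bingbot", "slurp")

# Hypothetical markers suggesting an automated client of some kind.
BOT_MARKERS = ("bot", "crawl", "spider", "80legs")


def is_allowed(user_agent: str) -> bool:
    """Return True if the request should be served, False if it gets the bounce."""
    ua = (user_agent or "").lower()
    if not ua:
        # No user agent at all: treat as automated and unlisted.
        return False
    if any(token in ua for token in ALLOWED_BOT_TOKENS):
        # Claims to be a whitelisted crawler (the claim is not verified here).
        return True
    # Looks automated but is not on the whitelist: bounce it.
    return not any(marker in ua for marker in BOT_MARKERS)


def response_for(user_agent: str):
    """Return (status, headers) for a request, as a stand-in for a real server hook."""
    if not is_allowed(user_agent):
        return 403, {}
    # Ask search engines not to serve a cached copy of the page.
    return 200, {"X-Robots-Tag": "noarchive"}
```

The same NOARCHIVE hint can also go in the page itself as a robots meta tag; the header form is used here only because it keeps the sketch in one place.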
Sorry, my job isn't to feed leeches, my resources aren't for being leeched, and if scrapers go broke tomorrow it won't be a day too soon.