JeffOstroff - 4:48 pm on Dec 25, 2012 (gmt 0)
I think they are already scraping our site directly, not via the Internet Archive. They have scraped pages before they ever appeared in the Internet Archive, that's how I know. We ranked high on lots of popular keywords, so they come to the sites that are ranking high and steal their content. We also get thousands of those Alexa wannabe web sites that post useless whois and ranking data as an excuse to post your title, description, and a few paragraphs scraped from your site.
Anyway, you guys all mentioned backups. I keep backups too, but there is a flaw in your theory. These backups won't help you with your DMCA notices to the web hosts, as they either want to confirm the content is on your site now, or they want an independent 3rd party snapshot to prove your content was there first. That's where Wayback has helped us in shutting down several hundred sites in the last few months. Think of it as a necessary evil. Your fears about the IA robot seem more conspiracy theory than actual practice. They steal blog entries off my site that are not even in the Wayback archive. I'm more concerned about my own site being the source of scraping than the Wayback.
Also guys, just because you have backups of your site from 2 years ago on your PC does not in any way prove to the web host that you are the copyright owner. I don't understand where you're coming from when you say just use your local PC backups. We hit brick walls when we cannot show somewhere online where our copyrighted content currently exists.
GoDaddy, for example, won't even accept Wayback Machine screenshots! You have to supply them with a URL on YOUR SITE that currently shows the same content that the scrapers stole from you. If you cannot produce it, too bad, the scraper site stays up with your stolen content, which you cannot prove was yours simply because it is no longer on your site. So you have it backed up on your PC? So what, they don't care.