incrediBILL - 11:04 pm on Nov 1, 2012 (gmt 0)
Explain to me how a single line of text that disallows a whole domain is more work than dealing with multiple scrapers using dynamic IPs?
Can you say hundreds of domains?
My IBLs are in the many hundreds of thousands and I didn't buy a single one. Many are legit and many aren't, and there are just too many to weed through in the first place.
Just a single scraper using spun content can plaster links over hundreds of junk domains that come and go quickly, and tons of them end up in my supplemental results.
Sad thing is, before I started bot blocking it was MUCH worse!
Many of them still use old content, and some appear to scrape each other or scrape from other sources, so no matter what I do tons of junk still shows up. I have over 100K pages, and trying to track all the bad links from all the sites to all those pages is like trying to herd cats.
I do as much as I can possibly do to stop the scraping in the first place, and I whack the major offenders by hand now and then, but going after every single domain that constantly pops up with my content, and potentially links, would be a part-time job at a minimum. Currently I have automated the detection and blocking of the scrapers that slip through the cracks: I put poison in the content that my bots can locate on other sites, and then they add those sites to my blocking lists to stop further activity. So it's possible I could automate the disavow task too, but I don't have time to even deal with that at the moment.
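For anyone curious, the poison idea boils down to something like this. Just a bare sketch in Python, not my actual code, and the marker string, file name, and URL here are all made up:

```python
import urllib.parse
import urllib.request

POISON_TOKEN = "zx9-pq7-site-marker"    # made-up unique string planted in my pages
BLOCKLIST_FILE = "blocked_domains.txt"  # made-up file the server-side blocking reads

def page_contains_poison(url):
    """Fetch a page and check whether the planted marker shows up in it."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            html = resp.read().decode("utf-8", errors="ignore")
        return POISON_TOKEN in html
    except Exception:
        return False  # unreachable or blocking us; skip it this pass

def update_blocklist(suspect_urls):
    """Append any domain caught republishing the poisoned content to the blocklist."""
    with open(BLOCKLIST_FILE, "a") as f:
        for url in suspect_urls:
            if page_contains_poison(url):
                domain = urllib.parse.urlsplit(url).netloc
                f.write(domain + "\n")

if __name__ == "__main__":
    update_blocklist(["http://scraper-junk.example/stolen-page.html"])
```

The real work is in building the list of suspect URLs in the first place; once you have that, the check itself is this simple.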
TBH, if I obsessed on it more than I already do it would be bordering on a psychosis.
However, automatically detecting and disavowing links from rotten sites could be a useful tool ... maybe I'll move it up the TO-DO list for further feasibility study.
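If it ever gets built, the output side would be trivial: take the domains the poison check catches and dump them into a disavow file. Roughly this, as a sketch rather than anything I've actually written; only the file format is real (Google's disavow format of "#" comments and one "domain:" line per site), the names are invented:

```python
def write_disavow_file(bad_domains, path="disavow.txt"):
    """Write a disavow file covering every detected junk domain."""
    with open(path, "w") as f:
        f.write("# Domains caught republishing scraped content\n")
        for domain in sorted(set(bad_domains)):
            f.write("domain:" + domain + "\n")

write_disavow_file(["scraper-junk1.example", "scraper-junk2.example"])
```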