Forum Moderators: martinibuster
Use a middleman ad server that loads AdSense. I think Blekko only pulls the page source and doesn't execute JavaScript.
User-agent: ia_archiver
Disallow: /
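A quick way to sanity-check a rule like the one above before emailing anyone is to run it through Python's standard-library robots.txt parser. This is only a sketch: the `is_allowed` helper and the example.com URL are made up for illustration, and it shows how a well-behaved parser reads the rule, not how any particular crawler actually behaves.

```python
from urllib.robotparser import RobotFileParser

# The same two-line rule quoted above.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""

def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    # Hypothetical URL, used only for the check.
    print(is_allowed("ia_archiver", "http://example.com/page.html"))
    print(is_allowed("Googlebot", "http://example.com/page.html"))
```

With only this rule in the file, ia_archiver is shut out of the whole site and every other user agent is unaffected.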
Well, the obvious thing to do, if you don't block Blekko altogether, is to just not serve the AdSense ads to the Blekko bot. Pretty straightforward.
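A rough sketch of that idea, assuming the pages are assembled server-side and the bot identifies itself honestly in its User-Agent header. The substrings in `BLOCKED_AD_AGENTS` are assumptions (check your own logs for what the crawler really sends), and `should_serve_ads` and `render_page` are hypothetical helpers, not part of any framework.

```python
# Assumed User-Agent substrings for the crawler; verify against your server logs.
BLOCKED_AD_AGENTS = ("scoutjet", "blekkobot")

def should_serve_ads(user_agent: str) -> bool:
    """Return False when the User-Agent contains one of the listed bot tokens."""
    ua = (user_agent or "").lower()
    return not any(token in ua for token in BLOCKED_AD_AGENTS)

def render_page(body_html: str, ad_html: str, user_agent: str) -> str:
    """Append the ad markup only for visitors that are not on the list."""
    if should_serve_ads(user_agent):
        return body_html + ad_html
    return body_html
```

This only helps against bots that announce themselves. Anything spoofing a browser User-Agent, or getting the page through a third party, will still see the ad code.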
FYI, if they don't honor the robots.txt removal in the Wayback Machine, you can contact them directly via email and insist on having your sites removed. They will do it; I've done it. Just make sure your robots.txt and whatever else they require is set up properly before contacting them so they can see it's a valid request.
After all, by continuing to make cached pages publicly available they provide an archival service for content scrapers.
Regarding #3 above, is there any reason why a site owner might want any search engine to archive any URL at all -- assuming the Webmaster stays current with 301 redirects, etc.?
Actually, Blekko implemented meta NOARCHIVE improperly. Not only do they not cache your page, they remove it from the index entirely! Idiots!
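For anyone who hasn't used it, this is roughly what the tag looks like when generated from a template helper. The `robots_meta` function is hypothetical. Whether a given engine honors a bot-specific `name` instead of the generic `robots` is something to verify per engine, not something this sketch guarantees.

```python
from html import escape

def robots_meta(directive: str = "noarchive", bot: str = "robots") -> str:
    """Build a robots meta tag; bot='robots' addresses all crawlers."""
    name = escape(bot, quote=True)
    content = escape(directive, quote=True)
    return f'<meta name="{name}" content="{content}">'

if __name__ == "__main__":
    print(robots_meta())               # generic form, read by every crawler
    print(robots_meta(bot="googlebot"))  # addressed to one named crawler only
```

Addressing the tag to a named crawler is one way to keep an engine that mishandles NOARCHIVE from acting on it, assuming that engine ignores tags addressed to other bots.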
All I know is that some of the sites are 100% blocked from my servers yet still get the AdSense ID, so it's being obtained from a 3rd-party service. I'm still working on figuring out who that 3rd party is, since I have Blekko blocked!
I recommend people use private site archival services instead. There are a few, and they can also be referenced when push comes to shove regarding copyright, etc., but they are private and not fodder for scrapers, lawyers, SEOs, etc.