topr8 - 11:40 am on Nov 7, 2010 (gmt 0)
Many sites go to great lengths to prevent scrapers from stealing their content.
These same sites also generally prevent the major search engines from caching their pages by using the robots noarchive tag:
<meta name="robots" content="noarchive">
Notice how WebmasterWorld doesn't have a 'cached' link in the SERPs; this is an example of noarchive in use. All the major search engines support it.
The reason is that a search engine cache is a well-known backdoor for scrapers, who can lift your content from the cache instead of directly from your site.
However, blekko, the new search engine, has decided that it will not respect the noarchive tag.
I approached blekko to ask them about this, and Robert Saliba of Blekko Inc said:
"we think that the meta noarchive tag is counter to providing our users with transparent information
regarding the ranking and display of search results."
Luckily for web admins who do use the noarchive tag, he had a solution, as he also said:
"We also want to respect the wishes of website administrators. Accordingly,
we are making changes so that In the future, we will treat the meta
noarchive tag as a meta noindex tag."
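To spell out what that change means in practice, here is a sketch of the two tags side by side as they would appear in a page's head. The noarchive tag normally lets a page rank while suppressing the 'cached' link; per Saliba's statement, blekko will now treat a page carrying noarchive as if it carried noindex, dropping it from blekko's results entirely.

```html
<head>
<!-- noarchive: page may be crawled, indexed, and ranked,
     but engines should not show a 'cached' copy of it -->
<meta name="robots" content="noarchive">

<!-- noindex: page should be excluded from search results entirely.
     Blekko says it will treat the noarchive tag above as if it
     were this tag instead. -->
<meta name="robots" content="noindex">
</head>
```

So site owners using noarchive keep their cache protection, but at the cost of not appearing in blekko at all.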