Forum Moderators: bakedjake
<title>The New York Times - Breaking News, World News & Multimedia</title>
<meta name="robots" content="noarchive,noodp,noydir">
<meta name="description" content="Find breaking news, multimedia, reviews & opinion on Washington, business, sports, movies, travel, books, jobs, education, real estate, cars & more.">
<head>
<title>The New York Times - Breaking News, World News & Multimedia</title>
<link rel="alternate" type="application/rss+xml" title="RSS" href="http://www.nytimes.com/services/xml/rss/nyt/HomePage.xml">
From our perspective, noarchive content is highly correlated with spam, bait & switch paywalls, and other user-unfriendly material. We believe that omitting this content will result in an overall quality boost in our index.
This is not correct. blekko does not alter the cached view in any way. It shows the exact bytes that we received at the time of crawl.
noarchive is not a panacea for scraping.
[edited by: incrediBILL at 9:54 pm (utc) on Jan 2, 2011]
We've corrected this, so going forward blekko will treat any meta noarchive pages it encounters as meta noindex, and will not index them. This will take a little time before it is pushed to our production servers and makes it into our indices, so please be patient.
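For anyone wanting to see what that policy looks like in practice, here is a rough sketch in Python of a crawler-side check that treats meta noarchive the same as meta noindex. This is not blekko's code; the names RobotsMetaParser, robots_directives, should_index and the noarchive_means_noindex flag are all made up for illustration, and it only looks at the robots meta tag (not X-Robots-Tag headers or robots.txt).

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalize them.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html):
    """Return the set of lowercase robots meta directives found in the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def should_index(html, noarchive_means_noindex=True):
    """Hypothetical indexing decision: noarchive is optionally treated as noindex."""
    directives = robots_directives(html)
    # "none" is commonly treated as shorthand for "noindex, nofollow".
    if "noindex" in directives or "none" in directives:
        return False
    if noarchive_means_noindex and "noarchive" in directives:
        return False
    return True


page = '<head><meta name="robots" content="noarchive,noodp,noydir"></head>'
print(sorted(robots_directives(page)))
print(should_index(page))
print(should_index(page, noarchive_means_noindex=False))
```

With the flag on, a page carrying the robots meta tag quoted earlier in the thread is skipped entirely; with it off, the same page is indexed and only the cached copy would be withheld.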
[edited by: incrediBILL at 11:55 pm (utc) on Jan 2, 2011]
Seriously, why do I care anyway?
Blekko is probably wise not to even deal with that mess and treat noarchive as noindex. This may well be a decision made by the legal team.
There is no confusion. You use noarchive on Blekko and it is the same as noindex.
Why should these guys come along and do something different?
[edited by: TheMadScientist at 11:30 am (utc) on Jan 4, 2011]