Pages are indexed even after blocking in robots.txt


Robert_Charlton - 7:39 am on Sep 6, 2012 (gmt 0)


Shaddows - Thanks for your extremely precise and clear restatement of how this works, including this general principle, which is key to this ongoing discussion....

This is one of the areas where precise terminology is key. However, the vast majority of casual conversations (and indeed some official resources) tend to be quite lax.

lucy24 asks...
Would not a person of ordinary intelligence interpret this to mean that a file in a roboted-out directory will stay out of the index, once removed?

The dilemma, lucy, is that there's a difference between a page/file and a reference (url or link) to that page/file. Here's a very complete discussion, from back in May, 2010....

robots.txt - Google's JohnMu Tweets a tip
http://www.webmasterworld.com/google/4143083.htm

As you'll see, robots.txt has its detractors, and the vocabulary needed to be clarified.

In the discussion, I link to a comment from GoogleGuy (Matt Cutts), worth noting again here. It's from back in 2003, when I first encountered the problem...

GoogleGuy, with my emphasis added...
If we have evidence that a page is good, we can return that reference even though we haven't crawled the page.

I was as outraged then as you are now, lucy, and I've seen occasional flare-ups by others over the years.

PS....
Could you please tell me WHEN/WHY DO I NEED A ROBOTS.TXT, THEN? I beg, could anyone please explain it to me precisely?

robots.txt will keep the contents of the pages out of the index.

But if you absolutely want to keep the references/urls/links out of the serps, then robots.txt may not suffice. If you don't mind urls that might (or might not) show in the serps, then tedster's suggestion of how to use robots.txt is just fine. Whether the links show depends on whether unblocked links to these pages exist somewhere on the web.
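
To make the distinction concrete, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt rules, the example.com urls, and the helper name is_crawl_allowed are all made up for illustration. It shows that robots.txt only answers "may this bot fetch the page?"; nothing in it addresses whether a url learned from links elsewhere can be listed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration; the /private/ path is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    This checks crawling only. robots.txt has no rule that says
    "do not list this url", so a blocked url can still be referenced
    if other pages link to it.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The page contents under /private/ may not be fetched...
blocked = is_crawl_allowed(
    ROBOTS_TXT, "Googlebot", "http://www.example.com/private/page.html"
)
# ...while pages outside the disallowed directory may be.
allowed = is_crawl_allowed(
    ROBOTS_TXT, "Googlebot", "http://www.example.com/public/page.html"
)
print(blocked, allowed)
```

The check returns a yes/no on fetching and nothing more, which is why the contents stay out of the index while the bare url may still turn up.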


Thread source: http://www.webmasterworld.com/google/4490125.htm