Pages are indexed even after blocking in robots.txt


shaunm - 7:47 am on Sep 6, 2012 (gmt 0)


@Shaddows

You use robots.txt to keep Google off your page. It stops them knowing stuff. That's it.

Real-world reasons for employing it include, but are not limited to:
- Preserving crawl budget (CSS files might not need crawling)
- Blocking file directories (/images/)
- Creating bad spider lists (block a directory, link to it in a hidden link, ban anything that finds its way there)

But why do I need to keep them off my pages/files if they can simply ignore robots.txt and still list those pages/files in their SERPs because of external or internal links pointing to them?
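
For reference, here is a minimal sketch of what the rules in that list might look like and what they actually answer. It uses Python's standard urllib.robotparser. The paths /css/ and /bot-trap/, the agent name ExampleBot, and the example.com URLs are all made up for illustration. Note that the check only says whether a well-behaved crawler may fetch a URL. It says nothing about whether that URL can still show up in search results through links from elsewhere, which seems to be the distinction at issue here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt mirroring the three uses listed above:
# skip CSS, block a file directory, and mark a "trap" directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /css/
Disallow: /images/
Disallow: /bot-trap/
"""


def crawl_permissions(robots_txt, urls, agent="ExampleBot"):
    """Return {url: bool} telling whether `agent` may fetch each URL."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return {url: parser.can_fetch(agent, url) for url in urls}


permissions = crawl_permissions(
    ROBOTS_TXT,
    [
        "http://www.example.com/index.html",
        "http://www.example.com/images/logo.png",
        "http://www.example.com/bot-trap/page.html",
    ],
)

for url, allowed in permissions.items():
    print(url, "-> crawl allowed" if allowed else "-> crawl disallowed")
```

A crawler that honours these rules never reads the disallowed pages, so it cannot know their content, but the rules themselves carry no instruction about listing the bare URL.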

I know I've got something wrong, but I still don't understand the context.

Thanks


Thread source: http://www.webmasterworld.com/google/4490125.htm
Brought to you by WebmasterWorld: http://www.webmasterworld.com