Indexed pages that are disallowed by robots.txt


KenB - 7:12 pm on Jan 6, 2004 (gmt 0)


Here's my robots.txt, made a little generic for these purposes:

User-agent: *
Disallow: /cgi-bin/
Disallow: /wiget1.html
Disallow: /wiget2.html
Disallow: /robot.html
Disallow: /bla/stuff.html
Disallow: /links/
Disallow: /googlereplace.html

The problem is that /wiget1.html and /wiget2.html are showing up in the SERPs even though both are disallowed above.
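
For anyone who wants to sanity-check rules like these, here is a rough sketch using Python's standard urllib.robotparser. It only shows how a compliant parser would read the file above. The helper name blocked_paths, the "ExampleBot" user agent, and the list of test paths are made up for illustration. One differently spelled path is included on purpose, to show that matching is done on the literal path prefix.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from this post, copied as-is.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /wiget1.html
Disallow: /wiget2.html
Disallow: /robot.html
Disallow: /bla/stuff.html
Disallow: /links/
Disallow: /googlereplace.html
"""


def blocked_paths(robots_txt, paths, user_agent="ExampleBot"):
    """Return the paths a compliant crawler may not fetch.

    blocked_paths and "ExampleBot" are hypothetical names for this sketch;
    any agent without its own section falls under the "User-agent: *" rules.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


# Test paths are illustrative; "/widget2.html" differs in spelling from
# the rule "/wiget2.html", so no rule matches it.
checked = ["/wiget1.html", "/wiget2.html", "/widget2.html", "/index.html"]
print(blocked_paths(ROBOTS_TXT, checked))
```

Keep in mind this only checks crawl permission. A Disallow line asks bots not to fetch a URL; it does not by itself keep that URL out of an index if other pages link to it.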

FYI, the /robot.html entry is a honeypot: it catches bots that read the robots.txt file to "find" files they aren't supposed to know about, which helps in targeting the bad guys.
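
A minimal sketch of how hits on a honeypot like that could be pulled out of an access log, assuming the server writes Common Log Format lines. The function name honeypot_hits, the regex, and the sample log lines (with documentation-range IP addresses) are all invented for illustration and are not taken from my actual setup.

```python
import re

# Assumed Common Log Format prefix:
#   client identity user [timestamp] "METHOD path protocol" ...
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|HEAD|POST) (\S+)')


def honeypot_hits(log_lines, trap_path="/robot.html"):
    """Return the sorted client addresses that requested the trap path."""
    hits = set()
    for line in log_lines:
        match = LOG_LINE.match(line)
        # Ignore any query string when comparing against the trap path.
        if match and match.group(2).split("?", 1)[0] == trap_path:
            hits.add(match.group(1))
    return sorted(hits)


# Made-up sample lines, for demonstration only.
sample_log = [
    '192.0.2.10 - - [06/Jan/2004:19:12:00 +0000] "GET /robot.html HTTP/1.0" 200 512',
    '192.0.2.11 - - [06/Jan/2004:19:12:05 +0000] "GET /index.html HTTP/1.0" 200 2048',
    '198.51.100.7 - - [06/Jan/2004:19:13:00 +0000] "GET /robot.html?x=1 HTTP/1.0" 200 512',
]
print(honeypot_hits(sample_log))
```

Since nothing on the site links to the trap page, any address in that list most likely got the path from robots.txt itself.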


Thread source: http://www.webmasterworld.com/robots_txt/229.htm