jdMorgan - 3:52 am on Jan 3, 2007 (gmt 0)
Well-behaved robots, including those from major search engines, will not fetch a page if it is Disallowed in a properly-formatted robots.txt file. Robots.txt was originally conceived as a way for Webmasters to prevent robots from consuming excess bandwidth, and to keep them from executing CGI scripts. However, now that the Web has gone commercial, there are many other good reasons to Disallow spiders from fetching various URLs. A second control mechanism exists in the HTML <meta name="robots" content="noindex"> tag. Its function is different: it asks robots not to index the page, and the file containing it must not be Disallowed in robots.txt, or the robots won't be able to fetch it to "read" it. See www.robotstxt.org and w3c.org for authoritative information.
Jim
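To make the Disallow behaviour concrete, here is a minimal sketch of the check a well-behaved robot might run before fetching a URL. It uses Python's standard urllib.robotparser module. The robots.txt rules, the "ExampleBot" user-agent name, the example.com URLs, and the allowed_to_fetch helper are all assumptions made up for illustration; they are not taken from any real site or crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
"""


def allowed_to_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" and the URLs below are placeholders.
    for url in (
        "http://www.example.com/index.html",
        "http://www.example.com/cgi-bin/search.pl",
        "http://www.example.com/private/page.html",
    ):
        verdict = "fetch" if allowed_to_fetch(ROBOTS_TXT, "ExampleBot", url) else "skip"
        print(verdict, url)
```

A compliant robot that gets a "skip" here never requests the page at all, which is why a page relying on the noindex meta tag has to stay out of the Disallow list: the tag can only be honoured if the robot is allowed to fetch the page and read it.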
Robots will go through each and every page on your website they can find, whether you want them to or not. The robots.txt file simply tells the spider not to save a copy of the things listed in it and not to add those pages to its index.