lucy24 - 8:07 am on Sep 3, 2012 (gmt 0)
If the URL is in your sitemap, the page will be crawled.
Are you sure that even though I may block a web page using noindex meta tags, the page will still be indexed if the URL has been included in the SITEMAP?!?
Careful. The whole point of this thread, and all those related ones, is that CRAWL and INDEX are different things.
Google will CRAWL any page that isn't blocked in robots.txt, even if the page is labeled "noindex".
It will INDEX any page that isn't flagged "noindex", even if it can't crawl the page and therefore has no idea what's on it. It will also INDEX a page that IS flagged "noindex" if, again, robots.txt keeps it from crawling the page, because then it never gets to see the "noindex".
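If it helps to see the two rules side by side, here is a rough sketch in Python. It is only a toy model of the logic described above, not how Google actually works. The function name crawl_and_index, the example.com URLs and the sample robots.txt rules are all made up for illustration; the only real piece is Python's standard urllib.robotparser module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for this illustration.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /private/",
]


def crawl_and_index(url, page_has_noindex, robots_lines=ROBOTS_LINES,
                    user_agent="Googlebot"):
    """Toy model: return (can_crawl, may_index) for one URL.

    can_crawl depends only on robots.txt.
    may_index is False only when the crawler can actually fetch the
    page and therefore sees the "noindex" on it.
    """
    parser = RobotFileParser()
    parser.parse(robots_lines)
    can_crawl = parser.can_fetch(user_agent, url)

    # A "noindex" that is never fetched is never seen.
    sees_noindex = can_crawl and page_has_noindex
    may_index = not sees_noindex
    return can_crawl, may_index


if __name__ == "__main__":
    cases = [
        ("https://example.com/public/a.html", True),
        ("https://example.com/public/b.html", False),
        ("https://example.com/private/c.html", True),
        ("https://example.com/private/d.html", False),
    ]
    for url, noindex in cases:
        print(url, "noindex" if noindex else "no tag",
              crawl_and_index(url, noindex))
```

The third case is the trap: the page is blocked in robots.txt AND flagged "noindex", so in this model it can still end up indexed, because the flag is never read.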
Next question: If a page's meta tags say both "noindex" and "nofollow", will g### still crawl the entire page from top to bottom? What excuse does it have? (Maybe that's a non-question. I don't know if the googlebot even has an "off" switch that would let it stop crawling before it reaches the bottom of a page.)