pageoneresults - 6:07 pm on May 30, 2010 (gmt 0)
No. The URLs disallowed in robots.txt are not crawled.
Time for me to get edumucated. ;)
Okay, if they are not crawled, what is the proper terminology for when Googlebot requests the robots.txt file and acts on its directives?
Definition: crawl = request the file from the server. Only server logs can tell you what files were crawled.
Understood. Googlebot comes to the server, requests robots.txt, and then does what? It takes all those Disallow directives back home with it, right? Then what does it do with them? Does it make URI-only entries in the index? Or is that the result of Googlebot discovering the URI elsewhere?
URI-only listings are not evidence that the document was crawled, only that the existence of the URL is known to Google.
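To make that distinction concrete, consider a robots.txt containing only these two lines (a hypothetical example; the path is made up):

User-agent: *
Disallow: /private/

A compliant bot will never request anything under /private/, but if other pages link to a URL there, Google still learns that the URL exists and can show it as a bare, URI-only listing.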
I need more literal definitions of crawling, indexing, and parsing. I always thought crawling meant the bot requesting the file and performing actions based on the directives in that file?
When a bot, particularly Googlebot, crawls a robots.txt file, what does it do with the Disallow entries?
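Here is a minimal sketch of what a well-behaved crawler does with those entries, using Python's standard-library robots.txt parser. This is an illustration of the general mechanism, not Google's actual implementation, and all URLs and paths in it are hypothetical.

from urllib import robotparser

rp = robotparser.RobotFileParser()
# In practice the bot first requests the file from the server --
# that request is the only "crawl" of robots.txt itself:
#   rp.set_url("https://example.com/robots.txt"); rp.read()
# For a self-contained demo, feed the same rules in directly:
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

# The rules live only in the parser's memory; nothing is indexed.
# The crawler simply consults them before every fetch:
for url in ("https://example.com/public/page.html",
            "https://example.com/private/page.html"):
    if rp.can_fetch("Googlebot", url):
        print("fetch:", url)  # allowed: the document is requested
    else:
        print("skip: ", url)  # disallowed: never requested, though the
                              # bare URL can still surface as a URI-only
                              # listing if it is discovered via links

In other words, the Disallow entries act as a filter applied before each request, which is consistent with the definition above: a disallowed document is never crawled, even though its URL may be known.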