tedster - 5:32 pm on May 30, 2010 (gmt 0) [edited by: tedster at 5:36 pm (utc) on May 30, 2010]
If they don't crawl them, why are there so many URI-only listings when performing site: searches?
Definition: crawl = request the file from the server. Only server logs can tell you what files were crawled.
URI-only listings are not evidence that the document was crawled, only that the existence of the URL is known to Google.
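For anyone who wants to check their own logs against that definition, here is a rough Python sketch. It assumes the Apache/Nginx "combined" log format; the function name crawled_paths and the sample log lines are made up for illustration. It also matches on the user-agent string alone, which can be spoofed, so treat the result as a first pass rather than proof.

```python
import re

# Assumption: the server writes the "combined" access log format.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def crawled_paths(log_lines, agent_token="Googlebot"):
    """Return the set of paths requested by agents whose UA contains agent_token."""
    paths = set()
    for line in log_lines:
        match = LOG_LINE.match(line)
        # Lines that do not fit the assumed format are skipped silently.
        if match and agent_token in match.group("agent"):
            paths.add(match.group("path"))
    return paths


# Hypothetical sample lines, not taken from any real log.
SAMPLE = [
    '66.249.66.1 - - [30/May/2010:17:32:00 +0000] "GET /robots.txt HTTP/1.1" '
    '200 120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
    '66.249.66.1 - - [30/May/2010:17:32:05 +0000] "GET /widgets.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
    '203.0.113.7 - - [30/May/2010:17:33:00 +0000] "GET /private/report.html '
    'HTTP/1.1" 200 900 "-" "Mozilla/5.0 (Windows NT 6.1)"',
]

if __name__ == "__main__":
    for path in sorted(crawled_paths(SAMPLE)):
        print(path)
```

A URL that shows up in a site: search but never appears in this set was, by the definition above, not crawled.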
And the concept of URI discovery brings me to a criticism of the crawling pattern John Mueller tweeted about. Surely Google has a record that a URI was not previously crawled. In that kind of case, why isn't a new check of robots.txt mandatory?
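To make the suggestion concrete, here is a minimal Python sketch of the policy I have in mind. It is not a description of how Googlebot actually works. The class CrawlScheduler, the fetch_robots callable, and the one-day max_age are all assumptions of mine; only urllib.robotparser is a real standard-library module.

```python
from urllib.robotparser import RobotFileParser


class CrawlScheduler:
    """Hypothetical scheduler: re-read robots.txt before any first-time crawl."""

    def __init__(self, fetch_robots, max_age=86400, agent="Googlebot"):
        self.fetch_robots = fetch_robots  # assumed callable returning robots.txt text
        self.max_age = max_age            # assumed cache lifetime in seconds
        self.agent = agent
        self.crawled = set()              # URIs that have been requested before
        self.rules = None
        self.fetched_at = None

    def _refresh(self, now):
        parser = RobotFileParser()
        parser.parse(self.fetch_robots().splitlines())
        self.rules = parser
        self.fetched_at = now

    def may_crawl(self, url, now):
        stale = self.rules is None or now - self.fetched_at > self.max_age
        never_crawled = url not in self.crawled
        # The proposed rule: a never-crawled URI always forces a fresh check,
        # even when the cached robots.txt has not yet expired.
        if stale or never_crawled:
            self._refresh(now)
        allowed = self.rules.can_fetch(self.agent, url)
        if allowed:
            self.crawled.add(url)
        return allowed
```

With this rule, a new section that is blocked in robots.txt at the same moment it goes live would not be requested on the strength of an older cached copy. The cost is one extra robots.txt request per newly discovered URI, which a real crawler would presumably want to batch per host.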