Forum Moderators: Robert Charlton & goodroi
But I'm guessing that this phenomenon is more correlation than causation.
Remember that every URL has its own PageRank score, so PR is not something that a "site" has.
But a site as a whole does have a crawl budget; at least that's my understanding of the current conventional wisdom. I do remember there was talk about it, and also about "using it wisely", as in not creating too many bad URLs that Gbot crawls only to realize that those are not the URLs to index.
Then they put those URLs in a crawl list which is prioritized by a complex algorithm.
Complex alphabet sorting algorithm ;) LOL.
How would individual ranks of pages explain alphabetized carpet-crawling?
From what I've learned, googlebot does not usually crawl a site by following links the way a human user would. Instead, the crawl team LEARNS about other URLs by indexing pages. Then they put those URLs in a crawl list which is prioritized by a complex algorithm. The most common crawl is one where googlebot is "given its orders" from the beginning - a list of URLs to crawl on the site.
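To make the "crawl list" idea concrete, here is a minimal sketch of a prioritized URL queue. It is only an illustration: the function names and the scoring function are made up for this example, and nobody outside Google knows what the real prioritization looks like. The toy score simply prefers shorter URLs, which happens to reproduce the url-length ordering mentioned in this thread.

```python
import heapq


def build_crawl_list(discovered, score):
    """Return the discovered URLs ordered from highest to lowest priority.

    `discovered` is an iterable of URLs learned while indexing pages.
    `score` is a hypothetical function that maps a URL to a number;
    it stands in for whatever signals a real crawler would use.
    """
    heap = []
    seen = set()
    for order, url in enumerate(discovered):
        if url in seen:
            continue  # each URL goes on the list only once
        seen.add(url)
        # heapq is a min-heap, so negate the score to pop the best first;
        # `order` keeps discovery order for URLs with equal scores.
        heapq.heappush(heap, (-score(url), order, url))
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


def toy_score(url):
    # Made-up signal for illustration only: shorter URLs rank higher.
    return -len(url)


urls = ["/widgets/blue-widget", "/", "/widgets", "/"]
crawl_list = build_crawl_list(urls, toy_score)
print(crawl_list)
```

The point is only that the bot works through a precomputed list rather than following links page by page; swap in any other scoring function and the crawl order changes with it.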
[edited by: tedster at 5:31 am (utc) on Dec 11, 2012]
I have never personally seen GoogleBot crawl in alphabetical order, but I have seen it crawl in url-length order.
@deadsea: good catch! I forgot to mention that one! Also, a combined alphabetical AND URL-length order at the same time - downright creepy. It happens on forum sites all the time and is easier to catch when it gets stuck on slight variations of some very popular topic that comes up over and over. I convert URLs into readable form, same as the title but shortened to 60 characters or less. Perhaps I get caught up in these alpha/length ranges because my URLs are less diverse in size than normal, 'cause I cut down all long ones to exactly 60 chars.
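If you want to check your own logs for this, here is a rough sketch that tests whether a run of crawled URLs is in alphabetical order, url-length order, or both combined. The function names are invented for this example, and it assumes you have already pulled the requested paths out of your access log in the order they were hit.

```python
def is_sorted_by(urls, key):
    """True if the URLs are in non-decreasing order under `key`."""
    keys = [key(u) for u in urls]
    return all(a <= b for a, b in zip(keys, keys[1:]))


def describe_crawl_order(urls):
    """Return the names of the orderings that this crawl run matches."""
    found = []
    if is_sorted_by(urls, key=str.lower):
        found.append("alphabetical")
    if is_sorted_by(urls, key=len):
        found.append("url-length")
    # Combined case: sorted by length first, alphabetically within a length.
    if is_sorted_by(urls, key=lambda u: (len(u), u.lower())):
        found.append("length-then-alphabetical")
    return found


# Hypothetical paths, in the order the bot requested them.
run_a = ["/a-topic", "/b-topic", "/c-topics"]
run_b = ["/zz", "/aaa"]
print(describe_crawl_order(run_a))
print(describe_crawl_order(run_b))
```

Real crawl runs are noisy, so a strict check like this will miss runs with a few out-of-place requests; counting the share of adjacent pairs that are in order would be more forgiving. Also keep in mind that when all URLs are cut to nearly the same length, the length check becomes easy to pass by accident.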