- Google
-- Google SEO News and Discussion
---- No incoming links = reduced crawl rate?


deadsea - 5:13 am on Dec 11, 2012 (gmt 0)


From what I've learned, Googlebot does not usually crawl a site by following links the way a human user would. Instead, the crawl team LEARNS about other URLs by indexing pages. Then they put those URLs in a crawl list, which is prioritized by a complex algorithm. The most common crawl is one where Googlebot is "given its orders" from the beginning: a list of URLs to crawl on the site.


When you launch large sections of new content, Googlebot most certainly follows links within that section before the pages are indexed. I've run some experiments in which I launched chains of pages where each page links to the next. Googlebot marched down the chain, one page after another, through all 1000 pages I had in the chain on my site, all well before the pages were indexed. I call this behavior "Fresh Googlebot".
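
For anyone who wants to try a similar chain test, here is a minimal sketch of how such a chain could be generated. It is not the exact setup I used: the `/chain/` path, the `page-N.html` file names, and the function names are all made up for illustration.

```python
def chain_page(index, total):
    """Return minimal HTML for page `index` of a `total`-page chain."""
    if index < total:
        # Each page links to the next page in the chain (hypothetical URL scheme).
        next_link = f'<a href="/chain/page-{index + 1}.html">Next page</a>'
    else:
        next_link = ""  # the last page ends the chain
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>Chain page {index}</title></head>\n"
        f"<body><h1>Chain page {index}</h1>{next_link}</body></html>\n"
    )


def build_chain(total=1000):
    """Map each hypothetical file name to its HTML, for pages 1..total."""
    return {f"page-{i}.html": chain_page(i, total) for i in range(1, total + 1)}
```

Write the returned pages into a directory served at `/chain/`, link to `page-1.html` from somewhere Googlebot already visits, and then watch the server logs to see how far down the chain it gets.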

After this initial crawl, in which Googlebot follows all the links it can find and greedily crawls the new pages, it reverts to crawling based on PageRank. Pages with higher PageRank are re-crawled more frequently. I call this behavior "PageRank Googlebot".
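
To make "higher PageRank means more frequent re-crawls" concrete, here is a toy model. It is purely an illustration and an assumption on my part, not Google's actual scheduler: the scores, the `min_interval_days` parameter, and the rule "revisit interval = minimum interval / score" are all hypothetical.

```python
import heapq


def simulate_recrawls(scores, horizon_days, min_interval_days=1.0):
    """Count re-crawls per URL over `horizon_days` in a toy scheduler.

    `scores` maps URL -> a PageRank-like score in (0, 1] (hypothetical values;
    a score of 0 is not supported). Higher score means a shorter interval.
    """
    counts = {url: 0 for url in scores}
    intervals = {url: min_interval_days / score for url, score in scores.items()}
    # Min-heap of (day the URL is next due, URL).
    queue = [(interval, url) for url, interval in intervals.items()]
    heapq.heapify(queue)
    while queue and queue[0][0] <= horizon_days:
        due, url = heapq.heappop(queue)
        counts[url] += 1
        # Schedule the next visit one interval later.
        heapq.heappush(queue, (due + intervals[url], url))
    return counts


counts = simulate_recrawls({"/": 1.0, "/category": 0.5, "/deep/page": 0.25}, 30)
```

In this toy run the highest-scored URL is revisited most often and the lowest-scored one least often, which is the pattern I see in my logs once the initial crawl is over.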

I have never personally seen Googlebot crawl in alphabetical order, but I have seen it crawl in URL-length order. For me this happens with a batch of old URLs that have no current inbound links but that existed on the site at one point in time. (In my case they all 301 redirect to new URLs now.) Googlebot will often crawl 1000 of these pages in a sitting, one right after the other, in URL-length order, starting with the shortest URLs. I call this behavior "Stale Googlebot".
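
If you want to check your own logs for this pattern, a rough sketch follows. It assumes you have already pulled the requested paths for Googlebot out of your access log, in the order they were requested; the function names are made up for illustration.

```python
def is_url_length_ordered(paths):
    """True if each requested path is at least as long as the one before it."""
    lengths = [len(p) for p in paths]
    return all(a <= b for a, b in zip(lengths, lengths[1:]))


def longest_length_ordered_run(paths):
    """Return (start_index, run_length) of the longest run of requests
    whose path lengths never decrease."""
    if not paths:
        return (0, 0)
    best_start, best_len = 0, 1
    run_start = 0
    for i in range(1, len(paths)):
        if len(paths[i]) < len(paths[i - 1]):
            run_start = i  # a shorter URL breaks the run
        run_len = i - run_start + 1
        if run_len > best_len:
            best_start, best_len = run_start, run_len
    return (best_start, best_len)
```

A long run (hundreds of requests) over URLs that all 301 would match what I describe above. Short runs happen by chance, so don't read much into them.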

[edited by: tedster at 5:31 am (utc) on Dec 11, 2012]


Thread source: http://www.webmasterworld.com/google/4526652.htm
Brought to you by WebmasterWorld: http://www.webmasterworld.com