Greetings. I'm seeing something strange: Googlebot seems to visit the same set of 10-20 pages each day, sometimes two or three times a day. This is on a site with thousands of pages. Is there any specific reason why this might be? All thoughts and ideas are appreciated. Thanks.
The rate at which Googlebot crawls a page seems to be directly proportional to the PageRank of the page. There appear to be several distinct modes of crawling for Googlebot:
1) Crawling new pages. When you launch many new pages, Googlebot will crawl lots of pages in the new section. The depth to which Googlebot crawls depends on the PageRank of the page(s) linking into that section. If you link into the new section from a PR 5 page, I would expect Googlebot to crawl a couple thousand new pages in this mode.
2) Return crawling. After the initial crawl of pages, Googlebot will return and recrawl the pages. The frequency with which it returns is proportional to the PageRank of the page. Here is my rough estimate of frequency by PageRank:
PR 6 - Multiple times a day
PR 5 - Once a day
PR 4 - Once every two days
PR 3 - Once every three days
PR 2 - Once a week
PR 1 - Once a month
3) Validity checking old URLs. Occasionally Googlebot seems to find a dusty box in the attic containing disused URLs on my site. Typically these are retired URLs that now redirect or 404 and no longer have any links anywhere on the web. Google may crawl 1000 or more of these in a batch, typically crawling the shortest URLs in the batch first.
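If you want to see which of these modes you are looking at, one option is to tally Googlebot requests per URL and status code from your access log. Here is a minimal sketch, assuming the Apache/nginx "combined" log format. The function name `googlebot_hits` and the regular expression are my own illustration, and matching on the user-agent string is only a rough filter, since anyone can send a fake Googlebot user agent.

```python
import re
from collections import Counter

# Assumes the "combined" log format: request line, status, size,
# referrer, then the user agent as the last quoted field.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) .*"(?P<agent>[^"]*)"$'
)

def googlebot_hits(lines):
    """Count requests per (path, status) whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line.strip())
        # Substring check only; this does not verify the crawler is genuine.
        if match and "Googlebot" in match.group("agent"):
            hits[(match.group("path"), match.group("status"))] += 1
    return hits

if __name__ == "__main__":
    # "access.log" is a placeholder path; point it at your own log file.
    with open("access.log", encoding="utf-8", errors="replace") as log:
        for (path, status), count in googlebot_hits(log).most_common(20):
            print(count, status, path)
```

A handful of URLs with many 200 responses would point to return crawling, while a batch of 301 or 404 responses on old URLs would look more like the validity checking described in 3).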