Forum Moderators: Robert Charlton & goodroi


Google Crawl Budget - Googlebot Crawling Rates and Demands


engine

3:43 pm on Jan 19, 2017 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Googlebot is designed to be a good netizen, so it doesn't hog resources or slow a site down and degrade the visitor experience, says Google's Gary Illyes in a blog post about Google Crawl Budget.

The blog post goes on to describe what affects the crawl rate, including a site's performance. Search Console limits won't necessarily speed up or slow down a crawl, but they do let Googlebot know if you particularly want crawling slowed down.

Popular URLs will be crawled more often, and the system is devised to keep indexed URLs from going stale.
Factors affecting crawl budget
According to our analysis, having many low-value-add URLs can negatively affect a site's crawling and indexing. We found that the low-value-add URLs fall into these categories, in order of significance:

  • Faceted navigation and session identifiers
  • On-site duplicate content
  • Soft error pages
  • Hacked pages
  • Infinite spaces and proxies
  • Low quality and spam content
Google Crawl Budget - Googlebot Crawling Rates and Demands [webmasters.googleblog.com]
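Faceted-navigation and session-ID URLs are typically kept out of the crawl with robots.txt rules. Here's a minimal sketch using Python's stdlib robots.txt parser, assuming a hypothetical site that serves faceted pages under a `/filter/` path prefix (note: the stdlib parser does plain prefix matching and doesn't model the `*` wildcards that Googlebot also understands):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: steer crawl budget away from faceted
# navigation by disallowing the (assumed) /filter/ path prefix.
ROBOTS_TXT = """\
User-agent: *
Disallow: /filter/
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# The canonical page stays crawlable; faceted variants are blocked.
print(rp.can_fetch("Googlebot", "https://example.com/shoes"))             # True
print(rp.can_fetch("Googlebot", "https://example.com/filter/color-red"))  # False
```

This only stops the URLs being fetched; duplicate pages that should consolidate signals are usually better handled with canonical tags than with robots.txt.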


    There's an FAQ there, too, much of which many will already know, such as whether crawling is a ranking factor. The short answer is no. Any URL on a site counts against the crawl budget, including AMP URLs, CSS, JavaScript, and long redirect chains.
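Since every hop in a redirect chain costs Googlebot a fetch before any content is reached, long chains eat into the budget. A toy sketch of counting those wasted fetches (this is not Googlebot's actual logic, and the redirect map is made up for illustration):

```python
def chain_length(url, redirects, limit=10):
    """Count the fetches needed to resolve `url` through `redirects`,
    a hypothetical mapping of URL -> redirect target. Stops at `limit`
    to guard against redirect loops."""
    hops = 0
    while url in redirects and hops < limit:
        url = redirects[url]
        hops += 1
    return hops

redirects = {
    "/old": "/interim",
    "/interim": "/interim2",
    "/interim2": "/final",
}
print(chain_length("/old", redirects))  # 3 hops before the final page
```

The usual fix is to point every redirecting URL straight at its final destination, collapsing the chain to a single hop.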

    It's worth a quick read over the blog post.

    Additionally, this earlier document on Crawling and Indexing is still relevant.
    [webmasters.googleblog.com...]

    keyplyr

    3:24 am on Jan 20, 2017 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    It would be nice if all SEs did this. For me, however, Googlebot has never been the abuser.

    Currently, Bing is crawling all 300 of my pages every day & requesting many of them several times. Then there are the other 5 or 6 SEs hitting my server several times a week & pounding my image file directories (3k files.)
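One way to see which crawlers are actually hammering a server is to tally hits per user agent from the access log. A minimal sketch with made-up combined-log lines (a real script would read the lines from the log file, and the user-agent strings here are illustrative):

```python
from collections import Counter

# Hypothetical Apache/nginx combined-format log lines.
log_lines = [
    '66.249.66.1 - - [20/Jan/2017:03:10:00 +0000] "GET /page1 HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
    '157.55.39.1 - - [20/Jan/2017:03:11:00 +0000] "GET /page1 HTTP/1.1" 200 512 "-" "bingbot/2.0"',
    '157.55.39.1 - - [20/Jan/2017:03:12:00 +0000] "GET /page2 HTTP/1.1" 200 512 "-" "bingbot/2.0"',
]

BOTS = ("Googlebot", "bingbot")

def crawler_hits(lines):
    """Count requests per crawler, judging by the User-Agent field."""
    hits = Counter()
    for line in lines:
        ua = line.rsplit('"', 2)[-2]  # last quoted field is the UA
        for bot in BOTS:
            if bot in ua:
                hits[bot] += 1
    return hits

print(crawler_hits(log_lines))  # Counter({'bingbot': 2, 'Googlebot': 1})
```

Matching on the user-agent string alone can be spoofed; for a serious audit you'd also verify the crawler's IP with a reverse DNS lookup.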