Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

Is slowing down Googlebot with a custom crawl rate a good idea?

 8:12 am on Apr 17, 2012 (gmt 0)

My logs show that over the past three days Googlebot, from a verified IP, has been crawling each page an average of twice per day. Many of these pages are years old, with content that never changes or is unlikely to change, though I suspect Google is looking for new comments. Proper Expires headers apparently make no difference to the crawl rate.
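As an aside, the "verified IP" check can be automated with the reverse-then-forward DNS lookup that Google documents for bot verification. A minimal sketch (not the poster's actual tooling):

```python
import socket

def is_verified_googlebot(ip):
    """Reverse-DNS the IP, check the hostname sits under googlebot.com or
    google.com, then forward-resolve the hostname to confirm it maps back."""
    try:
        host, _, _ = socket.gethostbyaddr(ip)       # reverse lookup (PTR)
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        return socket.gethostbyname(host) == ip     # forward-confirm
    except (socket.herror, socket.gaierror):
        return False                                # no PTR record, etc.

print(is_verified_googlebot("127.0.0.1"))  # False: resolves to localhost
```

A plain user-agent match isn't enough, since the string is trivially spoofed; the forward-confirm step is what stops a spoofer who controls their own PTR record.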

Is setting a slower crawl rate in GWT a dangerous thing in terms of losing rank? With several thousand pages, that's a lot of extra requests. Bing and Yahoo crawl about half as many pages on average, and I see no reason why Googlebot needs to check a page twice daily right now.
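For anyone wanting to reproduce that per-page average from raw logs, a rough sketch over hypothetical combined-log lines (a real run would stream the access log from disk and verify IPs):

```python
import re
from collections import Counter

# Hypothetical Apache combined-log lines, for illustration only.
LOG_LINES = [
    '66.249.66.1 - - [17/Apr/2012:08:00:00 +0000] "GET /page-a HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
    '66.249.66.1 - - [17/Apr/2012:20:00:00 +0000] "GET /page-a HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
    '66.249.66.1 - - [17/Apr/2012:09:00:00 +0000] "GET /page-b HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
]

def crawls_per_page(lines, days=1):
    """Count Googlebot requests per URL, averaged over the log window."""
    hits = Counter()
    for line in lines:
        if "Googlebot" not in line:
            continue                       # skip other user agents
        m = re.search(r'"(?:GET|HEAD) (\S+)', line)
        if m:
            hits[m.group(1)] += 1
    return {url: n / days for url, n in hits.items()}

print(crawls_per_page(LOG_LINES))  # {'/page-a': 2.0, '/page-b': 1.0}
```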


Andy Langton

 11:45 am on Apr 17, 2012 (gmt 0)

I doubt you will see any ranking difference, but the best idea is usually just to let Google do its thing.

Typically, a high recrawl rate is a function of both frequently updated content and a degree of "trust" or "authority" in the site itself, so it's usually a positive sign. Unless bandwidth is an issue, I would leave it be, personally.


 12:00 pm on Apr 17, 2012 (gmt 0)

Google upped its crawl rate on my site in the last few months. It had previously been doing about 250K pages a day and is now doing around 330K pages a day. However, traffic has not jumped, and I still get the braindead WMT warning about Googlebot detecting a large number of URLs. Not sure if the distributed nature of Google's crawling (a guess here) has any short-term effect on recrawling/headers. There is also a robots.txt option to delay Googlebot, which might be a bit more efficient than WMT. As to PR, I don't know the effect of either a WMT reduction or a robots.txt delay.


Andy Langton

 12:02 pm on Apr 17, 2012 (gmt 0)

The large number of URLs warning is usually an indication that Google has found URLs that it does not believe will lead to useful content - often because of a large number of parameters, or parameters that Google believes won't modify content. I've found that one is usually worth looking into if you haven't already!


 12:07 pm on Apr 17, 2012 (gmt 0)

There is a large number of URLs: over 350 million. The site has the hosting history for domains in com/net/org/biz/info/mobi/asia back to 2000, and the people who use it find it useful. I've got the bandwidth throttled at the moment, though, as I am moving it to a bigger server with better connectivity. However, I was considering limiting Googlebot on the new server.



 12:08 pm on Apr 17, 2012 (gmt 0)

I received the large number of URLs warning when I removed rel=nofollow from faceted navigation on my ecommerce site. Those URLs were noindexed, but I still received that message, and Googlebot was definitely churning through a ton of URLs as a result. Might be worth a look, jmccormac, to see if perhaps you have something similar going on.
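A side note on why noindexed facets still eat crawl budget: Googlebot has to fetch a page before it can see a noindex, whereas a robots.txt Disallow stops the fetch altogether. A sketch with a hypothetical faceted path, using Python's stdlib parser (which, unlike Googlebot, doesn't support wildcard patterns):

```python
from urllib import robotparser

# Hypothetical robots.txt: block a faceted-navigation path outright.
# A Disallow prevents the fetch; a meta noindex still costs one crawl per URL.
ROBOTS_TXT = """\
User-agent: *
Disallow: /catalog/filter
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

print(rp.can_fetch("Googlebot", "/catalog/filter?color=red"))  # False: never fetched
print(rp.can_fetch("Googlebot", "/catalog/shoes"))             # True: crawlable
```

The trade-off is that disallowed URLs can still appear in the index as bare links (Google never sees the noindex), so which lever to pull depends on whether the goal is saving crawl budget or keeping pages out of the index.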

Andy Langton

 12:12 pm on Apr 17, 2012 (gmt 0)

There is a large number of URLs - over 350 million

Ah yes, but the implication is that there are a large number of URLs that won't rank - it sounds like that might be reasonable in the context of your site, in which case all may be well. Worth checking a sample of the URLs Google says it doesn't like, perhaps, just to verify.


 12:28 pm on Apr 17, 2012 (gmt 0)

It doesn't appear to be churning through links, serpsup. Googlebot is crawling unique URLs. The problem for Googlebot is that the data here is on a narrow pipe, so it can't download everything in one go. Though with the DB behind the site at 162G, that might take some time.



 8:29 pm on Apr 17, 2012 (gmt 0)

There was also a robots.txt option to delay Googlebot but that might be a bit more efficient than WMT.

If you mean Crawl-delay, don't bother: Google ignores it. Or, possibly, Googlebot may choose to ignore it when the directive doesn't suit its convenience, which amounts to the same thing. GWT reports:
Line 15: Crawl-delay: 3   Rule ignored by Googlebot
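For what it's worth, the rule Googlebot reports as "ignored" is perfectly readable by standards-following clients (Bing, for one, honours it). A small illustration with Python's stdlib parser:

```python
from urllib import robotparser

# Parse a robots.txt containing the Crawl-delay rule from the GWT message.
# Googlebot discards it; other compliant bots can read and obey it.
rp = robotparser.RobotFileParser()
rp.parse("""\
User-agent: *
Crawl-delay: 3
""".splitlines())

print(rp.crawl_delay("*"))  # 3 (seconds between requests, for bots that comply)
```

So a Crawl-delay line isn't wasted, it just has no effect on Googlebot; throttling Google specifically has to go through the WMT crawl-rate setting.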


 2:33 pm on Apr 21, 2012 (gmt 0)

@Sgt_Kickaxe I've always wondered if sites like yours are turning a good profit :-)

Back to the point: in my own experience, Googlebot often ignores the crawl rate chosen in WMT.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved