Is slowing down Googlebot with a custom crawl rate a good idea?

8:12 am on Apr 17, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member sgt_kickaxe is a WebmasterWorld Top Contributor of All Time 5+ Year Member

joined:Apr 14, 2010
posts:3169
votes: 0


My logs show that over the past three days Googlebot, from verified IPs, has been crawling each page an average of twice per day. Many of these pages are years old and have no content that changes or is likely to change, but I suspect Google is looking for new comments. Proper expires headers apparently make no difference to the crawl rate.

Is setting a slower crawl rate in GWT a dangerous thing in terms of losing rank? With several thousand pages that's a lot of extra requests. Bing and Yahoo average half as many pages crawled, and I see no reason why Googlebot needs to check a page twice daily right now.
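
For anyone who wants to sanity-check per-page numbers like those before touching the GWT setting, a rough Python sketch along these lines would do it. It assumes a standard Apache/nginx "combined" log format and a hypothetical access.log file name; the reverse-then-forward DNS lookup is the method Google recommends for verifying Googlebot.

    import re
    import socket
    from collections import Counter

    # Matches the start of a "combined" log line: client IP, date, request path.
    LOG_LINE = re.compile(
        r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]*\] "(?:GET|HEAD) (?P<path>\S+)'
    )

    _dns_cache = {}

    def is_real_googlebot(ip):
        """Reverse DNS, then forward DNS, per Google's verification advice."""
        if ip not in _dns_cache:
            try:
                host = socket.gethostbyaddr(ip)[0]
                _dns_cache[ip] = (host.endswith((".googlebot.com", ".google.com"))
                                  and socket.gethostbyname(host) == ip)
            except OSError:
                _dns_cache[ip] = False
        return _dns_cache[ip]

    hits = Counter()  # (path, day) -> Googlebot requests for that page on that day
    with open("access.log") as fh:  # hypothetical log file name
        for line in fh:
            m = LOG_LINE.match(line)
            if m and "Googlebot" in line and is_real_googlebot(m.group("ip")):
                hits[(m.group("path"), m.group("day"))] += 1

    if hits:
        # Average over page-days that saw at least one verified Googlebot request.
        print("avg crawls per crawled page per day:",
              round(sum(hits.values()) / len(hits), 2))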
11:45 am on Apr 17, 2012 (gmt 0)

Moderator This Forum from GB 

WebmasterWorld Administrator andy_langton is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 27, 2003
posts:3292
votes: 117


I doubt you will see any ranking difference, but the best idea is usually just to let Google do its thing.

Typically, a high recrawl rate is a function of both frequently updated content and a degree of "trust" or "authority" in the site itself, so it's usually a positive sign. Unless bandwidth is an issue, I would leave it be, personally.
12:00 pm on Apr 17, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Aug 30, 2002
posts: 2449
votes: 32


Google upped its crawl rate on my site in the last few months. It had previously been doing about 250K pages a day and it is now doing around 330K pages a day. However, traffic has not jumped and I still get the braindead WMT warning about Googlebot detecting a large number of URLs. Not sure if the distributed nature of Google's crawling (a guess here) has any short-term effect on recrawling/headers. There is also a robots.txt option to delay Googlebot, which might be a bit more efficient than WMT. As to PR, I don't know about the effect of either a WMT reduction or a robots.txt delay.

Regards...jmcc
12:02 pm on Apr 17, 2012 (gmt 0)

Moderator This Forum from GB 

WebmasterWorld Administrator andy_langton is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 27, 2003
posts:3292
votes: 117


The large number of URLs warning is usually an indication that Google has found URLs that it does not believe will lead to useful content - often because of a large number of parameters, or parameters that Google believes won't modify content. I've found that one is usually worth looking into if you haven't already!
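
One quick way to check whether that is what's happening is to group the URLs Googlebot requests by path and count how many query-string variants each path attracts; paths with hundreds of parameter permutations are the usual culprits. A minimal Python sketch, assuming a hypothetical file with one crawled URL per line:

    from collections import Counter
    from urllib.parse import urlsplit

    variants = Counter()
    with open("googlebot_urls.txt") as fh:  # hypothetical: one crawled URL per line
        for line in fh:
            url = line.strip()
            if url:
                variants[urlsplit(url).path] += 1

    # Paths whose counts dwarf the rest are mostly parameter permutations.
    for path, count in variants.most_common(10):
        print(f"{count:8d}  {path}")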
12:07 pm on Apr 17, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Aug 30, 2002
posts: 2449
votes: 32


There is a large number of URLs - over 350 million. The site has the hosting history for domains in com/net/org/biz/info/mobi/asia back to 2000. The people who use it find it useful. I've got the bandwidth throttled though as I am moving it to a bigger server with better connectivity. However I was considering limiting Googlebot on the new server.

Regards...jmcc
12:08 pm on Apr 17, 2012 (gmt 0)

New User

joined:Aug 18, 2011
posts: 26
votes: 0


I received the large number of URLs warning when I removed rel=nofollow from faceted navigation on my ecommerce site. Those URLs were noindexed, but I still received that message and Googlebot was definitely churning through a ton of URLs as a result. Might be worth a look, jmccormac, to see if perhaps you have something similar going on.
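
Worth remembering that noindex only keeps a URL out of the index; Googlebot still has to fetch the page to see the tag, so crawl volume on those URLs won't necessarily drop. A quick spot check of a single faceted URL, with the URL hypothetical and the meta-tag regex deliberately rough:

    import re
    import urllib.request

    url = "https://www.example.com/widgets?colour=red&size=m"  # hypothetical faceted URL
    with urllib.request.urlopen(url) as resp:
        x_robots = resp.headers.get("X-Robots-Tag", "")
        body = resp.read(65536).decode("utf-8", errors="replace")

    # Rough match for <meta name="robots" content="...">; assumes name precedes content.
    meta = re.search(r'<meta[^>]+name=["\']robots["\'][^>]+content=["\']([^"\']+)',
                     body, re.IGNORECASE)

    print("X-Robots-Tag:", x_robots or "(none)")
    print("meta robots :", meta.group(1) if meta else "(none)")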
12:12 pm on Apr 17, 2012 (gmt 0)

Moderator This Forum from GB 

WebmasterWorld Administrator andy_langton is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 27, 2003
posts:3292
votes: 117


There is a large number of URLs - over 350 million


Ah yes, but the implication is that there are a large number of URLs that won't rank - it sounds like that might be reasonable in the context of your site, in which case all may be well. Worth checking a sample of the URLs Google says it doesn't like, perhaps, just to verify.
12:28 pm on Apr 17, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Aug 30, 2002
posts: 2449
votes: 32


It doesn't appear to be churning through links, serpsup. Googlebot is crawling unique URLs. The problem for Googlebot is that the data here is on a narrow pipe, so it can't download everything in one go. Though with the db behind the site at 162G, that might take some time.

Regards...jmcc
(Edit reason: typos - need to replace this keyboard.)
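
For a sense of scale, putting the two figures quoted earlier in the thread together (roughly 350 million URLs, roughly 330K pages crawled per day):

    total_urls = 350_000_000  # URLs on the site, as quoted above
    pages_per_day = 330_000   # current Googlebot crawl rate, as quoted above

    days = total_urls / pages_per_day
    print(f"full pass at current rate: ~{days:,.0f} days (~{days / 365:.1f} years)")
    # ~1,061 days, close to three years, before any throttling is even considered.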
8:29 pm on Apr 17, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13058
votes: 299


There was also a robots.txt option to delay Googlebot but that might be a bit more efficient than WMT.

If you mean crawl-delay, don't bother. Google ignores it. Or, possibly, Googlebot may choose to ignore it if the directive doesn't suit its convenience, which amounts to the same thing. Says GWT:
Line 15: Crawl-delay: 3   Rule ignored by Googlebot
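
For what it's worth, Python's standard-library robots.txt parser will happily read the directive even though Googlebot won't act on it. A quick check against a hypothetical site:

    from urllib import robotparser

    rp = robotparser.RobotFileParser()
    rp.set_url("https://www.example.com/robots.txt")  # hypothetical site
    rp.read()

    # Returns 3 for a "Crawl-delay: 3" line, or None if the directive is absent.
    # Googlebot itself ignores Crawl-delay either way; it follows the GWT setting.
    print(rp.crawl_delay("*"))
    print(rp.can_fetch("Googlebot", "https://www.example.com/some/page"))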
2:33 pm on Apr 21, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:Sept 11, 2009
posts: 131
votes: 0


@Sgt_Kickaxe I've always wondered if sites like yours are turning a good profit :-)

Back to the point: in my own experience, Googlebot often ignores the crawl rate chosen in WMT.