
Is slowing down Googlebot with a custom crawl rate a good idea?

     

Sgt_Kickaxe

8:12 am on Apr 17, 2012 (gmt 0)

WebmasterWorld Senior Member sgt_kickaxe is a WebmasterWorld Top Contributor of All Time 5+ Year Member



My logs show that over the past three days Googlebot, from verified IPs, has been crawling each page an average of twice per day. Many of these pages are years old and have no content that changes or is likely to change, but I suspect Google is looking for new comments. Proper expires headers apparently make no difference to the crawl rate.

Is setting a slower crawl rate in GWT a dangerous thing in terms of losing rank? With several thousand pages it's a lot of extra requests. Bing and Yahoo crawl about half as many pages, and I see no reason why Googlebot needs to check a page twice daily right now.
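
For reference, one standard way to verify that a requesting IP really is Googlebot is a reverse DNS lookup followed by a forward lookup on the returned hostname. A rough Python sketch of that check (the address at the bottom is just a placeholder from a range commonly seen for Googlebot):

    import socket

    def is_googlebot(ip):
        # Reverse lookup: genuine Googlebot addresses resolve to *.googlebot.com or *.google.com
        try:
            host = socket.gethostbyaddr(ip)[0]
            if not host.endswith(('.googlebot.com', '.google.com')):
                return False
            # Forward lookup on that hostname must point back to the same address
            return socket.gethostbyname(host) == ip
        except socket.error:
            return False

    # Placeholder address from a range commonly seen for Googlebot
    print(is_googlebot('66.249.66.1'))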

Andy Langton

11:45 am on Apr 17, 2012 (gmt 0)

WebmasterWorld Senior Member andy_langton is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I doubt you will see any ranking difference, but the best idea is usually just to let Google do its thing.

Typically, a high recrawl rate is a factor of both frequently updated content and a degree of "trust" or "authority" in the site itself, so usually it's a positive sign. Unless bandwidth is an issue, I would leave it be, personally.

jmccormac

12:00 pm on Apr 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Google upped its crawl rate on my site in the last few months. It had previously been doing about 250K pages a day and is now doing around 330K pages a day. However, traffic has not jumped and I still get the braindead WMT warning about Googlebot detecting a large number of URLs. Not sure if the distributed nature of Google's crawling (a guess here) has any short-term effect on recrawling/headers. There was also a robots.txt option to delay Googlebot but that might be a bit more efficient than WMT. As to PR, I don't know about the effect of either a WMT reduction or a robots.txt delay.

Regards...jmcc

Andy Langton

12:02 pm on Apr 17, 2012 (gmt 0)

WebmasterWorld Senior Member andy_langton is a WebmasterWorld Top Contributor of All Time 10+ Year Member



The large number of URLs warning is usually an indication that Google has found URLs that it does not believe will lead to useful content - often because of a large number of parameters, or parameters that Google believes won't modify content. I've found that one is usually worth looking into if you haven't already!
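
Purely as an illustration of the kind of URLs that tend to trigger it (example.com and the parameter names here are hypothetical): several parameterised addresses all serving the same content. One common mitigation is a rel=canonical pointing at the clean URL, though Google may still crawl the variants:

    http://www.example.com/widgets?sessionid=ABC123&sort=price
    http://www.example.com/widgets?sessionid=XYZ789

    <!-- on each variant, in the <head>: -->
    <link rel="canonical" href="http://www.example.com/widgets">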

jmccormac

12:07 pm on Apr 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



There is a large number of URLs - over 350 million. The site has the hosting history for domains in com/net/org/biz/info/mobi/asia back to 2000. The people who use it find it useful. I've got the bandwidth throttled though as I am moving it to a bigger server with better connectivity. However I was considering limiting Googlebot on the new server.

Regards...jmcc

serpsup

12:08 pm on Apr 17, 2012 (gmt 0)



I received the large number of URLs warning when I removed rel=nofollow from faceted navigation on my ecommerce site. Those URLs were noindexed but I still received that message, and Googlebot was definitely churning through a ton of URLs as a result. Might be worth a look, jmccormac, to see if perhaps you have something similar going on.
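
For context, "noindexed" here means the faceted URLs carried a robots noindex directive; the two usual forms look roughly like this (the Header line assumes Apache with mod_headers available):

    <!-- in the <head> of each faceted-navigation page -->
    <meta name="robots" content="noindex, follow">

    # or as an HTTP response header, e.g. via Apache mod_headers
    Header set X-Robots-Tag "noindex, follow"

Note that noindex keeps the pages out of the index but does not stop Googlebot from crawling them, which matches what's described above.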

Andy Langton

12:12 pm on Apr 17, 2012 (gmt 0)

WebmasterWorld Senior Member andy_langton is a WebmasterWorld Top Contributor of All Time 10+ Year Member



There is a large number of URLs - over 350 million


Ah yes, but the implication is that there are a large number of URLs that won't rank - it sounds like that might be reasonable in the context of your site, in which case all may be well. Worth checking a sample of the URLs Google says it doesn't like, perhaps, just to verify.

jmccormac

12:28 pm on Apr 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It doesn't appear to be churning through links, serpsup. Googlebot is crawling unique URLs. The problem for Googlebot is that the data here is on a narrow pipe, so it can't download everything in one go. Though with the DB behind the site at 162G, that might take some time.

Regards...jmcc
(Edit reason: typos - need to replace this keyboard.)

lucy24

8:29 pm on Apr 17, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



There was also a robots.txt option to delay Googlebot but that might be a bit more efficient than WMT.

If you mean crawl-delay, don't bother. Google ignores it. Or, possibly, Googlebot may choose to ignore it if the directive doesn't suit its convenience. Which amounts to the same thing. Says GWT:
Line 15: Crawl-delay: 3   Rule ignored by Googlebot
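
For anyone unfamiliar with the directive, it sits in robots.txt roughly like this (the 3-second value is just the one from the GWT report above); Bing and Yahoo reportedly honoured it at the time, while Googlebot's supported throttle was the crawl rate setting in WMT:

    User-agent: *
    Crawl-delay: 3    # ignored by Googlebot; crawlers that honour it wait ~3 seconds between requests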

rlopes

2:33 pm on Apr 21, 2012 (gmt 0)

5+ Year Member



@Sgt_Kickaxe I've always wondered if sites like yours are turning a good profit :-)

Back to the point: in my own experience, Googlebot often ignores the crawl rate chosen in WMT.
 
