Is slowing down Googlebot with a custom crawl rate a good idea?

     
8:12 am on Apr 17, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member sgt_kickaxe is a WebmasterWorld Top Contributor of All Time 5+ Year Member

joined:Apr 14, 2010
posts:3169
votes: 0


My logs show that over the past three days Googlebot, from a verified IP, has been crawling each page an average of twice per day. Many of these pages are years old and have no content that changes or is likely to change, but I suspect Google is looking for new comments. Proper Expires headers apparently make no difference to the crawl rate.
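For anyone wanting to repeat the IP verification, here is a minimal sketch of the reverse/forward DNS check Google documents for confirming a hit really came from Googlebot (assuming Python is available; the IP below is illustrative only):

import socket

def is_googlebot(ip):
    try:
        host = socket.gethostbyaddr(ip)[0]              # reverse DNS lookup
    except OSError:
        return False
    if not host.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        forward_ips = socket.gethostbyname_ex(host)[2]  # forward DNS lookup
    except OSError:
        return False
    return ip in forward_ips                            # must resolve back to the same IP

print(is_googlebot("66.249.66.1"))                      # illustrative IP only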

Is setting a slower crawl rate in GWT risky in terms of losing rank? With several thousand pages, that's a lot of extra requests. Bing and Yahoo crawl about half as many pages, and I see no reason why Googlebot needs to check a page twice daily right now.
11:45 am on Apr 17, 2012 (gmt 0)

Moderator This Forum from GB 

WebmasterWorld Administrator andy_langton is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 27, 2003
posts:3332
votes: 140


I doubt you will see any ranking difference, but the best idea is usually just to let Google do its thing.

Typically, a high recrawl rate is a function of both frequently updated content and a degree of "trust" or "authority" in the site itself, so it's usually a positive sign. Unless bandwidth is an issue, I would leave it be, personally.
12:00 pm on Apr 17, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Aug 30, 2002
posts: 2529
votes: 47


Google upped its crawl rate on my site in the last few months. It had previously been doing about 250K pages a day and it is now doing around 330K pages a day. However, traffic has not jumped and I still get the braindead WMT warning about Googlebot detecting a large number of URLs. Not sure if the distributed nature of Google's crawling (a guess here) has any short-term effect on recrawling/headers. There is also a robots.txt option to delay Googlebot, which might be a bit more efficient than WMT. As to PR, I don't know the effect of either a WMT reduction or a robots.txt delay.
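For what it's worth, here is a rough sketch of how I'd tally Googlebot requests per day from an access log (assuming Python and an Apache/nginx combined-format log; the path is hypothetical):

import re
from collections import Counter

daily = Counter()
with open("/var/log/apache2/access.log") as log:        # hypothetical path
    for line in log:
        if "Googlebot" not in line:                     # user-agent match only;
            continue                                    # pair with a DNS check to be sure
        m = re.search(r"\[(\d{2}/\w{3}/\d{4})", line)   # e.g. [17/Apr/2012:08:12:00 +0000]
        if m:
            daily[m.group(1)] += 1

for day, hits in daily.items():                         # insertion order follows the log (Python 3.7+)
    print(day, hits)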

Regards...jmcc
12:02 pm on Apr 17, 2012 (gmt 0)

Moderator This Forum from GB 

WebmasterWorld Administrator andy_langton is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 27, 2003
posts:3332
votes: 140


The large number of URLs warning is usually an indication that Google has found URLs that it does not believe will lead to useful content - often because of a large number of parameters, or parameters that Google believes won't modify content. I've found that one is usually worth looking into if you haven't already!
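If it helps, a quick sketch (same assumptions as the log tally above: Python, a combined-format access log, and a hypothetical path) for seeing which query parameters account for most of the URL variants Googlebot is requesting:

from collections import Counter
from urllib.parse import urlsplit, parse_qsl

params = Counter()
with open("/var/log/apache2/access.log") as log:        # hypothetical path
    for line in log:
        if "Googlebot" not in line:
            continue
        try:
            path = line.split('"')[1].split()[1]        # request line: "GET /page?a=1&b=2 HTTP/1.1"
        except IndexError:
            continue
        for name, _ in parse_qsl(urlsplit(path).query):
            params[name] += 1

for name, count in params.most_common(20):              # parameters behind the most crawled variants
    print(name, count)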
12:07 pm on Apr 17, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Aug 30, 2002
posts: 2529
votes: 47


There is a large number of URLs - over 350 million. The site has the hosting history for domains in com/net/org/biz/info/mobi/asia back to 2000. The people who use it find it useful. I've got the bandwidth throttled though as I am moving it to a bigger server with better connectivity. However I was considering limiting Googlebot on the new server.

Regards...jmcc
12:08 pm on Apr 17, 2012 (gmt 0)

New User

5+ Year Member

joined:Aug 18, 2011
posts: 26
votes: 0


I received the large number of URLs warning when I removed rel=nofollow from the faceted navigation on my ecommerce site. Those URLs were noindexed, but I still received that message, and Googlebot was definitely churning through a ton of URLs as a result. Might be worth a look, jmccormac, to see if you have something similar going on.
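A small sketch for spot-checking that a faceted URL still comes back noindexed, via either an X-Robots-Tag header or a meta robots tag (assuming Python; the URL is hypothetical):

import urllib.request

url = "http://www.example.com/shoes?colour=red&size=9"                    # hypothetical faceted URL
req = urllib.request.Request(url, headers={"User-Agent": "noindex-spot-check"})
with urllib.request.urlopen(req) as resp:
    header = resp.headers.get("X-Robots-Tag", "").lower()
    body = resp.read(200000).decode("utf-8", errors="replace").lower()

meta_noindex = 'name="robots"' in body and "noindex" in body
print(url, "noindex" if "noindex" in header or meta_noindex else "indexable")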
12:12 pm on Apr 17, 2012 (gmt 0)

Moderator This Forum from GB 

WebmasterWorld Administrator andy_langton is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 27, 2003
posts:3332
votes: 140


There is a large number of URLs - over 350 million


Ah yes, but the implication is that there are a large number of URLs that won't rank - it sounds like that might be reasonable in the context of your site, in which case all may be well. Worth checking a sample of the URLs Google says it doesn't like, perhaps, just to verify.
12:28 pm on Apr 17, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Aug 30, 2002
posts: 2529
votes: 47


It doesn't appear to be churning through links, serpsup.
Googlebot is crawling unique URLs. The problem for Googlebot is that the data here is on a narrow pipe, so it can't download everything in one go. Though with the DB behind the site at 162G, that might take some time.

Regards...jmcc
(Edit reason: typos - need to replace this keyboard.)
8:29 pm on Apr 17, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13268
votes: 363


There was also a robots.txt option to delay Googlebot but that might be a bit more efficient than WMT.

If you mean Crawl-delay, don't bother. Google ignores it. Or, possibly, Googlebot may choose to ignore it if the directive doesn't suit its convenience. Which amounts to the same thing. Says GWT:
Line 15: Crawl-delay: 3   Rule ignored by Googlebot
2:33 pm on Apr 21, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:Sept 11, 2009
posts: 139
votes: 0


@Sgt_Kickaxe I've always wondered if sites like yours are turning a good profit :-)

Back to the point: in my own experience, Googlebot often ignores the crawl rate chosen in WMT.
 
