|Will lowering googlebot's crawl rate affect rankings?|
| 3:43 am on Jul 6, 2012 (gmt 0)|
Just wondering if lowering googlebot's crawl rate will affect rankings in the SERPs?
For example, the default in WMT is 0.1 requests per second and 10 seconds between requests. If I lowered that to, say, 0.033 requests per second and 30.303 seconds between requests, what sort of impact would that have?
The reason I'm asking is that googlebot is apparently causing abnormally high bandwidth use on our server. We have caching enabled using Nginx on our server (according to our webhost), and it seems that the googlebot activity is causing our outgoing traffic to jump from the normal 400kb/s to over 1400kb/s. When they blocked googlebot's IP for 5 minutes, it dropped back down to 400kb/s and then shot back up to 1400kb/s after they allowed it again.
Can anyone please make any suggestions on what I should set our googlebot crawl rate to be? What sort of impact would it have? And will it affect rankings?
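For reference, the two numbers in the WMT setting are reciprocals of each other, and the rate also puts a ceiling on how many requests fit into a day. Here is a rough sketch of that arithmetic. It assumes the setting is an upper bound that googlebot could sustain for a full 24 hours, which is only an assumption for illustration, and the function names are made up.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def seconds_between_requests(requests_per_second: float) -> float:
    """The delay between requests is the reciprocal of the request rate."""
    return 1.0 / requests_per_second

def max_requests_per_day(requests_per_second: float) -> float:
    """Ceiling on daily requests if the rate were sustained all day (assumption)."""
    return requests_per_second * SECONDS_PER_DAY

# The two settings mentioned in the post above.
for rate in (0.1, 0.033):
    print(
        f"{rate} req/s -> {seconds_between_requests(rate):.3f} s between requests, "
        f"at most {max_requests_per_day(rate):.0f} requests/day"
    )
```

Under those assumptions the default leaves room for roughly 8,640 requests a day and the lowered setting for roughly 2,851.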
| 4:35 am on Jul 6, 2012 (gmt 0)|
Also, just to provide some more info, our site has about 1500 or so pages, but WMT shows that googlebot crawls an average of 2300 pages per day, with a high of 5000+ pages/day and a low of 500 pages/day. Isn't that kind of excessive considering that we only have about 1500 pages? I did an "allinurl" and it shows 1500 pages indexed, which is pretty close to what we have. We did have a forum with a lot of pages, but all the forum pages were noindexed many months ago.
Also, with 1500 pages in our site, what's an optimal number of pages googlebot should crawl each day? 300, 500, 800? I don't know.... what should I set the crawl rate to be? It's currently on default rate as set by google.
| 4:38 am on Jul 6, 2012 (gmt 0)|
This is interesting and I want to know more also because we have one server that gets maxed out for short periods.
| 4:43 am on Jul 6, 2012 (gmt 0)|
The difference between 1500 and 5,000 might be nothing. The other day I noticed that duplicates were found like...
They point to the same page. One link is from the category list and the other from the page's pagination.
| 4:57 am on Jul 6, 2012 (gmt 0)|
Actually, I compared our MRTG chart (which shows bandwidth use) to our WMT googlebot crawl rate, and there's a huge difference between crawling the average 2000 pages/day and the peaks of 5000 pages/day.
When it crawls 2000 pages/day, MRTG shows outgoing bandwidth at the normal 400kb/s, and when it crawls 5000 pages/day, MRTG shows outgoing bandwidth tripling to around 1300kb/s. So it's using up a lot of bandwidth.
Can someone knowledgeable about this please make some recommendation on what I should set our crawl rate to be? How many pages should googlebot be crawling for a 1500-page site? And what crawl rate setting would achieve that?
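One way to sanity-check the MRTG figures above is to work out how much extra transfer each extra crawled page would have to account for. The sketch below makes two loud assumptions: that both bandwidth readings are sustained averages over a full day, and that the whole increase comes from the extra crawled pages. The function name is hypothetical, and the result stays in whatever unit MRTG is reporting ("kb" here, as written in the post).

```python
SECONDS_PER_DAY = 24 * 60 * 60

def extra_transfer_per_page(base_rate, peak_rate, base_pages, peak_pages):
    """Extra data per extra crawled page, in the same data unit as the rates.

    Assumes both rates are sustained for a full day (illustrative assumption).
    """
    extra_per_day = (peak_rate - base_rate) * SECONDS_PER_DAY
    extra_pages = peak_pages - base_pages
    return extra_per_day / extra_pages

# Figures from the post: 400kb/s at 2000 pages/day, 1300kb/s at 5000 pages/day.
print(extra_transfer_per_page(400, 1300, 2000, 5000))
```

If the result looks far larger than a typical page on the site, one of the assumptions probably does not hold. For example, the peak may only last part of the day.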
| 6:05 am on Jul 6, 2012 (gmt 0)|
Bandwidth is not the right metric to be concerned about - server response time is. If googlebot hits are not slowing down your response to other visitors, then there is no ranking problem. Lowering your crawl rate in such a situation might save you bandwidth, but any decent ISP should be offering enough bandwidth so that googlebot is not going to increase your hosting costs.
The best way to handle googlebot crawling, in my experience, is to let their crawl routines do their thing. There will be highs and lows - just let it be. Now, if you have a lot of canonical errors on your site so that too many "duplicate content" URLs are being fetched, then you need to give the site some technical attention to fix that problem.
And in some cases, googlebot can have troubles - usually for a short period. But again, I encourage you to find a workable way not to use those crawl rate controls.
There is no ideal "setting per number of pages". How frequently is your content updated? And similarly, how "fresh" are the spaces you are competing in? That also makes a difference.
If freshness is not a major factor, then frequent crawling may not be so important. However, there are many members here who WISH they could get a higher crawl rate, because their indexed content is too stale - and that can have a negative effect on traffic.
So be very careful about taking your crawl rate off auto-pilot. I've almost never seen that help a site. Think "big picture" here.
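On the canonical-error point above, one rough way to see whether googlebot is fetching several URLs for the same page is to group crawled URLs by a normalized form. This is only a sketch: the normalization rules (ignoring host case, a trailing slash, and a few query parameters) and the parameter names in `IGNORED_PARAMS` are assumptions that would need adjusting for a real site, and the example URLs are made up.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical query parameters assumed not to change page content.
IGNORED_PARAMS = {"ref", "from", "sessionid"}

def normalize(url: str) -> str:
    """Reduce a URL to a comparison key under the assumed rules."""
    parts = urlsplit(url)
    query = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k not in IGNORED_PARAMS
    )
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )

def group_duplicates(urls):
    """Return only the normalized keys that more than one URL maps to."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}

# Made-up URLs standing in for entries pulled from a server access log.
crawled = [
    "http://example.com/widgets/blue",
    "http://example.com/widgets/blue/",
    "http://EXAMPLE.com/widgets/blue?ref=category",
    "http://example.com/widgets/red",
]
for key, variants in group_duplicates(crawled).items():
    print(key, len(variants))
```

Any group with more than one member is a candidate for a redirect or a canonical tag, which is the kind of technical fix described above.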