|Client tells me to ban Googlebot on all of his sites|
It's rather sad, and overreacting IMO, but a hosting client of mine asked me today to ban all Googlebot IPs from his sites (7 in all).
From a certain perspective, I can see his dilemma. For the last 2 months, his sites have gone well over their bandwidth allotments. But 95% of that traffic has been all Googlebot (one IP or another). As a matter of fact, from analysis of his log files (done at his request), it seems that the more Googlebot crawls his sites, the less traffic he gets from Google.
I have reviewed all of his sites. No spam, all original content. Not a lot of cross-linking ... Seems like he is doing things right. But I was unable to explain to him why Googlebot had been crawling so much, yet adding so few pages to the index.
A better solution might be to restrict the areas or number of pages that Google can crawl using robots.txt or on-page meta robots tags.
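As a sketch of that approach, a robots.txt at each site's root could fence off the heaviest sections from crawling. The paths below are placeholders, not the client's actual directories, and note that Google ignores Crawl-delay (its crawl rate is set through its own webmaster tools instead):

```
# Block Googlebot from bandwidth-heavy sections (example paths only)
User-agent: Googlebot
Disallow: /search/
Disallow: /printable/
Disallow: /archive/
```

The same effect per-page comes from `<meta name="robots" content="noindex,nofollow">` in the page head.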
Also, make sure that his Last-Modified, Expires, and Cache-control response headers are configured correctly.
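The point of those headers is that a well-behaved crawler revalidates with a conditional GET and gets a cheap 304 Not Modified instead of the full page. A minimal sketch of that server-side decision, assuming RFC 1123 date strings (the function name and dates are illustrative):

```python
from email.utils import parsedate_to_datetime

def response_status(last_modified, if_modified_since):
    """Return 304 if the client's cached copy is still current, else 200.

    Both arguments are HTTP-date strings, e.g. "Sat, 13 May 2006 12:00:00 GMT".
    A 304 response carries no body, so it costs almost no bandwidth.
    """
    if if_modified_since is None:
        return 200  # client has no cached copy; send the full page
    changed_at = parsedate_to_datetime(last_modified)
    client_has = parsedate_to_datetime(if_modified_since)
    return 304 if changed_at <= client_has else 200

lm = "Sat, 13 May 2006 12:00:00 GMT"
print(response_status(lm, "Sat, 13 May 2006 12:00:00 GMT"))  # 304 (unchanged)
print(response_status(lm, "Fri, 12 May 2006 12:00:00 GMT"))  # 200 (page changed since)
```

If the server never sends Last-Modified, the crawler has no choice but to re-download everything.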
This kind of suicide is usually preventable...
Thanks for the advice!
I really don't want to do this, as he is just upset about the extra bandwidth charges. Maybe I'll try to give him a break.
There are people who'd metaphorically give their right arm to see the Googlebot every now and then. Even once a month would do.
I'm happy that Google presents search results as it wishes - but I find it hypocritical of Google to make such judgements about a site without even spidering it every now and then.
The data it has about some of my sites is months old.
Remember that the customer is always right. At the end of the day it is your right to charge extra bandwidth, and his right to ask that GoogleBot be banned.
|Remember that the customer is always right. |
The customer is very often wrong...
If a client has a problem, he/she will often have a suggestion for a solution based on inadequate knowledge and/or understanding of the issues. Ultimately, you may have to implement a client's request/suggestion but if you believe it to be wrong you should always brief the client on alternatives first.
Have you tried a sitemap that specifies that those pages aren't changing nearly as frequently as Googlebot is crawling?
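For reference, the sitemap protocol lets each URL declare a `<lastmod>` and `<changefreq>` hint. A minimal generator sketch (the URL and values are made up for illustration):

```python
from xml.sax.saxutils import escape

def sitemap(entries):
    """Build a minimal XML sitemap.

    entries: list of (url, lastmod, changefreq) tuples, where changefreq is
    one of the protocol's values (always/hourly/daily/weekly/monthly/yearly/never).
    """
    rows = []
    for url, lastmod, changefreq in entries:
        rows.append(
            "  <url>\n"
            f"    <loc>{escape(url)}</loc>\n"
            f"    <lastmod>{lastmod}</lastmod>\n"
            f"    <changefreq>{changefreq}</changefreq>\n"
            "  </url>"
        )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        + "\n".join(rows)
        + "\n</urlset>"
    )

# A static archive page that essentially never changes:
print(sitemap([("http://example.com/archive/2005/", "2005-12-31", "yearly")]))
```

Note changefreq is only a hint; Googlebot may still crawl more often, but it gives the crawler a reason not to.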
If you haven't already tried this, I would send Google a report about it. There could be a technical problem with their crawler. Although it's more focused toward speed of requests than total bandwidth, I think this form would be a good place to start:
Googlebot Trouble Report [google.com]
[edited by: tedster at 6:26 pm (utc) on May 13, 2006]
|95% of that traffic has been all Googlebot |
Assuming his hosting doesn't have some antiquated low bandwidth allocation, that would be indicative of a more fundamental problem.
Are you getting Googlebot stuck in a trap? Session IDs, etc.?
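To expand on that: if every visit gets a fresh session ID in the URL, the crawler sees an endless supply of "new" URLs for the same pages, which would explain heavy crawling with few pages added to the index. One server-side fix is to canonicalize URLs by stripping those parameters; a minimal sketch, where the parameter names are common guesses rather than anything from the client's sites:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Query parameters commonly used for session IDs (hypothetical list;
# adjust to whatever the site's software actually emits).
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}

def canonical(url):
    """Drop session-ID query parameters so one page has one crawlable URL."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in SESSION_PARAMS]
    return urlunsplit(parts._replace(query=urlencode(kept)))

print(canonical("http://example.com/page?id=7&PHPSESSID=abc123"))
# http://example.com/page?id=7
```

Grepping the logs for Googlebot requests that differ only in such a parameter would confirm or rule this out quickly.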