Forum Moderators: open
Recap: For 5 days in late January '09, an authentic Googlebot crawled part of my GoDaddy-hosted site. The server logs showed it coming from IP addresses owned by GoDaddy. I whitelist user agents according to authenticated IP ranges, so these requests were 403'd, showed up as errors in Google Webmaster Tools, and were eventually dropped from Google's index.
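For anyone wanting to do the same kind of authentication, the usual check is a reverse-DNS lookup on the requesting IP followed by a forward confirmation. A minimal Python sketch (function names are my own, not from any poster's actual setup):

```python
import socket

def is_google_host(hostname):
    # Genuine Googlebot PTR records resolve under these domains.
    return hostname.endswith(".googlebot.com") or hostname.endswith(".google.com")

def verify_googlebot(ip):
    """Reverse-DNS the IP, check the domain, then forward-confirm.

    Returns True only if the PTR record points at a Google domain AND
    that hostname resolves back to the same IP (guards against spoofed
    PTR records). A Googlebot UA arriving from any other IP fails.
    """
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)            # reverse lookup
        if not is_google_host(hostname):
            return False
        forward_ips = socket.gethostbyname_ex(hostname)[2]   # forward lookup
        return ip in forward_ips
    except (socket.herror, socket.gaierror):
        return False
```

Requests proxied through a GoDaddy firewall IP, as described below, would fail this check even though the crawler on the far side was genuine.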
Explanation: According to GoDaddy support (which took only 38 days), when their admins are alerted to a security issue that merits a certain level of action, the affected hosting accounts are put behind a firewall. During this time some requests may be logged as coming from within the GoDaddy network itself. This action should only affect hosting accounts that use a dedicated IP address.
So some security issue (unrelated, I assume) warranted this level of action from the GoDaddy admins. The only problem with this explanation is that the site in question was *not* using a dedicated IP address at the time. I added a dedicated IP address later as a preventative measure (suggested by GoDaddy support, LOL) because of this blunder. Needless to say, I have since removed the dedicated IP address and was given a full refund by GoDaddy.
So there it is. GoDaddy takes responsibility, but their explanation is ambiguous and contradictory, and they won't define their measures any further, citing security concerns. I assume I am still at risk, but at least I now know that Googlebot did not crawl from a non-authenticated IP range, nor stumble upon an open proxy.
My site has only just recovered, with the last of 50 dropped pages re-indexed yesterday. The lost ad revenue, traffic, ranking, and future potential are all written off as a cost of shared hosting.
I'm not buying that it was an authentic Googlebot, but you obviously saw something very unusual.
What I don't understand is your current thread. What sort of security issue existed that prompted GoDaddy to do something? I'm not even sure what they did.
One thing that might help is to call a special logging script whenever your custom 403 error page is accessed. This script can log the 403 error separately from your usual error log and attempt to record *all known* HTTP headers. You see some interesting things when you do this... things such as the HTTP From header (which, for a genuine Googlebot access, will always give an e-mail address at Google), the proxy-related headers, and the Accept, Accept-Encoding, and Accept-Language headers, all of which can also be checked for validity...
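On a typical shared host the 403 ErrorDocument can point at a CGI script, and CGI exposes every client-sent header as an HTTP_* environment variable. A minimal Python sketch of the kind of logger described above (script name and log path are placeholders, not anything from the thread):

```python
#!/usr/bin/env python3
# custom_403.py -- hypothetical target of "ErrorDocument 403 /cgi-bin/custom_403.py"
import os
import time

LOG_PATH = "/tmp/denied_requests.log"  # placeholder; keep the real log outside the web root

def collect_http_headers(environ):
    """Gather every client-sent header; CGI presents them as HTTP_* variables."""
    headers = {}
    for key, value in environ.items():
        if key.startswith("HTTP_"):
            # HTTP_ACCEPT_ENCODING -> Accept-Encoding
            name = key[5:].replace("_", "-").title()
            headers[name] = value
    return headers

def log_denied_request(environ):
    """Append one timestamped record per 403, with all request headers."""
    headers = collect_http_headers(environ)
    with open(LOG_PATH, "a") as log:
        log.write("%s %s %s\n" % (
            time.strftime("%Y-%m-%d %H:%M:%S"),
            environ.get("REMOTE_ADDR", "-"),
            environ.get("REQUEST_URI", "-")))
        for name, value in sorted(headers.items()):
            log.write("  %s: %s\n" % (name, value))

if __name__ == "__main__":
    log_denied_request(os.environ)
    print("Status: 403 Forbidden\r\nContent-Type: text/plain\r\n\r\nForbidden")
```

With a record like this you can compare the From, Via, X-Forwarded-For, and Accept-* headers of a suspect request against what a genuine Googlebot fetch sends.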
Jim
What I don't understand is your current thread. What sort of security issue existed that prompted GoDaddy to do something? I'm not even sure what they did.
It sounds like GoDaddy provided the "proxy" by forwarding requests from their firewall. Therefore, the Remote_Addr might have been within a GoDaddy range. But at the same time, you might also have been able to check for an HTTP Via and/or X-Forwarded-For header.
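Given a header dictionary like the one the 403 logger above would capture, the proxy check Jim mentions is a one-liner. A sketch (assuming the headers have already been collected into a dict):

```python
def proxy_headers_present(headers):
    """Return whichever proxy-related headers appear in the request.

    A request forwarded through a proxy or firewall commonly carries one
    of these; a direct fetch from Googlebot's own servers would not.
    """
    proxy_headers = ("Via", "X-Forwarded-For", "Forwarded", "X-Forwarded-Host")
    return [name for name in proxy_headers if name in headers]
```

A non-empty result on a Googlebot-UA request logged from a GoDaddy IP would have been strong evidence of exactly the firewall forwarding described here.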
My intent with this follow-up thread is to document why Googlebot was logged from an IP address owned by GoDaddy, so that other webmasters have an account to reference. No such account existed when I searched for an answer at the time, so many more pages fell victim before I figured it out.