Googlebot uses GoDaddy.

Update to Earlier Thread

keyplyr

12:33 am on Mar 10, 2009 (gmt 0)

(original thread [webmasterworld.com])

Recap: For five days in late January '09, an authentic Googlebot crawled part of my GoDaddy-hosted site. The server logs showed it coming from IP addresses owned by GoDaddy. I whitelist user agents according to authenticated IP ranges, so these requests got 403'd, showed up as errors at Google Webmaster Tools, and the pages were eventually dropped from the Google index.
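
For anyone wondering how a claimed Googlebot gets authenticated against an IP range, the method Google itself recommends is forward-confirmed reverse DNS: resolve the requesting IP to a hostname, check that it falls under googlebot.com or google.com, then resolve that hostname forward again and confirm it maps back to the original IP. A rough sketch of the idea in Python (the sample address is illustrative, not taken from my logs):

import socket

def is_real_googlebot(ip):
    """Forward-confirmed reverse DNS check for a claimed Googlebot IP."""
    try:
        # Step 1: reverse lookup. A genuine Googlebot IP has a PTR
        # record under googlebot.com or google.com.
        host, _, _ = socket.gethostbyaddr(ip)
    except socket.herror:
        return False  # no PTR record at all
    if not host.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        # Step 2: forward lookup. The hostname must resolve back to
        # the same IP, or the PTR record could be spoofed.
        _, _, addrs = socket.gethostbyname_ex(host)
    except socket.gaierror:
        return False
    return ip in addrs

# A request from a GoDaddy-owned address fails step 1, so it gets
# 403'd even when the user agent string claims to be Googlebot.
print(is_real_googlebot("66.249.66.1"))  # an address in a published Googlebot range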

Explanation: According to GoDaddy Support (it took only 38 days to get this), when their admins are alerted to a security issue that merits a certain level of action, affected hosting accounts are put behind a firewall. During this time, some requests may be logged as coming from within the GoDaddy network itself. This action is supposed to affect only those hosting accounts that use a dedicated IP address.

So some security issue (unrelated to me, I assume) warranted this level of action from the GoDaddy admins. The only problem with this explanation is that the site in question was *not* using a dedicated IP address at the time. I added a dedicated IP address later as a preventative measure (suggested by GoDaddy support, LOL) because of this blunder. Needless to say, I have since removed the dedicated IP address and was given a full refund by GoDaddy.

So there it is. GoDaddy takes responsibility, but their explanation is ambiguous and contradictory. They won't define their measures any further, citing security issues. I assume I am still at risk, but at least I now know that Googlebot did not crawl from a non-authenticated IP range, nor did it stumble upon an open proxy.

My site has only just recovered; the last of the 50 dropped web pages was added back yesterday. The lost ad revenue, traffic, ranking, and future potential are all considered a write-off due to shared hosting.

incrediBILL

12:47 am on Mar 10, 2009 (gmt 0)

I'm not buying that it was an authentic Googlebot, but I know you obviously saw something very unusual.

keyplyr

5:23 am on Mar 10, 2009 (gmt 0)

I'm not buying that it was an authentic Googlebot, but I know you obviously saw something very unusual.

I don't think you understood what I said. It was obviously the real Googlebot, since the very same 50 web pages that were 403'd showed up as HTTP errors at GWT and were eventually dropped from the Google index. All of this was unknown to me in January when I opened the original thread, which is why I posted this update :)

GaryK

5:59 pm on Mar 10, 2009 (gmt 0)

I followed your original thread. It seemed questionable that Googlebot somehow ended up behind a proxy of some kind and began crawling from the proxy's IP address. But stranger things have happened, so I gave you the benefit of the doubt. Also, there did seem to be some correlation between what you saw in your logs and what you saw at GWT.

What I don't understand is your current thread. What sort of security issue existed that prompted GoDaddy to do something? I'm not even sure what they did.

jdMorgan

7:39 pm on Mar 10, 2009 (gmt 0)

It sounds like GoDaddy provided the "proxy" by forwarding requests from their firewall. Therefore, the Remote_Addr might have been within a GoDaddy range. But at the same time, you might also have been able to check for an HTTP Via and/or X-Forwarded-For header.
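
Those headers are easy to inspect when the host exposes the raw request. A minimal sketch, assuming a Python CGI environment, where standard CGI behavior surfaces each request header as an HTTP_* environment variable:

import os

def request_was_forwarded():
    # "Via" arrives as HTTP_VIA and "X-Forwarded-For" as
    # HTTP_X_FORWARDED_FOR; either one indicates a forwarding proxy
    # sits between the client and the web server.
    return bool(os.environ.get("HTTP_VIA") or
                os.environ.get("HTTP_X_FORWARDED_FOR"))

# REMOTE_ADDR is what the access log records; in this case it fell
# within a GoDaddy-owned range.
print(os.environ.get("REMOTE_ADDR", "unknown"),
      "forwarded" if request_was_forwarded() else "direct")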

One thing that might help is to call a special logging script when your custom 403 error page is accessed. This script can log the 403 error separately from your usual error log and attempt to record *all known* HTTP headers. You see some interesting things when you do this: the HTTP From header (which a genuine Googlebot access will always populate with an e-mail address at Google), the proxy-related headers, and the Accept, Accept-Encoding, and Accept-Language headers, all of which could also be checked for validity...
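
Something along these lines would do it, assuming Python CGI is available on the shared host and Apache's ErrorDocument directive points at the script (the log path and page text here are just placeholders):

#!/usr/bin/env python
# Hypothetical wiring in .htaccess:
#   ErrorDocument 403 /cgi-bin/log403.py
import os
import time

LOG_PATH = "/home/user/logs/403-headers.log"  # placeholder path

def log_all_headers():
    with open(LOG_PATH, "a") as log:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime())
        log.write("--- 403 at %s UTC ---\n" % stamp)
        log.write("REMOTE_ADDR: %s\n" % os.environ.get("REMOTE_ADDR", "-"))
        # CGI exposes every client request header as an HTTP_* variable,
        # so this captures From, Via, X-Forwarded-For, Accept,
        # Accept-Encoding, Accept-Language, and anything else sent.
        for name in sorted(os.environ):
            if name.startswith("HTTP_"):
                log.write("%s: %s\n" % (name, os.environ[name]))

log_all_headers()
# Keep the 403 status and serve a minimal error page.
print("Status: 403 Forbidden")
print("Content-Type: text/plain")
print()
print("403 Forbidden")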

Jim

keyplyr

11:01 pm on Mar 10, 2009 (gmt 0)

What I don't understand is your current thread. What sort of security issue existed that prompted GoDaddy to do something? I'm not even sure what they did.

Well, since it's a "security issue" they aren't saying (Catch-22).

It sounds like GoDaddy provided the "proxy" by forwarding requests from their firewall. Therefore, the Remote_Addr might have been within a GoDaddy range. But at the same time, you might also have been able to check for an HTTP Via and/or X-Forwarded-For header.

Access to "HTTP Via and/or X-Forwarded-For headers" is not offered as part of the shared environment at Godaddy.

My intent with this follow-up thread is to document the reason why Googlebot was logged from an IP address owned by GoDaddy, so that other webmasters have an account to reference. No such account existed when I searched for an answer at the time this occurred, so many more web pages fell victim before I figured it out.