Fake Googlebots - why are they here?

Asking about why there are so many fake Googlebots accessing our site.

         

wizboy

9:07 am on Jul 14, 2009 (gmt 0)

10+ Year Member



It seems we have been identifying a lot of fake Googlebots accessing our website. Many of them seem to have very slow access and request the same pages multiple times within 1 second. Do you guys see this happening? Why are they doing that? Is this some kind of denial of service attack, trying to consume my servers' resources so that when the real Googlebot comes, it cannot read the pages?

My other question: when we do the forward/reverse lookup as Google suggested, a few days' data shows all genuine Google accesses coming from 66.249.x.x. Can I trust that? We don't want to allow only this IP range and then end up disallowing Google when it comes from a different range.

Fake Googlebot IPs we detected in a few days' log:
<IP list removed>

[edited by: incrediBILL at 2:27 pm (utc) on July 14, 2009]
[edit reason] Obscured IPs, too many to edit so removed list [/edit]

incrediBILL

3:04 pm on Jul 14, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I haven't worried about fakes coming from Google, Yahoo, MSN or Ask for a couple of years now.

Just validate Google via the round trip DNS like they suggest and all the fakes are a non-issue, discarded at the door.

FYI, often the fake googlebots are real googlebots trying to crawl via a proxy server after being tricked by the proxy owner trying to hijack your content.

GaryK

4:04 pm on Jul 14, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



FYI, often the fake googlebots are real googlebots trying to crawl via a proxy server after being tricked by the proxy owner trying to hijack your content.

I'm sure the next questions will be: Is it safe to block these? Will it affect my Google ranking? So I'm asking them now. :)

wilderness

4:20 pm on Jul 14, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Gary,
I've had anything from Google that doesn't come from the Class B provided by wizboy denied for some while now, and it hasn't bothered my sites.

In addition, I had some other Google tools denied even prior to the Class B range.

Don

wizboy

5:48 pm on Jul 14, 2009 (gmt 0)

10+ Year Member



What is "Class B"?

So you are saying to block anything outside this 66.249.x.x range?

So if it is "googlebots trying to crawl via a proxy server", do you mean these are spam websites that try to steal our content and pretend it is sitting at these other IP locations? So it sounds like it is definitely good to block these, right?

wilderness

6:30 pm on Jul 14, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



There's a couple of older threads on this same topic [webmasterworld.com].

123(Class A).456(Class B).789(Class C).012(Class D)

So you are saying to block anything outside this

I'm saying I do!
Whether you choose to do so depends upon what is beneficial or detrimental to your own website(s).
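
For anyone who wants to try the range approach, here is a minimal sketch in Python of testing whether an address falls inside the 66.249.x.x block mentioned above. The name in_observed_range is made up for illustration, and the /16 is only the range seen in wizboy's logs, not a confirmed list of Google's addresses, so treat it as an assumption.

```python
import ipaddress

# Assumption: 66.249.0.0/16 is only the range observed in this thread's logs,
# not an authoritative or complete list of Googlebot addresses.
OBSERVED_RANGE = ipaddress.ip_network("66.249.0.0/16")


def in_observed_range(ip):
    """Return True if the given address string lies inside the observed block."""
    try:
        return ipaddress.ip_address(ip) in OBSERVED_RANGE
    except ValueError:
        # Not a parseable IP address at all.
        return False
```

A range test like this is cheap, but it breaks as soon as the crawler shows up from a block you have not listed, which is exactly the risk wizboy raised.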

GaryK

8:28 pm on Jul 14, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Don, I was asking the question cause I knew it was going to be asked. :)

wizboy, you're right in your assumption about what these fake googlebots typically do.

Personally, I deny access to anything claiming to be googlebot unless it passes a full round-trip DNS lookup. Whether anyone else does that is up to them, as Don so rightly stated.
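
The round-trip check described above could be sketched roughly like this in Python. It is only an illustration: the function name is_verified_googlebot is hypothetical, and the hostname suffixes are an assumption about what genuine crawler hosts resolve to, so check them against Google's current guidance before relying on it. The lookup functions are passed in as parameters so the logic can be tried without touching real DNS.

```python
import socket

# Assumption: genuine Googlebot hosts reverse-resolve to names under these domains.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip,
                          reverse_lookup=socket.gethostbyaddr,
                          forward_lookup=socket.gethostbyname_ex):
    """Round-trip DNS check: IP -> hostname -> IPs, which must include the original IP."""
    try:
        # Step 1: reverse lookup (PTR) of the connecting address.
        hostname = reverse_lookup(ip)[0]
    except OSError:
        return False

    # Step 2: the hostname must sit under an expected domain.
    # The leading dot in each suffix rejects names like "fakegooglebot.com".
    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False

    try:
        # Step 3: forward lookup of that hostname (IPv4 only with this call).
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False

    # Step 4: the forward result must contain the address we started with.
    return ip in addresses
```

Anything that claims a Googlebot user agent and fails this check can be denied. Since each check costs two DNS queries, caching the result per IP is probably worth it on a busy site.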

incrediBILL

3:12 am on Jul 15, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Is it safe to block these? Will it affect my Google ranking?

<mod off>
Speaking from my own experience, blocking them appears to be perfectly fine.

Yes, it might affect your Google ranking: mine went UP because blocking all the bogus Googlebots stopped proxy hijacking.
</mod off>