Google's Site Verification Bot

A late warning

     
1:10 pm on Jun 25, 2014 (gmt 0)




Probably old news to you log-addicts.

My back was turned for a few months executing duties, and G had the nerve to switch its site verification bot IP to 66.249.80.232 without telling me. As a result my sites lost their verified status, because I have long blocked the irritating IP range used for Google Plus snippets:

# Google Plus snippets
deny from 66.249.80.0/20 66.249.84.226 66.249.81.145

This line was blocking verification.
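
A quick way to see why: the new verification address sits inside the denied /20, and so do the two single addresses listed after it on the same line. A minimal sketch using Python's standard ipaddress module (the variable names are only illustrative):

```python
import ipaddress

# The CIDR block from the deny line above.
denied_block = ipaddress.ip_network("66.249.80.0/20")

# Verification bot IP plus the two single addresses from the same deny line.
candidates = ["66.249.80.232", "66.249.84.226", "66.249.81.145"]

for addr in candidates:
    inside = ipaddress.ip_address(addr) in denied_block
    print(f"{addr} in {denied_block}: {inside}")

# First and last address of the block: the /20 spans third octets 80 through 95.
print(denied_block[0], "-", denied_block[-1])
```

All three addresses report as inside the block, so the two single-IP entries on that deny line are redundant with the /20.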

So I deigned to allow the bot's IP thus...

# Allow Google Site Verification bot
allow from 66.249.80.232

That got my main site verified, but not two others. To achieve verification on those, I had to remove my deny of 66.249.80.0/20 then they were verified.

Loss of verification happened in mid-June 2014, but it doesn't appear to have affected our ranking in G's SERPs.
6:36 pm on Jun 25, 2014 (gmt 0)

keyplyr



Thanks for the heads-up. Made me check my own filters. I'm a bit more liberal in the ranges I let through for G so even with that recent change, I was OK... but ya never know :)
6:57 pm on Jun 25, 2014 (gmt 0)




And I recently blocked it because of proxy referer spam traffic. That is listed as a Google Proxy server. Now I need to look closer.
IP Address 66.249.80.232
Host google-proxy-66-249-80-232.google.com
7:39 pm on Jun 25, 2014 (gmt 0)

lucy24



Ah. I knew they must have changed something, because site verification has been showing up in my logs recently, although it's supposed to be ignored. (I'm talking here about my personal log-wrangling routines, not the original raw logs.)

I had to remove my deny of 66.249.80.0/20 then they were verified.

That's where conditional RewriteRules come in.

RewriteCond {unwanted IP range}
RewriteCond {request-URI is not site-verification-thingy}
RewriteRule {blahblah ending in [F] }
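
Before committing something like that to .htaccess, the logic can be modelled outside Apache and run against sample requests. A rough Python sketch, assuming the unwanted range is 66.249.80.0/20 and that the verification file is named "google" plus 16 lowercase alphanumerics with an .html extension. Both are assumptions drawn from this thread, and should_block is a made-up helper, not anything Apache provides:

```python
import ipaddress
import re

UNWANTED = ipaddress.ip_network("66.249.80.0/20")  # assumed unwanted range
# Assumed filename shape: "google" + 16 lowercase alphanumerics + ".html".
VERIFY_FILE = re.compile(r"^/google[0-9a-z]{16}\.html$")

def should_block(remote_addr: str, request_uri: str) -> bool:
    """Mirror the two conditions: the IP is in the unwanted range AND
    the request is not for the verification file."""
    in_range = ipaddress.ip_address(remote_addr) in UNWANTED
    is_verification = VERIFY_FILE.match(request_uri) is not None
    return in_range and not is_verification

print(should_block("66.249.80.232", "/google0123456789abcdef.html"))  # False
print(should_block("66.249.80.232", "/some-page.html"))               # True
print(should_block("66.249.64.10", "/some-page.html"))                # False
```

The point of the second condition is that the deny on the whole range can stay in place while the one verification request still gets through.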
7:56 pm on Jun 25, 2014 (gmt 0)




You're right, lucy24, because a quick check shows that the UA
"Mozilla/5.0 (compatible; Google-Site-Verification/1.0)"
is coming from several IPs in that neighborhood: 66.249.90.185 and 66.249.90.74 recently.
8:29 pm on Jun 25, 2014 (gmt 0)

lucy24



:: detour to look up, which I should have done earlier ::

Ah. The "ignore" code has
^(72\.14|209\.8[45])\.\d+\.\d+

so until recently they simply didn't use 66.249.whatever at all. Is it bad when you can type someone's IP from memory? Now duly added.

I guess the time to worry is when random visitors start asking for the correct google verification file, because nobody else should even know its name. Considering the length of its name ("google" followed by 16 alphanumerics), it's pretty unlikely a robot would find it by blind luck.

:: detour to calculator ::

Well, it's got 24 zeros ;)
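
For anyone without a calculator handy, the count is easy to reproduce. This assumes the 16 characters are drawn from 36 case-insensitive alphanumerics; if they turn out to be hex digits only, the space is much smaller, though still far beyond blind guessing:

```python
# Number of possible 16-character names under two assumed alphabets.
alnum_names = 36 ** 16   # a-z plus 0-9
hex_names = 16 ** 16     # 0-9 plus a-f

print(f"{alnum_names:.3e}")  # 7.959e+24
print(f"{hex_names:.3e}")    # 1.845e+19
```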
6:32 am on Jun 26, 2014 (gmt 0)



Google proxy came through today using 66.249.81.153. Grabbed an image and left.

Looks like work to do.
11:17 am on Jun 26, 2014 (gmt 0)

wilderness



There's a fairly recent thread where somebody noted that the primary bot only uses addresses up through the 79 Class C:

RewriteCond %{REMOTE_ADDR} !^66\.249\.(6[4-9]|7[0-9])\.
11:54 am on Jun 26, 2014 (gmt 0)




Blimey, and I thought +I'd+ been neglecting my logs, not having glanced at them for trois months.
Tchek!
5:03 pm on Jun 26, 2014 (gmt 0)



So, what I have for a valid googlebot in my .htaccess file is,

RewriteCond %{REMOTE_ADDR} ^66\.249\.(6[4-9]|7[0-9]|8[0-46-9]|9[0-5])\.

from JDMorgan at [webmasterworld.com...]

remains true, but inside that range the google proxy is operating.

Should I rewrite this code to limit it to the 79 Class C, as wilderness has suggested?

What is the google stuff that I would not be allowing besides the proxy?
9:21 pm on Jun 26, 2014 (gmt 0)

lucy24



RewriteCond %{REMOTE_ADDR} ^66\.249\.(6[4-9]|7[0-9]|8[0-46-9]|9[0-5])\.

Does he explain why 66.249.85 is exempt? Yes, he probably does. But anything from JDMorgan will be several years old, so it's worth re-checking. If you didn't have that .85 loophole, the 70's and 80's could be reduced to

|[78]\d|
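
To see exactly what the quoted pattern lets through, it can be enumerated over the third octet. A quick sketch; the fourth octet of 1 is an arbitrary placeholder, since the pattern never looks at it:

```python
import re

# JDMorgan's pattern as quoted above.
pattern = re.compile(r"^66\.249\.(6[4-9]|7[0-9]|8[0-46-9]|9[0-5])\.")

# Third octets the pattern accepts, tested against 66.249.N.1.
allowed = [n for n in range(256) if pattern.match(f"66.249.{n}.1")]
# Third octets inside 64-95 that the pattern rejects.
gaps = [n for n in range(64, 96) if n not in allowed]

print(allowed[0], allowed[-1])  # 64 95
print(gaps)                     # [85]
```

So .85 really is the only hole between 64 and 95.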


:: detour to logs ::

I don't think there's any difference at this point. At least not for the common entities like favicon or Preview.

But for most purposes,
66.249.64.0/20
i.e. 66\.249\.(6[4-9]|7\d)
should probably be handled separately from
66.249.80.0/20
i.e. 66\.249\.(8\d|9[0-5])
where the first is crawl, the second is assorted Googloid entities including Preview, Translate, favicon and so on. Thoughtful of them to use exactly this pair of /20 ranges, since it lets you split neatly at 7x|8x ;)
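
That regex-to-CIDR correspondence can be checked mechanically rather than by eye. A small sketch using only the two ranges and patterns quoted above; the leading ^ and trailing \. anchors and the labels "crawl" and "other" are my additions:

```python
import ipaddress
import re

# Each CIDR block paired with the regex claimed to be equivalent to it.
RANGES = {
    "crawl": ("66.249.64.0/20", r"^66\.249\.(6[4-9]|7\d)\."),
    "other": ("66.249.80.0/20", r"^66\.249\.(8\d|9[0-5])\."),
}

def regex_matches_cidr(cidr: str, pattern: str) -> bool:
    """True if the pattern accepts exactly the third octets inside the block."""
    network = ipaddress.ip_network(cidr)
    rx = re.compile(pattern)
    for third in range(256):
        addr = f"66.249.{third}.1"
        in_block = ipaddress.ip_address(addr) in network
        if bool(rx.match(addr)) != in_block:
            return False
    return True

for label, (cidr, pattern) in RANGES.items():
    print(label, cidr, regex_matches_cidr(cidr, pattern))
```

Both pairs come out as exact matches, which is what makes the 7x|8x split so convenient.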
 
