Home / Forums Index / Search Engines / Search Engine Spider and User Agent Identification
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

Google's Site Verification Bot
A late warning

 1:10 pm on Jun 25, 2014 (gmt 0)

Probably old news to you log-addicts.

My back was turned for a few months executing duties, and G had the nerve to switch its site verification bot to a new IP without telling me. This resulted in my sites losing their verified status, because I have long blocked the irritating IP block used for Google Plus snippets:

# Google Plus snippets
deny from

This line was blocking verification.

So I deigned to allow the bot's IP thus...

# Allow Google Site Verification bot
allow from

That got my main site verified, but not two others. To achieve verification on those, I had to remove my deny of that range; only then were they verified.
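For anyone untangling the same behavior: under Apache 2.2's mod_authz_host, the outcome depends on the Order directive, and with "Order Deny,Allow" a matching Allow overrides an earlier Deny. A minimal sketch, with placeholder addresses (the poster's actual ranges were stripped from this page):

```apache
# Minimal Apache 2.2 sketch - addresses are placeholders, not the
# poster's actual values.
# With "Order Deny,Allow", a request matching both directives is
# allowed: the Allow punches a hole in the denied range.
Order Deny,Allow
Deny from 66.249.80.0/22
Allow from 66.249.80.232
```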

Loss of verification happened in mid-June 2014, but doesn't appear to have affected our ranking in the G SERPs.



 6:36 pm on Jun 25, 2014 (gmt 0)

Thanks for the heads-up. Made me check my own filters. I'm a bit more liberal in the ranges I let through for G so even with that recent change, I was OK... but ya never know :)


 6:57 pm on Jun 25, 2014 (gmt 0)

And I recently blocked it because of proxy referer spam traffic. That IP is listed as a Google proxy server. Now I need to look closer.
IP Address: 66.249.80.232
Host: google-proxy-66-249-80-232.google.com


 7:39 pm on Jun 25, 2014 (gmt 0)

Ah. I knew they must have changed something, because site verification has been showing up in my logs recently, although it's supposed to be ignored. (I'm talking here about my personal log-wrangling routines, not the original raw logs.)

I had to remove my deny of that range; only then were they verified.

That's where conditional RewriteRules come in.

RewriteCond {unwanted IP range}
RewriteCond {request-URI is not site-verification-thingy}
RewriteRule {blahblah ending in [F] }
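Fleshed out, such a rule set might look like the following. This is a hypothetical sketch only: the blocked range is a placeholder, and the file pattern assumes the "google" plus 16 alphanumerics naming convention mentioned in a later post.

```apache
# Hypothetical sketch: block an unwanted range but exempt the
# site-verification file so Google can still confirm ownership.
# Range and file pattern are assumptions, not the poster's values.
RewriteCond %{REMOTE_ADDR} ^66\.249\.8[0-9]\.
RewriteCond %{REQUEST_URI} !^/google[0-9a-z]{16}\.html$
RewriteRule .* - [F]
```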


 7:56 pm on Jun 25, 2014 (gmt 0)

You're right, lucy24, because a quick check shows that the UA
"Mozilla/5.0 (compatible; Google-Site-Verification/1.0)"
has been coming from several IPs in that neighborhood recently.


 8:29 pm on Jun 25, 2014 (gmt 0)

:: detour to look up, which I should have done earlier ::

Ah. The "ignore" code has
so until recently they simply didn't use 66.249.whatever at all. Is it bad when you can type someone's IP from memory? Now duly added.

I guess the time to worry is when random visitors start asking for the correct google verification file, because nobody else should even know its name. Considering the length of its name ("google" followed by 16 alphanumerics), it's pretty unlikely a robot would find it by blind luck.

:: detour to calculator ::

Well, it's got 24 zeros ;)
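Spelling out that calculator detour, assuming 16 case-insensitive alphanumerics (36 choices per character):

```latex
36^{16} \approx 7.96 \times 10^{24}
```

So on the order of 10^24 possible names, which is indeed "24 zeros".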


 6:32 am on Jun 26, 2014 (gmt 0)

Google proxy came through today using one of those IPs. Grabbed an image and left.

Looks like work to do.


 11:17 am on Jun 26, 2014 (gmt 0)

There's a fairly recent thread where somebody noted that the primary bot only utilizes the 64 thru 79 Class C ranges:

RewriteCond %{REMOTE_ADDR} !^66\.249\.(6[4-9]|7[0-9])\.
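Paired with a user-agent check, that negated pattern becomes a forged-Googlebot trap. A sketch, under the assumption from the thread that genuine crawl traffic only comes from 66.249.64-79:

```apache
# Sketch: deny anything claiming to be Googlebot that does not come
# from the 66.249.64-79 crawl range (assumes mod_rewrite enabled).
RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
RewriteCond %{REMOTE_ADDR} !^66\.249\.(6[4-9]|7[0-9])\.
RewriteRule .* - [F]
```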


 11:54 am on Jun 26, 2014 (gmt 0)

Blimey, and I thought +I'd+ been neglecting my logs, not having glanced at them for three months.


 5:03 pm on Jun 26, 2014 (gmt 0)

So, what I have for a valid googlebot in my .htaccess file is,

RewriteCond %{REMOTE_ADDR} ^66\.249\.(6[4-9]|7[0-9]|8[0-46-9]|9[0-5])\.

from JDMorgan at [webmasterworld.com...]

That rule remains true, but inside that range the Google proxy is operating.

Should I rewrite this code to limit it to the 79 Class C, as wilderness suggested?

What Google stuff would I not be allowing, besides the proxy?


 8:49 pm on Jun 26, 2014 (gmt 0)

Deleted message posted to wrong forum.


 9:21 pm on Jun 26, 2014 (gmt 0)

RewriteCond %{REMOTE_ADDR} ^66\.249\.(6[4-9]|7[0-9]|8[0-46-9]|9[0-5])\.

Does he explain why 66.249.85 is exempt? Yes, he probably does. But anything from JDMorgan will be several years old, so it's worth re-checking. If you didn't have that .85 loophole, the 70's and 80's could be reduced to [78]\d.

:: detour to logs ::

I don't think there's any difference at this point. At least not for the common entities like favicon or Preview.

But for most purposes,
i.e. 66\.249\.(6[4-9]|7\d)
should probably be handled separately from
i.e. 66\.249\.(8\d|9[0-5])
where the first is crawl, the second is assorted Googloid entities including Preview, Translate, favicon and so on. Thoughtful of them to use exactly this pair of /20 ranges, since it lets you split neatly at 7x|8x ;)
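One way to act on that split - purely illustrative, tagging rather than blocking, so later rules or logging can treat the two /20 ranges differently:

```apache
# Illustrative only: mark crawl traffic (66.249.64-79) separately
# from assorted Googloid services (66.249.80-95).
RewriteCond %{REMOTE_ADDR} ^66\.249\.(6[4-9]|7\d)\.
RewriteRule ^ - [E=GOOG:crawl]
RewriteCond %{REMOTE_ADDR} ^66\.249\.(8\d|9[0-5])\.
RewriteRule ^ - [E=GOOG:services]
```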

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved