
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
stremorbot
wilderness




msg:4694973
 11:33 pm on Aug 12, 2014 (gmt 0)

There's a closed 2013 thread on stremor [webmasterworld.com] by lucy

107.178.200.5 - - [12/Aug/2014:15:36:41 -0600] "GET /robots.txt HTTP/1.1" 200 1115 "-" "stremorbot AppEngine-Google; (+http://code.google.com/appengine; appid: s~stremor-crawler)"

followed by six successive requests for the same page from two slightly different UAs.

It returned nine minutes later with four requests for the same page, repeating the two slightly different UAs.

"AppEngine-Google; (+http://code.google.com/appengine; appid: s~liquid-helium)"

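For anyone who wants to pull the IP and UA out of entries like the one quoted above, here is a minimal sketch. It assumes the line follows the Apache "combined" log format, which is what the quoted entry looks like; the names `LOG_PATTERN` and `parse_log_line` are made up for illustration.

```python
import re

# Assumption: the log uses the Apache "combined" format
# (ip, ident, user, [time], "request", status, size, "referer", "agent").
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_log_line(line):
    """Return a dict of fields from one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

# The entry quoted in the post above.
line = ('107.178.200.5 - - [12/Aug/2014:15:36:41 -0600] "GET /robots.txt HTTP/1.1" '
        '200 1115 "-" "stremorbot AppEngine-Google; '
        '(+http://code.google.com/appengine; appid: s~stremor-crawler)"')

entry = parse_log_line(line)
print(entry["ip"], "|", entry["agent"])
```

The regex does not handle escaped quotes inside the UA field, so treat it as a starting point rather than a full parser.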
 

keyplyr




msg:4695391
 11:42 pm on Aug 13, 2014 (gmt 0)

On my servers, that guy would get blocked for several reasons:

1.) by range: 107.178.192.0/18 Google Cloud

2.) by UA "AppEngine"

3.) by UA "code"

4.) by UA "crawler"

...and I haven't even seen the request headers yet - just say'n :)
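The four rules above can be sketched roughly as follows. This is only an illustration of the logic, not anyone's actual server config: it assumes case-sensitive substring matching on the UA, and the names `BLOCKED_NETWORKS`, `BLOCKED_UA_TOKENS` and `block_reasons` are hypothetical.

```python
import ipaddress

# Rule 1: the range listed above as Google Cloud.
BLOCKED_NETWORKS = [ipaddress.ip_network("107.178.192.0/18")]
# Rules 2-4: UA substrings (assumption: plain, case-sensitive substring match).
BLOCKED_UA_TOKENS = ["AppEngine", "code", "crawler"]

def block_reasons(ip, user_agent):
    """Return every rule that would block this request (empty list = allowed)."""
    reasons = []
    addr = ipaddress.ip_address(ip)
    for net in BLOCKED_NETWORKS:
        if addr in net:
            reasons.append(f"range {net}")
    for token in BLOCKED_UA_TOKENS:
        if token in user_agent:
            reasons.append(f'UA "{token}"')
    return reasons

ua = ("stremorbot AppEngine-Google; "
      "(+http://code.google.com/appengine; appid: s~stremor-crawler)")
reasons = block_reasons("107.178.200.5", ua)
print(reasons)
```

Run against the request from the first post, all four rules match. Note that a token as short as "code" will also catch unrelated UAs, which is fine if that is the intent.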

wilderness




msg:4695396
 12:20 am on Aug 14, 2014 (gmt 0)

"At my servers, that guy would get blocked by several reasons:"

Same for me; however, I hadn't denied the IP range until now.

Are there more Google Clouds?

GOOGLE-CLOUD 104.154.0.0 - 104.155.255.255 104.154.0.0/15
GOOGLE-CLOUD 107.167.160.0 - 107.167.191.255 107.167.160.0/19
GOOGLE-CLOUD 107.178.192.0 - 107.178.255.255 107.178.192.0/18
GOOGLE-CLOUD 108.59.80.0 - 108.59.95.255 108.59.80.0/20
GOOGLE-CLOUD 130.211.0.0 - 130.211.255.255 130.211.0.0/16
GOOGLE-CLOUD 146.148.0.0 - 146.148.127.255 146.148.0.0/17
GOOGLE-CLOUD 162.216.148.0 - 162.216.151.255 162.216.148.0/22
GOOGLE-CLOUD 162.222.176.0 - 162.222.183.255 162.222.176.0/21
GOOGLE-CLOUD 173.255.112.0 - 173.255.127.255 173.255.112.0/20
GOOGLE-CLOUD 192.158.28.0 - 192.158.31.255 192.158.28.0/22
GOOGLE-CLOUD 199.192.112.0 - 199.192.115.255 199.192.112.0/22
GOOGLE-CLOUD 199.223.232.0 - 199.223.239.255 199.223.232.0/21
GOOGLE-CLOUD 23.236.48.0 - 23.236.63.255 23.236.48.0/20
GOOGLE-CLOUD 23.251.128.0 - 23.251.159.255 23.251.128.0/19
GOOGLE-CLOUD 2600:1900:: - 2600:190F:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF
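If you only have the start and end address of a range, the CIDR form can be derived with Python's standard `ipaddress` module. A small sketch, with `range_to_cidrs` being a made-up helper name:

```python
import ipaddress

def range_to_cidrs(first, last):
    """Convert a start/end address pair into the CIDR block(s) that cover it exactly."""
    networks = ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last)
    )
    return [str(net) for net in networks]

# An IPv4 range from the list above.
ipv4_cidrs = range_to_cidrs("104.154.0.0", "104.155.255.255")
# The IPv6 range from the list above, which was posted without a CIDR.
ipv6_cidrs = range_to_cidrs("2600:1900::", "2600:190F:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF")

print(ipv4_cidrs)  # ['104.154.0.0/15']
print(ipv6_cidrs)  # ['2600:1900::/28']
```

A range that does not sit on a clean boundary comes back as several CIDR blocks instead of one.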

not2easy




msg:4695411
 1:38 am on Aug 14, 2014 (gmt 0)

I have a few more here:
8.34.208.0 - 8.34.215.255 8.34.208.0/21
8.34.216.0 - 8.34.223.255 8.34.216.0/21
8.35.192.0 - 8.35.199.255 8.35.192.0/21
8.35.200.0 - 8.35.207.255 8.35.200.0/21
66.102.0.0 - 66.102.15.255 66.102.0.0/20
108.170.192.0 - 108.170.255.255 108.170.192.0/18

I am pretty sure there are more.
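Some of these /21 blocks sit next to each other and can be merged into fewer deny lines. A short sketch using `ipaddress.collapse_addresses` from the standard library (the variable names are just for illustration):

```python
import ipaddress

# The six ranges from the post above, in CIDR form.
posted = [
    "8.34.208.0/21",
    "8.34.216.0/21",
    "8.35.192.0/21",
    "8.35.200.0/21",
    "66.102.0.0/20",
    "108.170.192.0/18",
]

# collapse_addresses merges adjacent or overlapping networks where possible.
collapsed = [
    str(net)
    for net in ipaddress.collapse_addresses(ipaddress.ip_network(c) for c in posted)
]
print(collapsed)
# ['8.34.208.0/20', '8.35.192.0/20', '66.102.0.0/20', '108.170.192.0/18']
```

The two 8.34.x.x blocks and the two 8.35.x.x blocks each merge into a single /20; the other two are left unchanged.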

wilderness




msg:4706617
 9:04 am on Oct 6, 2014 (gmt 0)

I would never have believed that I'd need to add this complete set of IPs in such a short while.

With six of the ranges having been used since this thread opened, I added the remaining ones as well.

keyplyr




msg:4706618
 9:20 am on Oct 6, 2014 (gmt 0)

Just an FYI: as with most things, these ranges may not be something all webmasters should block. Some of them are used for things that some site owners may find beneficial.

For example, this Google Cloud range retrieves the images used when you or someone else posts a link to one of your pages at Google+:
66.102.0.0 - 66.102.15.255
66.102.0.0/20

Having a nice, pretty image to click on brings in a lot more traffic than just a text link. Two more of the above ranges are used by apps (from Google Play) that also may be beneficial to some site owners. Mobile is a huge traffic source and one that should be nurtured, not blocked IMO. Personally, I'm a big fan of incoming traffic.
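One way to keep a beneficial range open while still denying the rest is to check an allow list before the deny list. A rough sketch of that ordering, with hypothetical names (`ALLOW`, `DENY`, `is_blocked`) and only two of the ranges from this thread filled in:

```python
import ipaddress

# Ranges denied wholesale (only two from this thread, for brevity).
DENY = [ipaddress.ip_network(c) for c in ("66.102.0.0/20", "107.178.192.0/18")]
# Exceptions checked first: the range described above as fetching link images.
ALLOW = [ipaddress.ip_network("66.102.0.0/20")]

def is_blocked(ip):
    """Allow-list entries win over deny-list entries."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in ALLOW):
        return False
    return any(addr in net for net in DENY)

print(is_blocked("66.102.6.1"))     # inside the allowed range
print(is_blocked("107.178.200.5"))  # inside a denied range only
```

Which ranges belong in the allow list depends on the site; the one shown here is simply the example given in the post.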

wilderness




msg:4706628
 10:14 am on Oct 6, 2014 (gmt 0)

"Personally, I'm a big fan of incoming traffic."

As am I, just not from obscured IPs.


WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved