Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

Definitive Sources of IP Lists

 8:03 am on Nov 28, 2012 (gmt 0)

I've been looking for definitive IP list sources that are maintained by the network owner, not guessed at by third parties through observation. A couple I came across:

Microsoft's Azure Cloud Services provides an XML file:

Google provides a simple list in DNS from the command line:
nslookup -type=TXT _netblocks.google.com

Which returns:
_netblocks.google.com text = "v=spf1 ip4: ip4: ip4: ip4: ip4: ip4: ip4: ip4: ip4: ip4: ?all"
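For anyone who wants to process that TXT record automatically, here is a minimal Python sketch that pulls the ip4 entries out of an SPF string. The function name `parse_spf_ip4` and the sample record are my own placeholders (the sample uses documentation ranges, not Google's real netblocks), and it only handles `ip4:` terms, not `include:` or `ip6:`.

```python
import ipaddress

def parse_spf_ip4(txt_record):
    """Return the ip4 networks listed in an SPF TXT record string."""
    networks = []
    for term in txt_record.strip('"').split():
        # SPF mechanisms may carry a qualifier prefix (+, -, ~, ?).
        mechanism = term.lstrip("+-~?")
        if mechanism.lower().startswith("ip4:"):
            value = mechanism[4:]
            if value:  # skip entries with no address after "ip4:"
                networks.append(ipaddress.ip_network(value, strict=False))
    return networks

# Hypothetical record built from documentation ranges, for illustration only.
sample = '"v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.0/25 ?all"'
nets = parse_spf_ip4(sample)
print([str(n) for n in nets])
```

Feeding it the output of the nslookup command above would be a matter of passing in the quoted text portion of the answer.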

Anyone else got any direct IP list sources like these for Yahoo, Bing, etc.?



 12:39 pm on Nov 28, 2012 (gmt 0)

That's a great list - thank you for sharing.


 7:29 pm on Nov 28, 2012 (gmt 0)

Thought I'd add the IP assignments per country for those new to access control that don't know about these:


Anyone got anything else good, anyone?


 9:48 pm on Nov 28, 2012 (gmt 0)

The MS lists are (mostly) very short ranges! A real pain to add to blocklists. Probably worth doing, though, if they are clouds.

The v=spf1 returns are for mail (SPF sender checking, which is often used alongside DKIM but is a separate mechanism) - they specify IPs that can send mail, so they are probably not a threat to web servers.


Looking more closely at the MS ranges: some can be concatenated; others I already have listed as bot IPs. Does that mean they are changing usage?
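On concatenating the short ranges: one way to do it, assuming the ranges are already in CIDR form, is Python's standard `ipaddress.collapse_addresses`, which merges adjacent and overlapping networks. The helper name `merge_ranges` and the sample ranges below are placeholders (documentation addresses, not actual MS ranges).

```python
import ipaddress

def merge_ranges(cidrs):
    """Collapse adjacent or overlapping IPv4 CIDR blocks into the fewest networks."""
    nets = [ipaddress.ip_network(c, strict=False) for c in cidrs]
    # collapse_addresses requires all networks to be the same IP version.
    return [str(n) for n in ipaddress.collapse_addresses(nets)]

# Hypothetical short ranges for illustration only.
short_ranges = [
    "192.0.2.0/26",
    "192.0.2.64/26",
    "192.0.2.128/25",
    "198.51.100.0/24",
]
merged = merge_ranges(short_ranges)
print(merged)
```

The three 192.0.2.x blocks are contiguous, so they fold into a single /24 while the unrelated block is left alone.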


 11:40 pm on Nov 28, 2012 (gmt 0)

From what I read, the Google SPF lists are pretty much all-encompassing because they change IPs for stuff all the time, and that's pretty much the list I had too. It says right on their website that this is where to get the definitive list of IPs. That comment was in the Google Developer section, I think, and wasn't part of an email discussion.

MS IP usage is all over the map so who knows. I think any IP is fair game at MS to be used for anything.

I just want sources that I can download, process automatically, and keep updated without them going stale. These look good so far, but you're right, the MS list needs some concatenation.


 8:38 pm on Nov 29, 2012 (gmt 0)

It figures that G would abuse the spf system. :(

I've been wanting to ban G mail for a long time but my customers and their customers still insist on using it.

A while ago I began blocking all "bot company" IP ranges except for known bot IPs. This seems to block some "mobile" proxies but since Y and G seem to use these IPs indiscriminately, tough.


 1:44 pm on Dec 1, 2012 (gmt 0)

From the Azure XML file:
Now I see "msnbot-media/1.1 (+http://search.msn.com/msnbot.htm)" coming from <=> msnbot-65-55-211-186.search.msn.com. This Azure IP list is garbage.
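When a published list and the observed traffic disagree like this, a common cross-check is forward-confirmed reverse DNS: look up the hostname for the IP, check that it ends in a suffix you trust, then resolve that hostname back and confirm it returns the same IP. Below is a rough Python sketch of that idea. The function name `verify_crawler`, the injectable `reverse` and `forward` parameters, and the choice of `.search.msn.com` as a trusted suffix are my assumptions, taken only from the hostname shown above. It covers IPv4 only.

```python
import socket

def verify_crawler(ip, allowed_suffixes, reverse=None, forward=None):
    """Forward-confirmed reverse DNS check for a claimed crawler IP (IPv4 only)."""
    # Default resolvers use the system DNS; they can be replaced for testing.
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse(ip).lower().rstrip(".")
        # The PTR name must end in one of the trusted suffixes.
        if not host.endswith(tuple(allowed_suffixes)):
            return False
        # The hostname must resolve back to the original IP.
        return ip in forward(host)
    except OSError:
        # Lookup failures (socket.herror, socket.gaierror) count as unverified.
        return False
```

Usage would look something like `verify_crawler(remote_ip, [".search.msn.com"])`, with the suffix list adjusted per bot. The leading dot in the suffix matters, otherwise a hostname like `notsearch.msn.com` style lookalikes could slip through.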


 2:38 pm on Dec 1, 2012 (gmt 0)

I started using the BGP Toolkit from the Hurricane Electric site a while back for finding IP ranges that belong to a specific organization, colo, or datacenter. These are at a more granular level. Seems to work OK.

Goog IPs for example: bgp.he.net/search?search[search]=google .

The RIR lists from the FTP sites that incrediBILL mentioned are also pulled on a monthly basis and processed into a DB. I also have a subscription to the ip2location DB.
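For processing the RIR lists, here is a small Python sketch, assuming the files are the pipe-delimited "delegated" statistics files where each IPv4 record reads `registry|cc|type|start|value|date|status` and `value` is a count of addresses rather than a prefix length. The function name `parse_delegated_ipv4` and the sample lines are made up for illustration (documentation ranges and a placeholder country code), so check them against the real files before relying on this.

```python
import ipaddress

def parse_delegated_ipv4(lines, country):
    """Return IPv4 networks assigned to a country code in a delegated stats file."""
    networks = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        fields = line.split("|")
        # Header and summary lines fail this check and are skipped.
        if len(fields) < 7 or fields[2] != "ipv4" or fields[1] != country:
            continue
        first = ipaddress.IPv4Address(fields[3])
        last = first + int(fields[4]) - 1
        # A count that is not a power of two expands to several CIDR blocks.
        networks.extend(ipaddress.summarize_address_range(first, last))
    return networks

# Hypothetical sample lines for illustration only.
sample = [
    "2|example|20121201|2|19700101|20121201|+0000",
    "example|*|ipv4|*|2|summary",
    "example|ZZ|ipv4|192.0.2.0|256|20121201|assigned",
    "example|ZZ|ipv4|198.51.100.0|384|20121201|allocated",
    "example|ZZ|ipv6|2001:db8::|32|20121201|assigned",
]
blocks = [str(n) for n in parse_delegated_ipv4(sample, "ZZ")]
print(blocks)
```

The 384-address record shows why the count needs converting: it comes out as a /24 plus a /25 rather than one block.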


 8:42 pm on Dec 1, 2012 (gmt 0)

Interesting stuff for sure. I use ip2location as well, but maxmind has a really great one too that I use on an as-needed basis. You can look up 25 per hour free on their site.
GeoIP City
GeoIP Country
GeoIP Region
GeoIP Organization (I especially like this one)
GeoIP Netspeed
GeoIP Domain Name

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved