OmniBot

Microsoft Corp.

         

Pfui

6:29 pm on Apr 30, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



At least twice in the last five days, two different URIs:

104.214.235.204
Mozilla/5.0 (compatible; OmniBot/1.0)

robots.txt? Yes BUT promptly ignored each time.

Microsoft Corporation: 104.208.0.0 - 104.215.255.255 (104.208.0.0/13)
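
For anyone wanting to check a logged address against a range like this, here is a minimal sketch using Python's standard `ipaddress` module. The variable names are only illustrative; the address and range are the ones quoted in this post.

```python
import ipaddress

# Range and address as quoted in the post above.
net = ipaddress.ip_network("104.208.0.0/13")
addr = ipaddress.ip_address("104.214.235.204")

first, last = net[0], net[-1]      # lowest and highest address in the block
inside = addr in net               # membership test

print(f"{net} spans {first} - {last} ({net.num_addresses} addresses)")
print(f"{addr} inside {net}: {inside}")
```

The same test works for any other CIDR block pulled from a whois lookup.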

keyplyr

12:18 am on May 3, 2015 (gmt 0)


Think their bot may have once been known as: OmniWeb?

Pfui

1:48 am on May 3, 2015 (gmt 0)


OmniWeb? Well, nope. Google sez OmniWeb's a Mac browser that's been around since at least 2001, and is still going. [omnigroup.com...]

OmniBot the bot may be something too old for much data, or too new, because there's very little info for the specific UA string. Here's another mention of what I saw, also from the same Microsoft IP: [botsvsbrowsers.com...]

Most intriguing to me is Microsoft running an unknown, unannounced, robots.txt-ignoring bot from a bare IP.

keyplyr

3:38 am on May 3, 2015 (gmt 0)


A browser as well, maybe, but I'm referring to OmniWeb the bot, from that same M$ range, circa 2012. And where there's a bot, there's a bot runner. So who is it?

lucy24

5:06 am on May 3, 2015 (gmt 0)


104.208.0.0/13

What I found odd-- after re-checking raw logs --is that as far as I know, I have never in my life met anyone from this range. Ever. And that's just weird for a /13 (of any kind, anywhere). I didn't even know it was Microsoft, although free lookup says they've had it since 1997. What on earth have they been doing with it all this time?

dstiles

7:53 pm on May 4, 2015 (gmt 0)


I block the range as probable servers, as I do with most MS that hasn't identified itself. Possibly business cloud storage?

trintragula

3:19 pm on May 13, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



I've just seen this bot from 104.40.0.0/13. I also have records of various other kinds of misbehaviour from that range, all of it since Christmas 2014.

The Microsoft Azure (cloud) ranges can be downloaded here: [microsoft.com...] (assuming the link is right). Pfui's original sighting also appears to be from one of those ranges.
If you're watching AWS, you're probably going to have to watch Azure as well.

trintragula

5:02 pm on May 13, 2015 (gmt 0)


Addendum:
"This file contains the Compute IP address ranges (including SQL ranges) used by the Microsoft Azure Datacenters. A new xml file will be uploaded every Wednesday (Pacific Time) with the new planned IP address ranges. New IP address ranges will be effective on the following Monday (Pacific Time). Please download the new xml file and perform the necessary changes on your site before Monday."

So it's a regular automated update - same as AWS.
It's a shame they haven't coordinated their data formats. JSON for one, XML for the other. Oh well.
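
Since the two formats differ, one way to cope is to reduce both to a plain list of CIDR strings. The sketch below is not a definitive parser: the JSON keys (`prefixes`, `ip_prefix`) and the XML element and attribute names (`IpRange`, `Subnet`) are assumptions about the published files, and the sample data uses documentation-only addresses rather than real cloud ranges.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical samples; structure is assumed, addresses are documentation ranges.
JSON_SAMPLE = '{"prefixes": [{"ip_prefix": "203.0.113.0/24", "region": "example-1"}]}'
XML_SAMPLE = (
    '<AzurePublicIpAddresses>'
    '<Region Name="example"><IpRange Subnet="198.51.100.0/24"/></Region>'
    '</AzurePublicIpAddresses>'
)

def ranges_from_json(text):
    """Return the CIDR strings from a JSON document shaped like JSON_SAMPLE."""
    return [entry["ip_prefix"] for entry in json.loads(text)["prefixes"]]

def ranges_from_xml(text):
    """Return the CIDR strings from an XML document shaped like XML_SAMPLE."""
    root = ET.fromstring(text)
    return [el.get("Subnet") for el in root.iter("IpRange")]

# One merged, de-duplicated list regardless of the source format.
combined = sorted(set(ranges_from_json(JSON_SAMPLE) + ranges_from_xml(XML_SAMPLE)))
print(combined)
```

If either provider's real layout differs, only the two small extraction functions need to change.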

I looked for DigitalOcean to see if they publish their ranges and found this question from last year:
[digitalocean.com...]
"Other cloud providers regularly publish the public IP address ranges that can be assigned to customer virtual machines. This information is required if a company wishes to control outgoing connections from its intranet."

and DigitalOcean's reply:
"We don't currently publish our public IP ranges. We are getting new ones all the time, so it would likely be out of date shortly. Though you might want to submit the idea to UserVoice to demonstrate that there is demand for this: [digitalocean.uservoice.com...]"

The point of them doing it is that only they have up-to-date information, and they are in a good position to automate its publication... We can get out-of-date lists from lots of places.

So I looked on their UserVoice site and this idea has 15 votes, but not yet any follow-up from DO. A lot of the other suggestions have thousands of votes, so I'm guessing it's not a very high priority for them, even though it would probably be pretty easy to do.

I have had a look for IP range lists published by OVH or Hetzner. No luck so far...

lucy24

7:38 pm on May 13, 2015 (gmt 0)


"We are getting new ones all the time, so it would likely be out of date shortly."

I had no idea Digital Ocean's public web pages were designed by the same people who put out, let's say, TV schedules. If printing something on dead trees has a lead time of anywhere from a day and a half to a few weeks to several months, then obviously your website must also always be months out of date. So no point in publishing the information at all.

Pfui

7:39 pm on May 13, 2015 (gmt 0)


- OVH and Hetzner. Oy. So so so bad for so so so long.

- MS's OmniBot returned again today, but now from 104.40.228.49 (previously 104.214.235.204). Similar CIDR as seen by trintragula:

104.40.0.0 - 104.47.255.255
104.40.0.0/13

trintragula

7:39 pm on May 13, 2015 (gmt 0)


I'm off topic and on a roll here - yeeee hah!

So what about Google cloud/AppEngine? Do they publish those?
They do!

Text file? Go program? Handy google SERP? Nooo... nslookup! Obvious choice!


C:\>nslookup -q=TXT _cloud-netblocks.googleusercontent.com 8.8.8.8
Server: google-public-dns-a.google.com
Address: 8.8.8.8

Non-authoritative answer:
_cloud-netblocks.googleusercontent.com text =

"v=spf1 include:_cloud-netblocks1.googleusercontent.com include:_cloud-netblocks2.googleusercontent.com include:_cloud-netblocks3.googleusercontent.com include:_cloud-netblocks4.googleusercontent.com include:_cloud-netblocks5.googleusercontent.com ?all"
C:\>nslookup -q=TXT _cloud-netblocks1.googleusercontent.com 8.8.8.8
Server: google-public-dns-a.google.com
Address: 8.8.8.8

Non-authoritative answer:
_cloud-netblocks1.googleusercontent.com text =

"v=spf1 ip4:8.34.208.0/20 ip4:8.35.192.0/21 ip4:8.35.200.0/23 ip4:108.59.80.0/20 ip4:108.170.192.0/20 ip4:108.170.208.0/21 ip4:108.170.216.0/22 ip4:108.170.220.0/23 ip4:108.170.222.0/24 ?all"

... and so on, which results in:

8.34.208.0/20
8.35.192.0/21
8.35.200.0/23
23.236.48.0/20
23.251.128.0/19
104.154.0.0/15
104.196.0.0/14
107.167.160.0/19
107.178.192.0/18
108.59.80.0/20
108.170.192.0/20
108.170.208.0/21
108.170.216.0/22
108.170.220.0/23
108.170.222.0/24
130.211.4.0/22
130.211.8.0/21
130.211.16.0/20
130.211.32.0/19
130.211.64.0/18
130.211.128.0/17
146.148.2.0/23
146.148.4.0/22
146.148.8.0/21
146.148.16.0/20
146.148.32.0/19
146.148.64.0/18
162.216.148.0/22
162.222.176.0/21
173.255.112.0/20
192.158.28.0/22
199.192.112.0/22
199.223.232.0/22
199.223.236.0/23

... straight from the horse's mouth.
A mere half million addresses, compared with Azure's 1.7 million and AWS's 12 million...
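
The Google figure can be checked by adding up the sizes of the listed blocks. A minimal sketch with the standard `ipaddress` module follows; the list is copied from this post, and a plain sum is valid only because none of the blocks overlap.

```python
import ipaddress

# The 34 ranges listed above, copied as-is.
GOOGLE_RANGES = """
8.34.208.0/20 8.35.192.0/21 8.35.200.0/23 23.236.48.0/20 23.251.128.0/19
104.154.0.0/15 104.196.0.0/14 107.167.160.0/19 107.178.192.0/18
108.59.80.0/20 108.170.192.0/20 108.170.208.0/21 108.170.216.0/22
108.170.220.0/23 108.170.222.0/24 130.211.4.0/22 130.211.8.0/21
130.211.16.0/20 130.211.32.0/19 130.211.64.0/18 130.211.128.0/17
146.148.2.0/23 146.148.4.0/22 146.148.8.0/21 146.148.16.0/20
146.148.32.0/19 146.148.64.0/18 162.216.148.0/22 162.222.176.0/21
173.255.112.0/20 192.158.28.0/22 199.192.112.0/22 199.223.232.0/22
199.223.236.0/23
""".split()

networks = [ipaddress.ip_network(cidr) for cidr in GOOGLE_RANGES]

# Each block contributes 2 ** (32 - prefix length) addresses.
total = sum(net.num_addresses for net in networks)
print(len(networks), "ranges,", total, "addresses")
```

For this list the sum comes to 556,288 addresses, which fits the "half million" estimate.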

But at least you can keep up-to-date as often as you want.
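
The include-following done by hand with nslookup above could be automated along these lines. This is only a sketch: `collect_ip4` is a made-up helper name, and the DNS lookup is stubbed with a dictionary holding just the two TXT records shown in this post, so includes whose contents were not shown are simply skipped. A live version would swap the dictionary for a real TXT query.

```python
def collect_ip4(name, lookup, seen=None):
    """Follow SPF 'include:' terms starting at name and return ip4 ranges in order.

    lookup is any callable mapping a DNS name to its SPF TXT string, or None.
    """
    seen = set() if seen is None else seen
    if name in seen:                 # guard against include loops
        return []
    seen.add(name)
    record = lookup(name)
    if record is None:               # record unavailable: nothing to add
        return []
    ranges = []
    for term in record.split():
        if term.startswith("ip4:"):
            ranges.append(term[len("ip4:"):])
        elif term.startswith("include:"):
            ranges.extend(collect_ip4(term[len("include:"):], lookup, seen))
    return ranges

# Stand-in for DNS: only the two records quoted in the post.
RECORDS = {
    "_cloud-netblocks.googleusercontent.com":
        "v=spf1 include:_cloud-netblocks1.googleusercontent.com "
        "include:_cloud-netblocks2.googleusercontent.com "
        "include:_cloud-netblocks3.googleusercontent.com "
        "include:_cloud-netblocks4.googleusercontent.com "
        "include:_cloud-netblocks5.googleusercontent.com ?all",
    "_cloud-netblocks1.googleusercontent.com":
        "v=spf1 ip4:8.34.208.0/20 ip4:8.35.192.0/21 ip4:8.35.200.0/23 "
        "ip4:108.59.80.0/20 ip4:108.170.192.0/20 ip4:108.170.208.0/21 "
        "ip4:108.170.216.0/22 ip4:108.170.220.0/23 ip4:108.170.222.0/24 ?all",
}

found = collect_ip4("_cloud-netblocks.googleusercontent.com", RECORDS.get)
print(found)
```

Passing the lookup in as a parameter keeps the parsing logic testable without any network access.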

trintragula

8:06 pm on May 13, 2015 (gmt 0)


If OmniBot comes from Microsoft Azure, then it's probably an MS cloud customer, not MS itself. Unless someone knows different. I suppose MS are entitled to use their own service, but I'm not assuming it's them.