Is it Okay to Block AS8075 MICROSOFT

I get scraped by non-MSN bots from that range

         

martinibuster

5:56 am on Jan 11, 2022 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I continually get scrapers and hackers coming from MSN IP addys.

I'm assuming these are Azure IP addys and won't get crossed with MSNBot...?

AS8075

20.33.0.0 - 20.128.255.255
20.40.0.0/13
20.128.0.0/16
20.36.0.0/14
20.33.0.0/16
20.48.0.0/12
20.64.0.0/10
20.34.0.0/15

not2easy

1:06 pm on Jan 11, 2022 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



The AS name "AS8075" is listed as a host, with IPs via Direct Allocation, but not all of that range is under the same AS name. 20.128.0.0/16 (US/NJ), for example, is listed as
as: ""

20.40.0.0 is listed as
"Microsoft Azure Cloud (westindia)"


ARIN whois lists 20.33.0.0 - 20.128.255.255 as "NetType: Direct Allocation" but I do not believe that is accurate, given the empty as: "" mentioned above. Do you see any non-bot or humanoid activity from the range?

My notes show nothing for that range, either as human traffic or bots. Maybe others here have better data for the range?

lucy24

5:30 pm on Jan 11, 2022 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



:: detour to raw logs ::

Aside from the DDG favicons-bot, which has a specific 1.2.3.4 IP, I don’t find anything legitimate from 20 (the entire /8). In fact, searching for 200 responses--a definite minority--reveals nothing but requests for images and sometimes other supporting files, always without the relevant page. Wonder what that’s about? (But while wondering, I’ll just go set the bad_range flag.)

SumGuy

2:51 am on Jan 14, 2022 (gmt 0)

5+ Year Member Top Contributors Of The Month



I block all of MSFT from hitting my web server if it's not bing-bot. Same goes for Google and google-bot. Same for Amazon (offhand I don't recall any legit bot using AWS). Doing the same for any of these in terms of SMTP blocking is more tricky, but I think I've got that under control over the past 10 or so years. All blocking (based on IP only) happens in my router, as a silent drop. The web / SMTP servers never see them.
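
For anyone wanting to carve out the "if it's not bing-bot" exception in code rather than by IP list, here is a rough sketch of a forward-confirmed reverse DNS check. The post above does not say how the exception is actually made, so this is an assumption on my part: it relies on genuine Bing crawler addresses reverse-resolving to a host under search.msn.com. The function name and the injectable lookup parameters are made up for illustration, and this handles IPv4 only.

```python
import socket

def is_verified_bingbot(ip, reverse_lookup=socket.gethostbyaddr,
                        forward_lookup=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check (hypothetical helper, IPv4 only)."""
    try:
        # Step 1: reverse lookup, IP -> hostname.
        hostname = reverse_lookup(ip)[0]
    except OSError:
        return False
    # Step 2: hostname must sit under search.msn.com (assumed Bing crawler domain).
    if not hostname.lower().rstrip(".").endswith(".search.msn.com"):
        return False
    try:
        # Step 3: forward lookup, hostname -> list of IPs.
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False
    # Step 4: the original IP must be among the forward results.
    return ip in addresses
```

The lookups are passed in as parameters only so the logic can be exercised without touching DNS. On a live server you would cache the result per IP, since doing two DNS lookups on every request is slow.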

blend27

4:26 am on Jan 24, 2022 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



99.n% of bots that come from MSFT have misconfigured HTTP headers.

---------------------------------------
ip: 20.82.186.214
time: {ts '2021-11-22 02:38:33'}
---------------------------------------
connection: keep-alive
accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36
host: example.com
content-length: 0
Keep-Alive: 300
----------------------------------------
^ that is ALL she wrote..... with Chrome/80 UA <<bad botty, baaaad...
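
A rough sketch of how a header set like the one above could be flagged automatically. The post does not spell out which headers count as misconfigured, so the two checks here are my own assumptions about what real Chrome does: it sends Accept-Encoding, and it does not send a Keep-Alive request header. The function name is made up for illustration.

```python
def chrome_header_oddities(headers):
    """Return reasons a Chrome-claiming request looks off (hypothetical heuristic)."""
    # Header names are case-insensitive, so normalise them first.
    h = {name.lower(): value for name, value in headers.items()}
    reasons = []
    if "chrome/" not in h.get("user-agent", "").lower():
        return reasons  # only judge requests that claim to be Chrome
    if "accept-encoding" not in h:
        reasons.append("no Accept-Encoding header")
    if "keep-alive" in h:
        reasons.append("Keep-Alive request header present")
    return reasons

# The header set logged above from 20.82.186.214.
sample = {
    "connection": "keep-alive",
    "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36",
    "host": "example.com",
    "content-length": "0",
    "Keep-Alive": "300",
}
print(chrome_header_oddities(sample))
```

Treat the result as a score to combine with the IP range rather than as grounds to block on its own, since proxies and privacy tools can also strip or add headers.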


----------------------------------------

I have seen a human or two over the years from MSFT ranges, but 2-3 clicks and they are gone.

I use bgp.he.net to look up IP info, although I have written code to get the same thing from freely available data from ARIN, RIPE and such.

Here is the output bgp.he.net gives for your range:
----
NetRange: 20.33.0.0 - 20.128.255.255
CIDR: 20.40.0.0/13, 20.48.0.0/12, 20.64.0.0/10, 20.128.0.0/16, 20.34.0.0/15, 20.36.0.0/14, 20.33.0.0/16
---
That is sliced and diced all the way.
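
The seven CIDR blocks are simply the smallest set of prefixes that exactly covers 20.33.0.0 - 20.128.255.255. A minimal sketch using Python's standard ipaddress module reproduces the split (the variable names are mine):

```python
import ipaddress

first = ipaddress.ip_address("20.33.0.0")
last = ipaddress.ip_address("20.128.255.255")

# summarize_address_range yields the minimal list of networks covering first..last.
blocks = [str(net) for net in ipaddress.summarize_address_range(first, last)]

for block in blocks:
    print(block)
```

This prints the same seven prefixes as the CIDR line above, in ascending order.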

....and here is the rest of IPv4: bgp.he.net/AS8075#_prefixes <<<<< this one makes my eyes bleed...

20.191.45.212 >>> UA: Mozilla/5.0 (compatible; DuckDuckGo-Favicons-Bot/1.0; +http://duckduckgo.com) for a long time now.
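
Going purely by the numbers, that DDG address sits outside the 20.33.0.0 - 20.128.255.255 range, while the 20.82.186.214 scraper sits inside it. A quick sketch to check any IP against the seven blocks (the helper name is made up, and the list covers only the range quoted in this thread, not all of AS8075):

```python
import ipaddress

# The seven CIDR blocks quoted in this thread for 20.33.0.0 - 20.128.255.255.
BLOCKED = [ipaddress.ip_network(cidr) for cidr in (
    "20.33.0.0/16", "20.34.0.0/15", "20.36.0.0/14", "20.40.0.0/13",
    "20.48.0.0/12", "20.64.0.0/10", "20.128.0.0/16",
)]

def in_blocked_range(ip):
    """True if ip falls inside any of the listed blocks (hypothetical helper)."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED)

print(in_blocked_range("20.82.186.214"))  # the Chrome/80 scraper
print(in_blocked_range("20.191.45.212"))  # the DDG favicons bot
```

So blocking just this range would leave the favicons bot alone, but blocking every AS8075 prefix would need an explicit exception for it.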