
Seeing hits from 3.0.0.0/8 for first time

It's Amazon

     
2:32 am on Feb 3, 2019 (gmt 0)

Junior Member (joined: Sept 8, 2016; posts: 95; votes: 0)


The first ever hits to my website from 3.anything happened on Nov 19 last year, from 3.16.217.54 (Amazon). A few days later I added all the CIDRs from AS16509 to my router's blocking list. I see that a couple of hits somehow got through from 3.8.17.44 on Dec 25, and today my .html files got scraped from 3.17.157.70. Not sure why; they should have been blocked at the router. So I've blocked them at the server for now.

The hit today had this UA:

Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.34 (KHTML like Gecko) Qt/4.8.2

If more of this stuff gets through, I'll probably block the entire 3/8 in the server. Don't know who else is using that /8. That makes two /8s that Amazon has polluted. Why does Bozo need so much IP space?
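
If I do, the server-side block could look something like this; just a sketch, assuming Apache 2.4 with mod_authz_core in an .htaccess or vhost (other servers would need their own syntax):

<RequireAll>
    Require all granted
    # refuse anything from Amazon's 3.0.0.0/8 allocation
    Require not ip 3.0.0.0/8
</RequireAll>

On old Apache 2.2 syntax the equivalent would be "Order Allow,Deny" / "Allow from all" / "Deny from 3.0.0.0/8".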
6:39 am on Feb 3, 2019 (gmt 0)

lucy24, Senior Member from US (joined: Apr 9, 2011; posts: 15756; votes: 828)


"The first ever..."
Huh. I did a quick log search because I couldn't remember one way or the other, and found a scattered handful from August-October, with a bit of a leap upward in November, a bigger leap in December, and a still bigger leap in January (twice as many requests as all of 2018 put together).

:: poring over details ::

The overwhelming majority (97%) are from 3.16-17. An even more overwhelming majority were blocked; the non-blocked ones turned out to be several authorized robots from 3.elsewhere (loosely, .80-85 and .120-121, which, er, I guess means this area is no longer General Electric). An equally vast majority requested only the root. (Recently in another thread I looked at root requests from all sources and was staggered to find that, overall, something like 2/3 or 3/4 of them were blocked.)

On the blocked side, User-Agents include PleskBot, something containing the suspicion-provoking string "tracemyfile", and something citing empolis dot com, but mostly humanoid UAs. The only thing I don't see yet is requests getting a 418 response (mod_security, mostly Chrome/40.whatever-it-was).

It's probably safe to block 3.16.0.0/15. If you allow distributed robots, take a closer look to see if any of them use other parts of 3.
6:05 pm on Feb 3, 2019 (gmt 0)

aristotle, Senior Member (joined: Aug 4, 2008; posts: 3631; votes: 365)


Several years ago I decided that it's more trouble than it's worth to try to block every bad bot. I have six sites, and it would consume a lot of time to try to keep the bots out of all of them.

So now I only block bots that come around often enough to annoy me. Most of the others will eventually stop coming anyway.
6:41 pm on Feb 3, 2019 (gmt 0)

lucy24, Senior Member from US (joined: Apr 9, 2011; posts: 15756; votes: 828)


My several-years-ago decision was to stop playing whack-a-mole with every new server farm, and change over to header-based access controls. (For example, a single header check took care of an entire unwanted geographical region that formerly took up dozens and dozens of “Deny from...” lines.) Yes, a few humanoid robots get in, but the vast majority don't: Consider only what I found looking up requests from 3, an IP range I have never given a moment's thought to. Out of sight, out of mind.

Admittedly I do cheat a little by setting an environment variable called bad_range for selected addresses that are almost certainly up to no good (and then unsetting it for distributed robots by name). But that's rare.
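
For anyone who wants to try it, the bad_range part could be sketched roughly like this with Apache's mod_setenvif; the specific ranges and the bingbot exception below are made up purely for illustration, only the variable name is as described:

# flag selected address ranges that are almost certainly up to no good
SetEnvIf Remote_Addr ^3\.1[67]\. bad_range
SetEnvIf Remote_Addr ^198\.51\.100\. bad_range

# ...then clear the flag for a distributed robot you allow, by name
# (a UA string is trivially faked, so treat this as a courtesy, not a guarantee)
SetEnvIf User-Agent "bingbot" !bad_range

<RequireAll>
    Require all granted
    Require not env bad_range
</RequireAll>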
8:04 pm on Feb 3, 2019 (gmt 0)

aristotle, Senior Member (joined: Aug 4, 2008; posts: 3631; votes: 365)


Yes, Lucy. I remember reading what you've said previously about doing it that way, and I'm sure it's better than what I do. But I would have to do a lot of studying before I could set that method up.

I do have one quick question: Does that method work against botnets?
9:02 pm on Feb 3, 2019 (gmt 0)

lucy24, Senior Member from US (joined: Apr 9, 2011; posts: 15756; votes: 828)


"Does that method work against botnets?"
Depends on how cleverly the robot is coded. Even in this day and age, a remarkable number of robots don't even send a User-Agent. I do get a few fully humanoid robots every day, and most of them come from all over the map. At any given time, there are certain patterns of requests that are clearly a botnet but there's not a heck of a lot I can do with them. Unfortunately they tend to involve the Contact page--which isn't in an utterly generic location, like /contact.html at the root, so I know that somebody once located it and told all the other robots.
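
The no-User-Agent ones, at least, are trivial to turn away; a minimal sketch with mod_rewrite (Apache assumed, not necessarily what anyone here is running) would be:

RewriteEngine On
# an empty %{HTTP_USER_AGENT} covers both a blank header and no header at all
RewriteCond %{HTTP_USER_AGENT} ^$
RewriteRule ^ - [F]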

My other happy discovery is that, since I'm already rewriting robots.txt requests to a php file, I can also serve up a truncated robots.txt with a blanket Disallow to certain agents that are obviously lying in their teeth (either by claiming to be human browsers, or by sending a referer). Yup, the environment variable is called lying_bot. What else could I call it? :) Some humans will see the same thing, but hey, what were they snooping around for anyway?
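
If anyone wants to copy the trick, the moving parts could look roughly like this; the robots.php name and the Referer test are illustrative, only the lying_bot variable is as described (Apache mod_rewrite assumed):

RewriteEngine On

# flag a robots.txt request that arrives with a Referer header;
# genuine crawlers fetching robots.txt don't send one
RewriteCond %{HTTP_REFERER} .
RewriteRule ^robots\.txt$ - [E=lying_bot:1]

# hand every robots.txt request to a PHP script, which can check the
# variable (seen as REDIRECT_lying_bot after the internal rewrite) and
# serve either the real file or a blanket "User-agent: *" / "Disallow: /"
RewriteRule ^robots\.txt$ /robots.php [L]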
10:08 pm on Feb 3, 2019 (gmt 0)

aristotle, Senior Member (joined: Aug 4, 2008; posts: 3631; votes: 365)


Lucy -- A few years ago you showed me how to block, on one of my sites, a botnet that uses my home page as a kind of "self-referer". I've always appreciated your help with that. That botnet is still active.

I was able to block another botnet because of an oddity in the UA string it uses.

But some botnets could conceivably be almost impossible to block. Luckily, as long as they stay at a low level of activity, they don't do much harm.
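
For anyone curious, a "self-referer" block along those lines could be sketched like this; example.com stands in for the real hostname, and the odd-UA token and keep_out name are purely illustrative, not necessarily what either of us is running:

RewriteEngine On

# botnet signature: a request for the home page that names the home page
# as its own Referer (^$ matches the home page in a root .htaccess)
RewriteCond %{HTTP_REFERER} ^https?://(www\.)?example\.com/?$ [NC]
RewriteRule ^$ - [F]

# the other case: an oddity in the UA string that no real browser sends
SetEnvIfNoCase User-Agent "OddTokenHere" keep_out
<RequireAll>
    Require all granted
    Require not env keep_out
</RequireAll>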
2:41 am on Feb 4, 2019 (gmt 0)

Junior Member (joined: Sept 8, 2016; posts: 95; votes: 0)


I don't focus on the bots per se. When I see a hit that looks to be machine-generated (aside from Google, Bing, Yahoo and corporate web-surfing aids and proxies), I hunt down the IP and the AS, and either block the entire /16 the IP is in, or block the whole AS (if it's a two-bit operation). If the AS is a big outfit but it's in a select group of countries, I'll block it regardless of how big it is. I'm currently blocking about 20k CIDRs, which amounts to about 510 million IPv4 addresses. I've been doing this sort of IP blocking on my mail server for years, and there I'm blocking something like 93% of all "in-use" IPv4 addresses from connecting; I've gotten very few direct-to-MX spams for the past few years now.
7:15 pm on Aug 13, 2019 (gmt 0)

Senior Member (joined: July 26, 2006; posts: 1650; votes: 3)


I've just recently had some bot/hack-attempt hits from 3.208.0.0/12, this range of Amazon:

3.219.218.246 (United States)  8/13/2019 3:15:53 PM  /cdnbuster=1565706879851229301 - Moved Permanently  No Referrer  python-requests/2.22.0

3.224.211.194 (United States)  8/13/2019 10:49:54 AM  /cdnbuster=1565691116&cdnbuster=1565691116  58.57 KB  192  200 - OK  No Referrer  cdnposUserAgentcdnpos<>
 
