today's snicker

         

lucy24

3:54 pm on May 17, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Mozlila/5.0 (Linux; Android 7.0; SM-G892A Bulid/NRD90M; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/60.0.3112.107 Moblie Safari/537.36

IP: all over the map
requests: most often /.env (what the heck is that, anyway?)
headers: generally humanoid (I can tell because they tend to get 404 rather than 403)

You gotta watch out for those moblie, bulid robots; it could be unhealthy, even when not treated with Mozlila. Is this the robotic equivalent of the spam emails that are intentionally littered with grammar and spelling errors?

:: wandering off to decide if it's even worth blocking ^Mozlila if all they ask for is a single nonexistent file ::

jmccormac

4:17 pm on May 17, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Does it also request /.github ? Had to deal with a fairly diverse botnet over the past week and noticed some odd requests in the logs.

Regards...jmcc

lucy24

5:47 pm on May 17, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Does it also request /.github ?
Surprisingly, no. I do find the element "github" in logs, but only as part of the UA string. Checking for other blocked or nonexistent dot-files, I do find the occasional /.gitignore ... which sounds like a “took the words right out of my mouth” filename. But that seems to be from unrelated robots.

tangor

11:28 pm on May 17, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Does it also request /.github ?


On my site, yes.

phranque

1:08 am on May 18, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



/.env (what the heck is that, anyway?)

dotenv [pypi.org]
Reads key-value pairs from a .env file and adds them to environment variables. It is great for managing app settings during development and in production using 12-factor principles.
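For the curious, a stdlib-only Python sketch of roughly what dotenv does (the real package also handles quoting, `export` prefixes, and more); the file contents and key names here are made up. It also shows why a leaked .env is a prize: it typically holds credentials.

```python
import os
import tempfile

def load_env(path):
    """Roughly what dotenv does: read KEY=value lines into os.environ."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue                      # skip blanks and comments
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# Demo with a throwaway file; the keys and values are hypothetical.
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as fh:
    fh.write("# db settings\nDB_PASSWORD=s3cret\nAPI_KEY=abc123\n")
    demo_path = fh.name
load_env(demo_path)
```

Anything a site serves at /.env is exactly this kind of file, which is why the robots keep asking for it.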

lucy24

2:37 am on May 18, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



great for managing app settings during development and in production
Presumably, then, also great for providing some type of information that is useful to some type of malign robot. (I have yet to meet a good and useful robot that misspells three separate words in its UA string, though I suppose they might exist.)

wilderness

4:51 am on May 18, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



lucy,
early Dec from MS Hosting. Blocked.
Had some requests from other IPs afterwards. Don't even pay attention to small amounts any more.

jmccormac

5:09 am on May 18, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



So block on detection is the best strategy?

Regards...jmcc

wilderness

5:17 am on May 18, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Whatever works for you is best.
Each webmaster must decide what is beneficial or detrimental to their own site(s).

jmccormac

5:21 am on May 18, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Spent the last week dealing with a botnet trying to download millions of webpages. It used a combination of hoster and ISP IP addresses. Have been tailing the logs and might be a bit paranoid about iffy requests. :) Any leakage from a stealth bot could be useful.

Regards...jmcc

lucy24

3:22 pm on May 18, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It used a combination of hoster and ISP IP addresses.
That's why I shifted to primarily header-based blocking, along with unwanted UAs. Over the years I've had to add some IP ranges--most of them, big surprise, Hetzner or OVH--but headers are the first line of defense. I also maintain a short list of down-to-the-last-digit IPs from human ISP ranges that I assume represent infected human machines; these I check every few months and remove as they become inactive.
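A hypothetical sketch of what header-based first-line screening can look like; these are not lucy24's actual rules, and the specific tells checked here are assumptions.

```python
def looks_robotic(headers):
    """Flag requests whose headers lack things real browsers always send.
    Hypothetical rules, not lucy24's actual ruleset."""
    h = {k.lower(): v for k, v in headers.items()}
    if "accept" not in h or "accept-language" not in h:
        return True          # real browsers send both on every page request
    if "mozlila" in h.get("user-agent", "").lower():
        return True          # the misspelled UA from this thread
    return False
```

The appeal over IP blocking is that a botnet can rotate addresses endlessly, but a scripted client tends to send the same (incomplete or oddball) headers from every one of them.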

Like so many things, robots are a moving target. Remember a few years ago when anything starting in "Mozilla" (correctly spelled) could safely be assumed to be human?

jmccormac

3:55 pm on May 18, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



One botnet was burning through about 2K ISP addresses a day. Some of the webscraper-as-a-service sites I looked at were advertising tens of millions of ISP proxies. I'm not sure if the ISP boxes were compromised or just running some iffy app/browser add-on. A time-to-live value for each ISP IP sounds like a good idea.
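The TTL idea might be sketched like so; this is a hypothetical in-memory version, and a real one would persist its state and feed iptables or the server config.

```python
import time

class ExpiringBlocklist:
    """Hypothetical sketch: block an IP for ttl seconds, then let it lapse."""

    def __init__(self):
        self._until = {}                     # ip -> unix time block expires

    def block(self, ip, ttl, now=None):
        now = time.time() if now is None else now
        self._until[ip] = now + ttl

    def is_blocked(self, ip, now=None):
        now = time.time() if now is None else now
        expiry = self._until.get(ip)
        if expiry is None:
            return False
        if now >= expiry:
            del self._until[ip]              # lazy cleanup once expired
            return False
        return True
```

This way an infected home machine that gets cleaned up (or reassigned by DHCP) isn't blocked forever, which matters when the addresses belong to human ISP ranges.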

In terms of hosters, Vultr, Digital Ocean and M247 were far ahead of Hetzner. Dealing with the hosters is a lot easier (using IncrediBill's very useful PHP script to check the IP against a table of known hoster ranges is one way of doing it). The only issue is that the hoster ranges may have some good bots or exits for corporate proxies. Even with the "aggregate" program on Linux, some of the hoster ranges end up being quite large. Most of the hoster IP ranges will never hit a site, so a block-on-detection approach might be better unless the hoster has a history of problematic activity.
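The kind of lookup IncrediBill's PHP script does can be sketched in stdlib Python; the ranges below are placeholder documentation prefixes (TEST-NET), not real hoster allocations, which you'd load from a maintained table.

```python
import ipaddress

# Placeholder ranges standing in for real hoster allocations.
HOSTER_RANGES = [ipaddress.ip_network(n) for n in (
    "203.0.113.0/24",
    "198.51.100.0/24",
)]

def is_hoster_ip(ip):
    """Return True if ip falls inside any known hoster range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in HOSTER_RANGES)
```

A linear scan is fine for a few hundred ranges; past that, sorting the networks and bisecting on the start address keeps per-request cost down.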

Regards...jmcc

lucy24

4:08 pm on May 18, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



the hoster ranges may have some good bots or exits for corporate proxies
That's why I'm reduced to two different IP-based lockouts. One goes

SetEnvIf Remote_Addr blahblah bad_range
BrowserMatch goodrobot !bad_range

leading to

Require env bad_range

while the other proceeds directly to

Require ip blahblah

I think the "Require ip" version is fractionally more server-efficient, but sometimes you do need to poke those holes.

iamlost

11:16 pm on May 18, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I first saw it in mid December, and it has since come in waves (sent straight to null, with the IP blocked for 1, 12, 24, or 72 hours) every couple of weeks, the last in the first week of May. Quite a stupid if persistent botty mcbotface of a bot, coming from all over; it definitely has access to a significant botnet.

When I went looking back before Christmas, I found the exact identifying spelling errors in several RCE (remote code execution) ‘test’ scripts, the earliest being a 2016 [Util/PHP/eval-stdin.php] PHPUnit remote attack, patched in PHPUnit 4.8.28/5.6.3; see CVE-2017-9841, rated 9.8 critical. Later variants targeted Ruby on Rails, Apache Struts, etc. I believe the initial script has been copy-pasted and modified by the same or other crackers.
  
import requests  # the script uses the Python requests library

url = "http://example.com"  # target base; defined elsewhere in the full script

try:
    eNv = "{}/.env".format(url)
    headers = {
        'Connection': 'keep-alive',
        'Cache-Control': 'max-age=0',
        'Upgrade-Insecure-Requests': '1',
        'User-Agent': 'Mozlila/5.0 (Linux; Android 7.0; SM-G892A Bulid/NRD90M; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/60.0.3112.107 Moblie Safari/537.36',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
        'Accept-Encoding': 'gzip, deflate',
        'Accept-Language': 'en-US,en;q=0.9,fr;q=0.8',
    }
    rsmTP = requests.get(eNv, headers=headers, allow_redirects=True, timeout=50)
except Exception:
    pass  # handler not shown in the excerpt

After reading this thread I did a search and found that the attack apparently resurfaced starting in November 2019: The Resurrection of PHPUnit RCE Vulnerability [imperva.com].

The code targets multiple vectors for both Linux and Windows. Crypto-mining used to be the raison d’être, not sure what this revival is in aid of.

jmccormac

12:44 pm on May 19, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



A combination of iptables and httpd.conf seems to be a good way of dealing with the IP problem. There is a useful option in iptables that can block on keywords/phrases.

Regards...jmcc

lucy24

4:39 pm on May 19, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



headers = {
This sent me to logged headers, which I hadn't looked at earlier. Yup, there they all are, claiming to speak French. On my site the Connection header comes through as
Connection: close
but I think this has to do with the server, since it's present on all requests without exception. Everything else seems to be as scripted--right down to the DNT header, which at one time was fairly diagnostic of humans but is now popular with robots too, darn it.
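Since the script's headers are fixed, they can double as a fingerprint. A hypothetical matcher (the header values are taken from the script quoted earlier in the thread; the function name is made up):

```python
# The exact Accept-Language string from the script -- the "claims to
# speak French" giveaway.
SCRIPT_LANG = "en-US,en;q=0.9,fr;q=0.8"

def matches_mozlila_fingerprint(headers):
    """True when a request's headers match the script's fixed values."""
    h = {k.lower(): v for k, v in headers.items()}
    return ("Mozlila/" in h.get("user-agent", "")
            and h.get("accept-language") == SCRIPT_LANG)
```

Matching on the header combination rather than the UA alone keeps the rule working even if a later copy of the script fixes one of the misspellings.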

jmccormac

4:42 pm on May 19, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The language combination sounds a bit French Canadian. OVH.ca?

Regards...jmcc