More X11 / Ubuntu old-Firefox Activity

         

Bubalo

11:52 am on Jul 8, 2023 (gmt 0)

Top Contributors Of The Month




Hi all.

I am a new member and this is my first post.

I am "web mastering" (a steep learning curve for me) my own personal web site that features some of my artworks and photographs.

I visited Webmaster World a few times before joining, looking for helpful guidance, particularly about abuse from the User Agent Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0.

I am pretty sure from its behavior on my site (but I could be wrong) that a scraper (possibly human, but probably an unnamed/unknown bot) is behind this U/A. It ignores my robots.txt block request. It switches IP's frequently - most of them show up on the AbuseIPDB website as known dodgy IP's - but some of the IP's it uses are alarming - the latest being the French Atomic Energy Agency! - along with a lot of Universities/Schools, Cloud Proxies, and Amazon AWS.

The reason I think this is a scraper is from log reports - here is an example:

TIMESTAMP REMOTE HOST METHOD HTTPv RESPONSE BYTES TIME USER AGENT
8 Jul 2023, \ 01:34:47\ 104.219.213.35\ GET\ 1.1\ 200\ 162,241\ 425\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
8 Jul 2023, \ 01:31:58\ 44.229.15.165\ GET\ 1.1\ 403\ 16,369\ 0\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
8 Jul 2023, \ 01:30:28\ 44.229.15.165\ GET\ 1.1\ 403\ 16,369\ 0\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0

When it encounters a 403 Response Code (my IP block) it switches to a new IP, and when it gets a 200 Response Code it then takes anything from a few hundred to over two thousand ms (Time Taken) to GET what it wants. It usually does this in blocks of 3 attempts, and makes maybe only 5 or 6 attempts in one 24 hour period before moving on to a different target on my website. It seems to be concentrating on GETting individual images (.jpeg), of which there are hundreds on the site. There are only 13 different HTML pages on the site. There is no advertising on the site and it is NOT a commercial site, as nothing is offered for sale.

Of course I wondered at first whether this U/A was legitimate soon after it appeared a few months ago, so when I noticed this Forum message about the botnet coming back I became more suspicious. As I blocked the IP's, it just seemed to switch to new IP's as fast as I blocked them.

I also noticed many of the IP's were associated with China, North Korea, Hong Kong but as I blocked these - the IP's switching went worldwide - USA, UK, etc. So I tried blocking the countries China and HK - and then there was a marked increase in the U/A string using international IP's.

So far I have blocked probably a hundred different IP's and incidences now seem to be slowing down - most now come out of the USA.

I have not used the .htaccess file to attempt a block, as I am pretty sure the X11 U/A will ignore that too.

A few days ago I decided as an experiment to lift the country block for China and -- I got over 30 hits from X11 in 24 hours - so I blocked China again. I don't get any audience traffic from China - other than hosting companies like 10 cent - so I thought no great loss of traffic and so worth a shot to see what happened.

So. X11 seems to originate from China but what is behind it?

I notice on GitHub A LOT of people learning or using scraping use the X11 user agent string - and there is advice there for them to switch it often to another UA !

Legitimate traffic to my site does not seem to be down much, and it usually fluctuates up and down anyway - but I do fear the X11 trouble could get much worse, as others posting on WM World have indicated has occurred on their web sites. I don't want this to happen to me. My host does not have an anti-scrape tool yet, and *loudflare has other problems I don't want to touch.

I thought to post my experience here and welcome all comments and suggestions from you guys who are more experienced.
Thanks.




[edited by: not2easy at 1:55 pm (utc) on Jul 8, 2023]
[edit reason] split thread cleanup [/edit]

SumGuy

2:04 am on Aug 10, 2023 (gmt 0)

5+ Year Member Top Contributors Of The Month



> this triggered much more activity against my site with X11 using hundreds of different
> IP addresses from all over the world. Sometimes the X11 appearances in my logs
> looked like DDOS attacks. Large blocks of IP's. Some of the countries it was coming
> out of were alarming too, like North Korea ! Some of the IP's too like French Atomic
> Energy Commission.

Your web site (or more technically, your web server) is being used to hassle other end points. Because you issue 403's or other error responses based on well-known rules (rules against UA's that stick out like a sore thumb), malicious actors are forging requests to your server using IP's that they want to DOS. I too have seen attempted web hits from various governmental and institutional IP's in the EU / Ukraine / Russia over the past year and I know that, for example, the parliament of Ukraine is likely not actually trying to surf my website.

I think it's pointless to respond with 4XX to http requests that your server is blocking based on rules. Maybe once upon a time the response made sense, but today I think it's likely the response is going to an innocent and uninvolved endpoint, and you are bogging down their internet connectivity. Instead of issuing a response, just drop the connection. Be silent about it. When you broadcast a 4XX it gets picked up that your server is available to perform DDOS against someone.

lucy24

5:39 am on Aug 10, 2023 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



> I think its pointless to respond with 4XX

What's the alternative? It sounds as if OP is on shared hosting, which limits the options. An individual site can choose what response to send, but you can't choose whether to respond at all.

Interesting that we're now thinking about both forms of DDoS attack: the ones targeting the recipient of a request, and the ones targeting the putative sender of the request.

Bubalo

9:42 am on Aug 10, 2023 (gmt 0)

Top Contributors Of The Month



Thanks for the replies guys. Tangor: There have been hundreds of different IP's used over the last few weeks since I started monitoring and began this post. A few days ago I had a quick look at my logs going back about six months, and there is evidence there of other X11 IP addresses being used, but I have not wasted any more time looking back and counting each different IP. (Thousands) - my parenthesis - is no exaggeration, but my point is that this thing appears able to use any IP address it wants and rarely uses the same IP more than 3 times.

Lucy, Not2Easy, I am NOT blocking by IP but I am blocking as previously recommended - see above.

Tangor: I never underestimate the cunning of thieves but (I am sure we all agree) it is a pity the bad actors don't put their inventiveness to better uses. I have a "feeling" that if this is not a machine bot then it's a sad person with a problem of their own delusion. My site is as inoffensive as me but these days some take offence at everything their troubled minds don't understand. I think you guys know where I'm coming from... :-).

I mentioned above how some bad actor (in a Stack Overflow posting) had created new text and added it to the end of the original X11 string to 'make it look different and avoid the scraper block' - and this "variant" of X11 was used against my site. I blocked it. Which made me think - X11 is not a bot. X11 is also not directed solely at my site - X11 is a pretty common form of scraper which others have reported on their sites. As I mentioned above, it might be a fake FBook page or some malfunctioning software/hardware somewhere. By hiding/cloaking itself it's not playing a fair game but cheating underhand - which is suspicious.

If it is a web crawler bot why is it hiding its ID if not cloaking its malicious intent? Probably just going to such lengths to covertly hide very bad intentions. As Not2Easy mentioned above - over 40% of web traffic is bots and then most website users don't bother monitoring their logs - as we do... :-)

I am posting here just to pass on my experience of X11 and so be helpful to the community in combating such nuisances. Perhaps there are clues in my postings that might ring warning bells and help others decide how to mitigate against such bad actors. I live in hope but have still got to mow the lawn, have fun, enjoy my life and just get on with creating and posting new content - which is perhaps the best way to help heal the sad people.

SumGuy

12:26 pm on Aug 10, 2023 (gmt 0)

5+ Year Member Top Contributors Of The Month



>> I think its pointless to respond with 4XX
>
> What's the alternative? It sounds as if OP is on shared hosting, which limits the options.
> An individual site can choose what response to send, but you can't choose whether
> to respond at all.

I'm not familiar with what you can run on a hosted server; presumably you can run anything you want, and I would assume Apache for a web server? I myself haven't investigated how I would configure my software (Abyss web server running on Windows) to drop requests based on a UA that I don't want to respond to, but I would think it should be possible - no? Or is that a use-case that hasn't been contemplated by web server software authors?

> Interesting that we're now thinking about both forms of DDoS attack: the
> ones targeting the recipient of a request, and the ones targeting the
> putative sender of the request.

I think there has to be some explanation why there is persistent use of odd-ball UA's that are trivial to detect. Wouldn't you want to deploy a scraper that knows how to masquerade by using a current UA if you want results and not 4xx? Unless the alternative is - your bot is using a "hated" UA for a reason.

One thing not a lot of web server operators see is the number of attempts a given IP makes when you reject it in the router and silently drop its packets. On the SMTP side, 20 or 21 is an extremely common number of attempts for a bot to make before giving up. On the http side I frequently see 10+ attempts. How many attempts does a real human browser make to a non-responding site before it gives up? Is there a difference between FF vs Edge vs Chrome vs Safari? I don't know.

SumGuy

1:05 pm on Aug 10, 2023 (gmt 0)

5+ Year Member Top Contributors Of The Month



Does anyone ever see the X-Forwarded-For item in request headers? Is it possible to see both the source IP and the Forwarded IP (if present) in your logs? Presumably that information would tell us when IP spoofing is happening during an HTTP/HTTPS request.

I've just been looking into the question of whether an IP can be spoofed when making an HTTP request, given that we're talking about TCP. The answer seems to be no, but I've come across one comment saying "if the request can fit into a single packet, then yes". Then I stumbled across this X-Forwarded-For header item, which is a completely different animal.
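To make the difference concrete, here is a minimal Python sketch of how an X-Forwarded-For value could be inspected once it has been logged. It assumes the usual convention of a comma-separated list with the original client first and each proxy appending after it. The header is supplied by the client or by intermediate proxies, so it can contain anything, which is why each entry is validated. The function name `parse_forwarded_for` and the sample addresses (from the ranges reserved for documentation) are illustrative only.

```python
import ipaddress


def parse_forwarded_for(header_value):
    """Split an X-Forwarded-For value into (text, parsed_address) pairs.

    parsed_address is None for entries that are not valid IP addresses,
    which is a hint that the header was hand-crafted or garbled.
    """
    hops = []
    for part in header_value.split(","):
        text = part.strip()
        if not text:
            continue  # skip empty entries such as a trailing comma
        try:
            hops.append((text, ipaddress.ip_address(text)))
        except ValueError:
            hops.append((text, None))
    return hops


# Hypothetical header value: claimed client, one proxy, one junk entry.
for text, address in parse_forwarded_for("203.0.113.7, 198.51.100.2, not-an-ip"):
    print(text, "valid" if address is not None else "INVALID")
```

Whatever the header claims, the address the server actually completed the TCP connection with is the one in the normal remote-host log field, so the two are worth logging side by side rather than treating either as proof of the other.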

SumGuy

1:39 pm on Aug 10, 2023 (gmt 0)

5+ Year Member Top Contributors Of The Month



Apache web server

You can configure an Apache web server to extract the IP address from the X-Forwarded-For HTTP header and log that IP address to the web server log file by adding the appropriate logging directives to the main Apache configuration file (typically named httpd.conf) or to the relevant virtual host configuration files.

For example:

LogFormat "%v %{X-Forwarded-For}i %l %u %t \"%r\" %>s %b" X-Forwarded-For
CustomLog /var/log/apache/www.example.com-xforwarded.log X-Forwarded-For

Bubalo

2:37 pm on Aug 10, 2023 (gmt 0)

Top Contributors Of The Month



> I think its pointless to respond with 4XX to http requests that your server is blocking based on rules.
> Maybe once upon a time the response made sense, but today I think it's likely the response is going to
> an innocent and uninvolved endpoint, and you are bogging down their internet connectivity. Instead of
> issuing a response, just drop the connection. Be silent about it. When you broadcast a 4XX it gets
> picked up that your server is available to perform DDOS against someone.

Are you suggesting I lift the .htaccess blocking?

I am on a shared host - I can create .htaccess rules (these work) and robots.txt entries to block (these are mostly ignored, but if I name a particular spider/crawler, e.g. Baidu, they mostly do obey). I can also use a security blocking tool for individual IP's and IP ranges, and I can block Countries I don't want traffic from. IP/range blocking works, but bad actors either go away or switch IP providers to something else - so that just goes around in a circle. So .htaccess is the best block, but it does create a lot of 403 responses, which concerns me in case these 403's count against my site listing - which may be another way a bad actor is hassling my site - to get it demoted? The host blocks other bad actors with 404.

Interesting to note the rate of X11 hits has dropped over last few days - which made me think that it was giving up - but today I got more hits but interestingly most from USA (Oracle) using only 2 different IP's.

lucy24

5:23 pm on Aug 10, 2023 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



> I'm not familiar with what you can run on a hosted server, presumably you can run anything you want, I would assume Apache for a web server?

Gosh, SumGuy, have you always had your own server?

There’s a difference between shared hosting and a VPS. On shared hosting, you have no access at all to the configuration file, and have no control over which server software--and which modules within that version--is used. You're simply one of many sites on someone else's server. Even if you proceed to the next step, a VPS, I'm pretty sure you can't do anything equivalent to a firewall.

Bubalo

7:06 am on Aug 11, 2023 (gmt 0)

Top Contributors Of The Month



I thought it might help to show some examples of the "random" IP's - This is a much reduced list of random IP's because the IP hits are steadily reducing after the .htaccess block was put in place. It does not seem to learn it is being blocked so it is now just a persistent nuisance.

I trust this listing is not information overload, but I thought it might be of interest to note the behavior and the time stamps as it shifts to and fro between IP's.

I did try to color code or embolden the listings to make it easier to see the pattern but there seems to be no way to do this in forum postings.

11 Aug 2023, 04:19:13129.46.96.20GET1.140316,3695Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, 03:04:08129.46.96.20GET1.140316,3695Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, 03:01:07129.46.96.20GET1.140316,3695Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, 02:59:56131.107.1.191GET1.140316,3697Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, 01:55:38129.46.96.20GET1.140316,3694Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, 01:46:45129.46.96.20GET1.140316,3696Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, 01:43:57129.46.96.20GET1.140316,3694Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, 01:40:48129.46.96.20GET1.140316,3696Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, 01:38:43129.46.96.20GET1.140316,3695Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, 01:31:38129.46.96.20GET1.140316,3696Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, 01:25:01129.46.96.20GET1.140316,3696Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, 01:18:15129.46.96.20GET1.140316,3695Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, 01:12:52129.46.96.20GET1.140316,3697Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, 00:50:45129.46.96.20GET1.140316,3695Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, 00:16:33129.46.96.20GET1.140316,3695Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, 00:13:09169.228.63.97GET1.140316,3690Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, 00:12:32129.46.96.20GET1.140316,3693Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 14:24:49170.106.115.252GET1.140316,3690Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 12:25:49170.106.115.252GET1.140316,3690Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 12:24:18141.193.68.10GET1.140316,3696Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 11:35:29141.193.68.10GET1.140316,3698Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 10:39:33141.193.68.10GET1.140316,3696Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 10:38:50141.193.68.10GET1.140316,3696Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 10:35:41170.106.115.252GET1.140316,3690Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 09:58:1754.160.135.211GET1.140316,3695Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 09:57:45170.106.115.252GET1.140316,3690Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 09:48:36170.106.115.252GET1.140316,3690Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 09:43:55141.193.68.10GET1.140316,3697Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 09:43:17141.193.68.10GET1.140316,3696Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 09:22:08141.193.68.10GET1.140316,3695Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 09:13:57141.193.68.10GET1.140316,3694Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 09:09:40141.193.68.10GET1.140316,3698Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 08:45:38141.193.68.10GET1.140316,3698Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 08:36:37170.106.115.252GET1.140316,3690Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 08:30:24141.193.68.10GET1.140316,3696Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 08:03:41141.193.68.10GET1.140316,3697Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 07:19:21170.106.115.252GET1.140316,3690Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 06:51:29141.193.68.10GET1.140316,3696Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 06:40:53141.193.68.10GET1.140316,3696Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 06:28:35170.106.115.252GET1.140316,3690Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 06:21:18170.106.115.252GET1.140316,3690Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 05:58:53141.193.68.10GET1.140316,3696Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 04:33:01141.193.68.10GET1.140316,3695Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 04:27:13141.193.68.10GET1.140316,3697Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 18:57:348.219.6.6GET1.140316,3694Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 16:37:435.195.0.145GET1.140316,3696Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, 09:50:0291.201.7.254GET1.140316,3697Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0

Bubalo

7:47 am on Aug 11, 2023 (gmt 0)

Top Contributors Of The Month



The above was still not easy to read so I posted again below with column heads and \ separators to help :-)

TIMESTAMP REMOTE HOST METHOD HTTPv RESPONSE BYTES TIME USER AGENT
11 Aug 2023, \ 04:19:13\ 129.46.96.20\ GET\ 1.1\ 403\ 16,369\ 5\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, \ 03:04:08\ 129.46.96.20\ GET\ 1.1\ 403\ 16,369\ 5\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, \ 03:01:07\ 129.46.96.20\ GET\ 1.1\ 403\ 16,369\ 5\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, \ 02:59:56\ 131.107.1.191\ GET\ 1.1\ 403\ 16,369\ 7\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, \ 01:55:38\ 129.46.96.20\ GET\ 1.1\ 403\ 16,369\ 4\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, \ 01:46:45\ 129.46.96.20\ GET\ 1.1\ 403\ 16,369\ 6\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, \ 01:43:57\ 129.46.96.20\ GET\ 1.1\ 403\ 16,369\ 4\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, \ 01:40:48\ 129.46.96.20\ GET\ 1.1\ 403\ 16,369\ 6\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, \ 01:38:43\ 129.46.96.20\ GET\ 1.1\ 403\ 16,369\ 5\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, \ 01:31:38\ 129.46.96.20\ GET\ 1.1\ 403\ 16,369\ 6\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, \ 01:25:01\ 129.46.96.20\ GET\ 1.1\ 403\ 16,369\ 6\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, \ 01:18:15\ 129.46.96.20\ GET\ 1.1\ 403\ 16,369\ 5\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, \ 01:12:52\ 129.46.96.20\ GET\ 1.1\ 403\ 16,369\ 7\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, \ 00:50:45\ 129.46.96.20\ GET\ 1.1\ 403\ 16,369\ 5\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, \ 00:16:33\ 129.46.96.20\ GET\ 1.1\ 403\ 16,369\ 5\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, \ 00:13:09\ 169.228.63.97\ GET\ 1.1\ 403\ 16,369\ 0\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, \ 00:12:32\ 129.46.96.20\ GET\ 1.1\ 403\ 16,369\ 3\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 14:24:49\ 170.106.115.252\ GET\ 1.1\ 403\ 16,369\ 0\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 12:25:49\ 170.106.115.252\ GET\ 1.1\ 403\ 16,369\ 0\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 12:24:18\ 141.193.68.10\ GET\ 1.1\ 403\ 16,369\ 6\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 11:35:29\ 141.193.68.10\ GET\ 1.1\ 403\ 16,369\ 8\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 10:39:33\ 141.193.68.10\ GET\ 1.1\ 403\ 16,369\ 6\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 10:38:50\ 141.193.68.10\ GET\ 1.1\ 403\ 16,369\ 6\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 10:35:41\ 170.106.115.252\ GET\ 1.1\ 403\ 16,369\ 0\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 09:58:17\ 54.160.135.211\ GET\ 1.1\ 403\ 16,369\ 5\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 09:57:45\ 170.106.115.252\ GET\ 1.1\ 403\ 16,369\ 0\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 09:48:36\ 170.106.115.252\ GET\ 1.1\ 403\ 16,369\ 0\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 09:43:55\ 141.193.68.10\ GET\ 1.1\ 403\ 16,369\ 7\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 09:43:17\ 141.193.68.10\ GET\ 1.1\ 403\ 16,369\ 6\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 09:22:08\ 141.193.68.10\ GET\ 1.1\ 403\ 16,369\ 5\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 09:13:57\ 141.193.68.10\ GET\ 1.1\ 403\ 16,369\ 4\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 09:09:40\ 141.193.68.10\ GET\ 1.1\ 403\ 16,369\ 8\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 08:45:38\ 141.193.68.10\ GET\ 1.1\ 403\ 16,369\ 8\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 08:36:37\ 170.106.115.252\ GET\ 1.1\ 403\ 16,369\ 0\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 08:30:24\ 141.193.68.10\ GET\ 1.1\ 403\ 16,369\ 6\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 08:03:41\ 141.193.68.10\ GET\ 1.1\ 403\ 16,369\ 7\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 07:19:21\ 170.106.115.252\ GET\ 1.1\ 403\ 16,369\ 0\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 06:51:29\ 141.193.68.10\ GET\ 1.1\ 403\ 16,369\ 6\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 06:40:53\ 141.193.68.10\ GET\ 1.1\ 403\ 16,369\ 6\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 06:28:35\ 170.106.115.252\ GET\ 1.1\ 403\ 16,369\ 0\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 06:21:18\ 170.106.115.252\ GET\ 1.1\ 403\ 16,369\ 0\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 05:58:53\ 141.193.68.10\ GET\ 1.1\ 403\ 16,369\ 6\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 04:33:01\ 141.193.68.10\ GET\ 1.1\ 403\ 16,369\ 5\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 04:27:13\ 141.193.68.10\ GET\ 1.1\ 403\ 16,369\ 7\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 18:57:34\ 8.219.6.6\ GET\ 1.1\ 403\ 16,369\ 4\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 16:37:43\ 5.195.0.145\ GET\ 1.1\ 403\ 16,369\ 6\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 09:50:02\ 91.201.7.254\ GET\ 1.1\ 403\ 16,369\ 7\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0

not2easy

12:03 pm on Aug 11, 2023 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Unfortunately this looks like just a random selection of users running a scripted bot or even a downloaded sleazeware app working unknown to the IP user. Each of them would need to notice their lack of success before they might try another tactic. Not near the volume you might call a DDOS, more like gnats. I think that downsizing the 403 document could save some bandwidth. For example, mine is 515 bytes + a 98 byte image.

A couple of the IPs are server farms:
54.160.135.211 = 54.160.0.0/12 (one hit)
91.201.7.254 = 91.201.0.0/22 (one hit)

The most hits are from 141.193.68.10 (17) which is a small Arizona host - 141.193.68.0/24

SumGuy

12:32 pm on Aug 11, 2023 (gmt 0)

5+ Year Member Top Contributors Of The Month



Your list boils down to 9 unique IP's. Qualcomm, MSFT, UCSD, Tencent, Tempest Hosting, Amazon, Alibaba, Emirates Telecom, G42 Cloud.
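A rough way to reproduce that kind of tally, as a minimal Python sketch: it assumes the log rows use the backslash-separated layout posted earlier in the thread, with the remote host as the third field. The function name `hits_per_host` and the four-row sample are illustrative only, not something the host's log tool provides.

```python
from collections import Counter

# Four rows copied from the separated listing earlier in the thread.
SAMPLE_LOG = r"""
11 Aug 2023, \ 04:19:13\ 129.46.96.20\ GET\ 1.1\ 403\ 16,369\ 5\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, \ 03:04:08\ 129.46.96.20\ GET\ 1.1\ 403\ 16,369\ 5\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
11 Aug 2023, \ 02:59:56\ 131.107.1.191\ GET\ 1.1\ 403\ 16,369\ 7\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
10 Aug 2023, \ 12:24:18\ 141.193.68.10\ GET\ 1.1\ 403\ 16,369\ 6\ Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
"""


def hits_per_host(log_text):
    """Count rows per remote host in a backslash-separated log listing."""
    counts = Counter()
    for line in log_text.splitlines():
        fields = [field.strip() for field in line.split("\\")]
        if len(fields) < 9:
            continue  # blank line, header row, or a row in another layout
        counts[fields[2]] += 1  # third field is the remote host
    return counts


counts = hits_per_host(SAMPLE_LOG)
for host, hits in counts.most_common():
    print(host, hits)
print("unique hosts:", len(counts))
```

Pasting the full listing into `SAMPLE_LOG` would give the per-IP hit counts and the number of distinct remote hosts in one pass.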

Yes, you are sending out 16k for your 403's. If I wanted to DOS someone with a spoofed request, that would be useful.

And yes, I've always run my site on my server. Something about an old dog and KISS.

It doesn't seem to be too hard to block IP's on hosted servers.




[edited by: not2easy at 12:49 pm (utc) on Aug 11, 2023]
[edit reason] Please see Charter/ToS [/edit]

Bubalo

12:51 pm on Aug 11, 2023 (gmt 0)

Top Contributors Of The Month



Not2easy, I think your scenario assumptions are probably spot on. It's not one bad actor (as I feared) - but many bad actors all using the same U/A string. I hope this is the case and it's not one bad actor.

It feels very reassuring to have had so many experienced eyes look at this stuff of mine. All of your valued comments have helped illuminate what is a learning curve for me.

However, as I wrote before, this sample is a much reduced list of IP traffic hits from X11, which have dropped off since the block was put in place some weeks ago. Those large log lists appeared to me like a DDOS. Probably more like just gnaty dread.

Thank you ALL for your time and help.
Have a great weekend all.

not2easy

12:52 pm on Aug 11, 2023 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



We discuss how to block in the Apache [webmasterworld.com] forum, as mentioned earlier in this thread.

Bubalo

1:02 pm on Aug 11, 2023 (gmt 0)

Top Contributors Of The Month



SumGuy - I only just spotted your post.

It seems to me like you are suggesting I remove the 403 .htaccess block. Please confirm.

I DO quite often get what I have assumed to be hack attempts, where a single IP will make sometimes hundreds of different attempts to break in to the site configuration. I have never cross referenced any of those hack attempt IP's to see if they match any of the X11 IP's. I will do this now.

The 404 and 403 pages are created by the host. I haven't asked if I can use a smaller file size as I thought it was their prerogative.

SumGuy

1:29 pm on Aug 11, 2023 (gmt 0)

5+ Year Member Top Contributors Of The Month



> It seems to me like you are suggesting I remove the 403 .htaccess block. Please confirm.

I don't know if that's the best answer for your case. What file(s) are being requested when you issue the 403? Is it always the same file(s)? Do they exist on your site? Are they large or small files? If you removed the blocking rule, would they just get a 404 because they're requesting a file that doesn't exist?

Is there a referrer URL for these requests? Sometimes a bot likes to plant URL's in log files by using the referrer.

I would say that removing the block in the .htaccess file would be useful only if your server could instead simply not respond at all to those requests. A sort of "silent drop" with no response. There may be no way to configure your site or your .htaccess to do that.

Is URL Rewriting possible on your server? In response to a hit from certain User-Agents, perform a URL Rewrite to a null file or a very small file with no content.

Bubalo

5:47 pm on Aug 11, 2023 (gmt 0)

Top Contributors Of The Month



.htaccess blocks work. Blocking IP's against this particular U/A is a never ending task because it switches and abuses other IP's at random and there is a risk of blocking otherwise good IP's. I believe .htaccess is the way to go for me.

My site contains almost 600 different images, all low resolution and small file size, and the U/A always and only targets these images, specifically and by individual image names. Interestingly, it still targets images that were withdrawn from the site some years ago and are no longer on the host server, which tells me this U/A has been hitting me for a very long time. I just didn't notice its pattern before.

What it was doing (still trying but much reduced) was targeting ALL the site image files over a 24 - 48 hour period, but it would spread these attacks/scrapes across several different countries and use numerous different IP addresses to get the image data it wants, and then over the next 24 - 48 hour period it repeats the whole scrape exercise AGAIN. This means the site is under constant daily attack. This is what is so unusual about the U/A and made me think it's malicious.

It NEVER uses a referrer.

If I removed the blocking rule it would get access to the image files again, because it ignores the robots.txt file and spoofs different IP's, so it stops using the IP's I have blocked and finds different IP's to attack from. This is why only .htaccess works to block it - but x11 hasn't stopped attacking me even though it's being blocked 100%. It's not learning. I blocked all Python access after I noticed it would use a Python version U/A in the middle of a block of failed x11 scrapes.

I could ask the host about a silent drop or to a null/small file but I think if they granted permission I would have to write the instruction/rule myself. I don't have the expertise to do this myself and I don't think the host will allow me too much tinkering in case I do a Ralph.

I think x11 is either a scrape game for a group of like minded hacking individuals or it's a single sophisticated bot that knows how to spread its attacks over different countries and use different IP's to "hide/cloak" itself. How it manages to do this using IP's from North Korea is beyond me, as I thought that place was totally shut by Kim. Unless perhaps whoever/whatever is behind x11 is on very friendly terms with North K. and has access to their internet - does NK even have an internet? x11 does use a lot of Gov and Institutions IP's, as already noted above.

lucy24

6:15 pm on Aug 11, 2023 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



> I could ask the host about a silent drop or to a null/small file
You don't need the host's involvement for a micro-file. Assuming you've got use of mod_rewrite (and if you don't, change hosts!) simply rewrite any and all image requests from the offending UA to something like a one-pixel gif. It's definitely less work for the server than that inexplicably gigantic 403 response. (The same can be done with hotlinks, though I prefer a horribly garish graphic that will punish the offender by making their site look bad.)
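
A minimal sketch of the micro-file rewrite described above, assuming mod_rewrite is enabled and the rules sit in the site's root .htaccess. The file name /pixel.gif is a hypothetical one-pixel image that would have to be uploaded first, and the user-agent pattern is only one possible way of matching the Firefox/72.0 string from this thread.

```apache
# Sketch only: assumes mod_rewrite is available in the root .htaccess.
# /pixel.gif is a hypothetical one-pixel image, not a file that exists by default.
RewriteEngine On

# Match the user agent discussed in this thread (adjust the pattern as needed).
RewriteCond %{HTTP_USER_AGENT} Firefox/72\.0
# Leave the pixel itself alone, otherwise the rule would rewrite its own target.
RewriteCond %{REQUEST_URI} !^/pixel\.gif$
# Answer any image request from that agent with the tiny file instead.
RewriteRule \.(?:jpe?g|png|gif)$ /pixel.gif [NC,L]
```

The request then receives a 200 with a few dozen bytes rather than the full image or a large 403 page. Whether that is preferable to an outright block is a judgment call for the site owner.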

If in doubt how to do it, wander over to the Apache subforum.

Bubalo

11:02 am on Aug 13, 2023 (gmt 0)

Top Contributors Of The Month



Got it Lucy24
Thank you.

dstiles

8:59 am on Aug 15, 2023 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I would make a case in favour of X11. It is, in itself, not a bad UA parameter, though firefox 72 definitely is - I block all firefox (and other) browsers over a few years old.

One of my browsers currently has the UA...

Mozilla/5.0 (X11; Linux x86_64; rv:114.0) Gecko/20100101 Firefox/114.0 which I filched a few months ago from a (then current) Firefox browser. X11 refers to the X Window System, the display system used on Linux desktops.

Bubalo

6:19 pm on Aug 15, 2023 (gmt 0)

Top Contributors Of The Month



Very good point. I think I remember reading one of your (?) other (earlier) forum posts somewhere in WW.com about your blocking browsers that are over a few years old, as a means of control.

Please share example details here of the rule(s) you have created to block the Firefox 72 element. This would be very helpful and much appreciated.

Do you have any more information about this particular X11 U/A string you could share here please?

Is it a bot (if so, which one and from where does it originate), or are these many different IP's it's using evidence of cyber punks using the same U/A string to have a laugh scraping?

From insights I have gained recently about this X11 U/A issue, it seems clearer to me now that the whole U/A string thing has got more holes in it than a string vest and urgently needs fixing/dumping!

I have also seen this U/A debacle described on another authoritative website as a "whole convoluted mess that not even Google seems to want to get to grips with, and its own U/A for Chrome claims to be many things that are NOT true!"

I was thinking of posting these web page links (not my site) in this forum to perhaps open the discussion but I thought this would be regarded as spamming (?) and so I resisted. You guys probably know all about this U/A mess already but I am just beginning to understand and this is why I appreciate all your posted comments.

Furthermore, there are other very specific details of the X11 string (with a different browser version number) causing havoc to a particular web site and how a webmaster used a very similar few lines of code rule as discussed (above) here, to block it to save valuable content and bandwidth.

I now think most of the X11 U/A string discussed is false and if so the browser version bit might just be the bit of this dragon tail to grab.

Please appraise.



not2easy

7:09 pm on Aug 15, 2023 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



There's an OULDE thread on how to block old browser versions, though it is so old, it is mostly about old msie versions. If you're curious: [webmasterworld.com...]

(Note - this stuff is all pre-Apache 2.4)

dstiles

9:03 am on Aug 16, 2023 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



> rule(s) you have created to block the Firefox 72 element
My examples do not use htaccess, so the syntax is different. Just block, replacing x11 with something like:
firefox/\d\d?\.
That blocks all double and single numbered versions - it's well into triple numbers now.
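
For anyone who does use htaccess, the pattern above could be adapted roughly as follows. This is a sketch under assumptions, not dstiles' actual configuration: it assumes mod_rewrite is available, and it adds the [NC] flag because real Firefox user agents write "Firefox" with a capital F while the pattern is lower case.

```apache
# Sketch only: an .htaccess adaptation of the pattern firefox/\d\d?\.
# Assumes mod_rewrite is enabled for this site.
RewriteEngine On

# One or two digits followed by a dot, so Firefox/72.0 matches
# but Firefox/114.0 does not. [NC] makes the match case-insensitive.
RewriteCond %{HTTP_USER_AGENT} firefox/\d\d?\. [NC]
# Refuse the request with 403 Forbidden.
RewriteRule ^ - [F]
```

Note that this refuses every visitor whose browser reports a one- or two-digit Firefox version, not just the scraper, so genuinely old installations would be locked out as well.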

> Is it a bot
X11 refers to the X Window System used on Linux and may be used in bots and browsers alike.

> ... U/A string thing has got more holes in it than a string vest and urgently needs fixing/dumping!
That statement applies to 90% of the internet, which is a complete shambles. It was never designed for the use it's getting and was often cobbled by people with little insight or method.

> U/A debacle
There is a move to deprecate UA strings per se in favour of the new UA-CH (User-Agent Client Hints) system. Not sure how far this has got, though.

> most of the X11 U/A string discussed is false
It's probably valid for most of the string but not the part:
Mozilla/5.0 (X11; Ubuntu; Linux x86_64;

From rv onwards it's the version and other details that are probably faked.

A simple method of applying a UA string to a bot or browser is to copy one from a real browser, which is sometimes necessary even with good browsers. I use Falkon for a lot of my browsing, a great browser in many ways but one of its failings is a rubbish out-dated UA that is not accepted by some web sites (mine amongst them).

Bubalo

2:29 pm on Aug 17, 2023 (gmt 0)

Top Contributors Of The Month



Thank you dstiles - did you place your shortened block text in the robots file then?

dstiles

1:51 pm on Aug 19, 2023 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The robots.txt file is useless for most purposes. A good bot may take notice of it, but there are so many bots now that it would require an almost daily update even for bots that take notice of it. And then there are the hundreds of bots that never even look at it. What robots.txt SHOULD have been is an htaccess-like system that really blocks baddies.

I have a few things in robots.txt to encourage bing, google etc whilst discouraging things like mediabot and previews. I have about a dozen disallows for long-established bots such as m12. Other than that I have no interest in robots.txt.

I have a whitelist of UAs, valid IP ranges per good bot, checks for Accept, Language, Protocol etc which is common to my approx two dozen web sites. I include the file in each apache site definition so I only have one file to maintain. The file itself uses a variety of such things as setenvif, BrowserMatch and other apache terms. (Thanks to Lucy and others for putting me on this track a few years ago! :) )

I use htaccess almost exclusively to handle group and individual page redirections (eg for alternative/bad spellings) and for headers such as Content-Security-Policy.

Bubalo

9:22 am on Aug 24, 2023 (gmt 0)

Top Contributors Of The Month



Thanks for posting dstiles but you haven't quite answered my question...you wrote...

> My examples do not use htaccess so different syntax. Just block replacing x11 with something like:
> firefox/\d\d?\.
> That blocks all double and single numbered versions - it's well into triple numbers now.

My question was ....
> did you place your shortened block text in the robots file then?

Perhaps I should have written - if you didn't use the htaccess file to block it - where do you suggest placing something like: firefox/\d\d?\. to block it?

lucy24

4:39 pm on Aug 24, 2023 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It's starting to sound as if there is a little confusion around what htaccess is and isn't. The question is, first, is it an Apache server, and, second, what version (2.2 or 2.4)? There are a handful of directives that can only be used in the config file, not in htaccess, and in some modules there may be tiny differences in how you express the pattern-to-match. But where access control is concerned, it's all the same, either Allow/Deny or Require.
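
To illustrate the Allow/Deny versus Require difference, here is a rough sketch of the same user-agent block in both styles. It assumes mod_setenvif is available; the variable name bad_ua is made up for this example, and only one of the two access-control styles should be used on a given server.

```apache
# Sketch only: bad_ua is a hypothetical variable name chosen for illustration.
# Tag requests from the user agent in question (same directive in 2.2 and 2.4).
BrowserMatchNoCase "Firefox/72\.0" bad_ua

# Apache 2.4 style (mod_authz_core):
<RequireAll>
    Require all granted
    Require not env bad_ua
</RequireAll>

# Apache 2.2 style (mod_authz_host), shown commented out:
# Order Allow,Deny
# Allow from all
# Deny from env=bad_ua
```

In both versions the request is refused with a 403 when the variable is set, which is why knowing the Apache version matters before copying any rule.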

But at this point we are digressing from the User Agent ID theme, and this thread is getting pretty long, so it might be more useful to continue the discussion next door in the Apache subforum.

not2easy

5:13 pm on Aug 24, 2023 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I agree with lucy24 that this discussion is no longer about UA identification and has evolved to blocking techniques.

If you have more about that X11 UA, you can post that here, but .htaccess and blocking techniques are not UA topics and belong in the Apache forum. There is a link posted above on August 11 to get you to the Apache forum. It would help there to know what version of Apache your host is using.

BTW, in that Welcome post, way up at the top, there is a link to the formal "Welcome" page where you can learn how to format quotes and use other forum features. ;)

Bubalo

10:17 am on Aug 27, 2023 (gmt 0)

Top Contributors Of The Month



I understand. However, there is more about this particular X11 string, and it is evolving as I type: it has somehow managed to evade the .htaccess rules that were put in place with your collective help. The discussion then went on to the byte size of my 403 file, but dstiles made a very good point about blocking just the very old Firefox versions and not the X11 part, which was why I was asking for specific rule blocking text - which I would still like to have please.

It seems now this particular X11 user agent string has managed to bypass both my .htaccess and my robots.txt file while I was awaiting reply from dstiles.

I still think this particular x11 user agent is much more sinister and powerful than it seems!

If you close this discussion while it is still ongoing then the context will get lost.

I will get back to lucy24 re Apache versions, etc.

My host still does not know how this x11 has bypassed its system.



Bubalo

10:34 am on Aug 27, 2023 (gmt 0)

Top Contributors Of The Month



In short, Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0 is a VERY sinister and powerful bot - as I think we have determined - the U/A string itself is probably false (U/A are so easy to spoof), but the U/A elephant in the room here is that this particular U/A string is actually, right now, brute force scraping web content and has now apparently got past my host server!
This 63 message thread spans 3 pages.