
Forum Moderators: Ocean10000

Mb2345Browser/9.0

I'm pretty sure this is actually a crawler, not a browser

     
11:24 pm on Jul 31, 2019 (gmt 0)

New User

joined:July 20, 2019
posts: 13
votes: 0


This UA has been showing up with a bunch of 404's.


Mozilla/5.0(Linux;Android 5.1.1;OPPO A33 Build/LMY47V;wv) AppleWebKit/537.36(KHTML,link Gecko) Version/4.0 Chrome/42.0.2311.138 Mobile Safari/537.36 Mb2345Browser/9.0


It escapes stuff that oughtn't be escaped. For example, there are pages with paths like 'SomeScript.aspx?id=NNN' (they're all legacy redirects, actually, but the site used that scheme for a long time and still gets plenty of hits to that path). It requests 'SomeScript.aspx%252525253Fid%252525253DNNN' instead, which you'll notice is not only wrong, it's escaped five times over (%3F, %253F, %25253F, %2525253F, %252525253F)! I also see double-escaping in my logs, but quintuple is funnier.

I'm also pretty sure it's actually a crawler, because it keeps requesting different legacy URLs constantly without ever visiting the home page. Plus, it claims to be based on Chrome, but it obviously has bugs that Chrome doesn't.

According to whois, it all seems to be coming from China.
2:04 am on Aug 1, 2019 (gmt 0)

Senior Member from CA 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Nov 25, 2003
posts:1338
votes: 432


mb2345browser identifies mobile browser for Chinese web directory 2345.com.

Once upon a time 2345.com ran a notorious adware browser hijacker but the browser appears to be more legitimate :)
3:51 am on Aug 1, 2019 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15804
votes: 845


I'm also pretty sure it's actually a crawler
:: detour to raw logs ::

The fact that the only request I find is for a page without supporting files is also strongly redolent of robotitude.

:: further exploration ::

###! I forgot to update the relevant site's htaccess for /includes/ when the site went to Apache 2.4, so I can't check headers and confirm on what grounds they got blocked. But they did get blocked.
9:21 pm on Aug 1, 2019 (gmt 0)

New User

joined:July 20, 2019
posts: 13
votes: 0


mb2345browser identifies mobile browser for Chinese web directory 2345.com.


According to [handsetdetection.com...] the real 2345browser has "like Gecko" in its UA string. This one has "link Gecko".
11:04 pm on Aug 1, 2019 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15804
votes: 845


This one has "link Gecko".
LOL. My Deny list includes a whole string of misspellings, not only in the UA but in header names: “Referrer” *, “Useragent” (not to be confused with the UA string that begins with the literal text “User-Agent:”), “X-Fowarded-For” ...


* Hmph.
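
For readers on Nginx, a deny list of misspelled header names like the one described above might be sketched as follows. This is an assumption-laden approximation, not the poster's actual rules (that setup appears to be Apache, going by the htaccess mention). It relies on Nginx exposing any request header as an `$http_<name>` variable, with the name lowercased and dashes turned into underscores. The choice of 403 is also an assumption.

```nginx
# Inside the relevant server { } block.
# A non-empty value means the client sent a header with the misspelled name.

# "Referrer:" instead of the (officially misspelled) "Referer:"
if ($http_referrer != "") {
    return 403;
}

# "Useragent:" instead of "User-Agent:"
if ($http_useragent != "") {
    return 403;
}

# "X-Fowarded-For:" instead of "X-Forwarded-For:"
if ($http_x_fowarded_for != "") {
    return 403;
}
```

Genuine browsers do not send these headers, so a match is a reasonable sign of a hand-rolled client.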
8:21 pm on Aug 2, 2019 (gmt 0)

New User

joined:July 20, 2019
posts: 13
votes: 0


Welp, I added a rule blocking based on "link Gecko". Apparently, getting 403's caused it to send in the clones, and it started crawling so hard that it caused a noticeable slowdown on the site! (literally the thing I was trying to prevent... sigh)

So now I've added a block rule based on "%253F" and "%2525" (404's this time). It's still hammering me from a wide variety of UA's and IP's (enough that my IP-based rate limit doesn't kick in), but most of the requests aren't reaching my application backend. Eventually, it'll at least give up crawling the broken URLs.
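
A "link Gecko" rule of the kind described in this post might look roughly like the sketch below. The exact rule was not posted, so the directive placement and the case-sensitive match are assumptions. As the post reports, answering with 403 seemed to provoke heavier crawling, which is why the later rules return 404 instead.

```nginx
# Inside the relevant server { } block.
# Case-sensitive match (~) on the misspelled token; genuine browsers
# send "like Gecko", so this should not catch real Chrome-based clients.
if ($http_user_agent ~ "link Gecko") {
    return 403;
}
```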
1:13 am on Sept 10, 2019 (gmt 0)

New User

joined:Sept 10, 2019
posts: 1
votes: 0


I added blocks to try to return 403 errors for some of this mess, and fail2ban did absolutely nothing. I guess my logs are not to fail2ban's liking. It only made them angrier. They started flooding in ten times as many requests, a lot of them more "properly" formatted.

I have no idea how to make Nginx outright drop all requests with "PHPSESSID" in the URL. I block it with tests for $arg_PHPSESSID, and it still gets through from the idiots sending requests for "/index.php%3FPHPSESSID%3D..."
2:51 am on Sept 10, 2019 (gmt 0)

New User

joined:July 20, 2019
posts: 13
votes: 0


My Nginx has these in it. You can add "%3FPHPSESSID".


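# Reject a percent sign that has been escaped at least twice ("%2525").
if ($request_uri ~* "%2525") {
    return 404;
}
# Reject a "?" that has been escaped at least twice ("%253F").
if ($request_uri ~* "%253F") {
    return 404;
}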
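
On the PHPSESSID question: $arg_PHPSESSID only sees parameters that Nginx parsed out of the query string, and when the "?" arrives encoded as "%3F" there is no query string at all, just a longer path. Matching against $request_uri, which holds the original undecoded URI, catches both forms. A possible sketch, assuming it goes in the relevant server { } block and that dropping the connection is acceptable for every URL containing that text:

```nginx
# $request_uri is the raw request URI including arguments, so the literal
# text PHPSESSID is present whether the "?" arrived as "?" or as "%3F".
if ($request_uri ~* "PHPSESSID") {
    # 444 is an Nginx-specific code: close the connection
    # without sending any response to the client.
    return 444;
}
```

Using 444 rather than 403 or 404 means the client gets no status code at all, which is about as close to "outright drop" as Nginx offers on its own.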
 
