=Mozilla/5.0

Pfui

5:22 pm on Mar 27, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



1.) I've blocked the following UA since at least 2012 because of the leading equal sign:

=Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16

Before this year, hits were near-daily from IPs belonging to:

Grand Web Solutions, Inc.
205.237.88.0 - 205.237.95.255
205.237.88.0/21

The hits were widespread. Project Honey Pot reports for neighboring IPs show only the fake =Mozilla UA, and the spread covers over 20 IPs:

205.237.88.137 [projecthoneypot.org...]
205.237.88.158 [projecthoneypot.org...]

Never a real person; never a hint of who, why or what-for.

Conclusion? Someone was systematically doing something.

2.) This year, here we go again. Exact same UA, same near-daily hits, but a different set of IPs, now belonging to:

Intelligence Network Online Inc. (Intnet)
198.252.32.0 - 198.252.63.255
198.252.32.0/19
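For anyone checking the range-to-CIDR figures quoted in this post, here is a minimal sketch using Python's standard `ipaddress` module. The helper name `range_to_cidrs` is made up for illustration; only the addresses come from the post.

```python
import ipaddress

def range_to_cidrs(first, last):
    """Collapse an inclusive IPv4 range into the smallest list of CIDR blocks."""
    start = ipaddress.ip_address(first)
    end = ipaddress.ip_address(last)
    return [str(net) for net in ipaddress.summarize_address_range(start, end)]

print(range_to_cidrs("205.237.88.0", "205.237.95.255"))  # ['205.237.88.0/21']
print(range_to_cidrs("198.252.32.0", "198.252.63.255"))  # ['198.252.32.0/19']

# Membership test for a single address inside the /19
net = ipaddress.ip_network("198.252.32.0/19")
print(ipaddress.ip_address("198.252.44.5") in net)  # True
```

A range that is not aligned on a power-of-two boundary would come back as several CIDR blocks rather than one.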

And now the hits are wider-spread. Project Honey Pot reports again for neighboring IPs show just the fake =Mozilla UA, and this time the spread covers over 200 IPs:

198.252.44.5 [projecthoneypot.org...]
198.252.44.218 [projecthoneypot.org...]

Conclusion? Someone's systematically doing something. Still. (imho)

Other sightings?

lucy24

9:06 pm on Mar 27, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



because of the leading equal sign

I once met one beginning in "User-Agent: " which is even better. You have to assume someone was copying and pasting from the Intro To Botrunning manual and misunderstood one step. I too have had the UA
^\=
blocked for ages (not sure if the \ is necessary but it can do no harm). But I just pulled up raw logs and yes indeed, there they are, flurry of recent activity:
198.252.44.10 - - [26/Mar/2015:01:01:37 -0700] "GET / HTTP/1.0" 403 2809 "-" "=Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16" 
198.252.44.136 - - [26/Mar/2015:01:01:38 -0700] "GET / HTTP/1.0" 403 2809 "-" "=Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16"
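A rough sketch of how one might pull these hits out of a raw log, assuming the Apache combined log format shown above. The regex, the function name `leading_equals_hits`, and the second sample line (a documentation-range IP with an ordinary UA) are illustrative assumptions, not anything from a poster's actual setup.

```python
import re

# Combined Log Format:
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<ua>[^"]*)"'
)

def leading_equals_hits(lines):
    """Return (ip, user_agent) pairs whose user-agent begins with '='."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group("ua").startswith("="):
            hits.append((m.group("ip"), m.group("ua")))
    return hits

sample = [
    # First line is copied from the log excerpt above.
    '198.252.44.10 - - [26/Mar/2015:01:01:37 -0700] "GET / HTTP/1.0" 403 2809 "-" '
    '"=Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.16 '
    '(KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16"',
    # Second line is invented for contrast: an ordinary UA that should not match.
    '192.0.2.1 - - [26/Mar/2015:01:02:00 -0700] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64)"',
]

for ip, ua in leading_equals_hits(sample):
    print(ip, ua[:12])
```

The `startswith("=")` check is the plain-string equivalent of anchoring a pattern at the start of the UA.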


:: further pawing through logs ::

198.252.44.abc (.9-14, .130-250)
205.237.88.abc (.143-146)
208.93.7.abc (.118, .122, only a few of these)

173.225.110.201
Uh-oh, that's got to be a botnet. There was only one of it-- the others have come over and over since May 2014, all from blocked ranges-- but it was just seconds away from a 198.252.etcetera.

:: detour to investigate ::

Free lookup is being coy, but I think it's
173.225.96.0/20
Outfit called Netwide, which doesn't sound 100% human. (There are server farms and colos in the immediate neighborhood, all in /20 pieces.)

keyplyr

9:33 pm on Mar 27, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



FYI - netwide.org is a BB ISP, but one of its products is web hosting (like every other ISP)

wilderness

12:19 am on Mar 28, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



FYI - netwide.org is a BB ISP, but one of its products is web hosting (like every other ISP)


If the provider chooses to mix their service provider customers with their hosting customers (not taking the steps to register the ranges separately), then they deserve everything they get.

Course everybody realizes I'm narrow-minded :)

keyplyr

12:31 am on Mar 28, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



...then they deserve everything they get.

Ever heard of biting the hand that feeds you?

dupres01

6:42 pm on Apr 11, 2015 (gmt 0)

10+ Year Member



How does one deal with ranges such as 207.90.0.0 - 207.90.63.255 [Intelligence Network Online Inc.] where the provider chooses to mix their service provider customers with their hosting customers? Banning the whole range doesn't seem right (especially when the range is in my home country), but what happens when the bot gets wise and drops the leading =Mozilla and adopts a proper UA?

lucy24

7:54 pm on Apr 11, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Sometimes there's not a ### thing you can do beyond blocking by UA.
what happens when the bot gets wise and drops the leading =Mozilla and adopts a proper UA?

In the case of an infected human browser, it may simply send its own genuine UA in the first place. This obviously can't be blocked.

If a particular range has become extremely vexatious, but you don't want to categorically slam the door on possible humans, you can do convoluted things like first sending them to a "Sorry, I'm not sure you're human" page, at which point you set a cookie and let them go on their way. I've yet to meet a malign robot that followed up on this kind of intermediate step. That said, an infected browser is obviously perfectly capable of using cookies, and it can't be that hard to write a robot script to send them along for verisimilitude.
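The cookie detour described above could be sketched roughly as follows. This is only the decision logic, not a server configuration, and every name in it (`SUSPECT_RANGES`, `COOKIE_NAME`, `decide`) is hypothetical. The /18 is borrowed from elsewhere in this thread purely as an example range.

```python
import ipaddress

# Hypothetical configuration: ranges that get the detour, and the cookie
# the "not sure you're human" page would set before letting the visitor on.
SUSPECT_RANGES = [ipaddress.ip_network("207.90.0.0/18")]
COOKIE_NAME = "seen_interstitial"

def decide(remote_ip, cookies):
    """Return 'allow' or 'interstitial' for one request.

    cookies is a plain dict of cookie name -> value sent by the client.
    """
    addr = ipaddress.ip_address(remote_ip)
    in_suspect_range = any(addr in net for net in SUSPECT_RANGES)
    if not in_suspect_range:
        return "allow"
    if cookies.get(COOKIE_NAME) == "1":
        # Visitor already passed through the detour page once.
        return "allow"
    return "interstitial"

print(decide("207.90.2.7", {}))                          # interstitial
print(decide("207.90.2.7", {"seen_interstitial": "1"}))  # allow
```

A fixed cookie value like this is trivially forgeable, which is exactly the weakness noted above: it only filters robots that never bother with the intermediate step.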

wilderness may have other remedies, since he has a tightly restricted but highly motivated user pool (the kind that will send in frantic emails "Help, help, I can't get into the site!" where most people just give up and go away).

wilderness

8:32 pm on Apr 11, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



wilderness may have other remedies, since he has a tightly restricted but highly motivated user pool (the kind that will send in frantic emails "Help, help, I can't get into the site!" where most people just give up and go away).


My custom 403 (contains a link to an image of an email address in the CONTACT; all visitors are allowed access to the contact image, same method as robots.txt):
You have been prevented from viewing this site because the web server has specifically denied you access to it.

You may have been prevented from viewing this site because your web browser details or internet address look very similar to those of known or suspected email address harvesters or malicious website 'crawlers'.

In the event that you feel this denial is unreasonable and would like a possible solution, please
Contact Us
</end of 403>

It doesn't offer a solution for all, as some legitimate visitors wouldn't use a CONTACT link (on every web page) if it were a matter of life and death.
For the aforementioned visitors, you just suck it up and let them fall by the wayside.
My widgets are quite different than most, and people are actually referred to me (via email) from beyond the boundaries of my websites.
There are some legitimate servers (North American) that I simply will not allow access (Embarq is one). However, I even have longtime acquaintances who are stuck with Embarq, and the only solution I'm able to offer is that they create a custom UA, for which I make an exception to the Embarq range block (some web novices are unable to comprehend modifying their UA).
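The "blocked range with a custom-UA exception" idea might look something like this as bare logic. The range below is a documentation placeholder, not Embarq's real allocation, and the UA string and names (`BLOCKED_RANGE`, `ALLOWED_CUSTOM_UAS`, `is_denied`) are invented for the sketch.

```python
import ipaddress

# Placeholder values only: 203.0.113.0/24 is a documentation range,
# and the custom UA is whatever string was agreed on privately.
BLOCKED_RANGE = ipaddress.ip_network("203.0.113.0/24")
ALLOWED_CUSTOM_UAS = {"ExampleFriend/1.0"}

def is_denied(remote_ip, user_agent):
    """Deny anyone in the blocked range unless they send an agreed custom UA."""
    if ipaddress.ip_address(remote_ip) not in BLOCKED_RANGE:
        return False
    return user_agent not in ALLOWED_CUSTOM_UAS

print(is_denied("203.0.113.5", "Mozilla/5.0"))        # True
print(is_denied("203.0.113.5", "ExampleFriend/1.0"))  # False
```

The exception only holds up as long as the custom UA stays private, since anyone in the range who learns the string gets through.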

Pfui

9:07 pm on Apr 11, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Coincidentally -- or maybe not so much -- Intelligence Network Online (intellnetwork.net) just dropped by:

207.90.2.7
=Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16

They were blocked by UA, and in the future, will encounter a full CIDR denial (207.90.0.0/18) where the probably rare-as-hens'-teeth real person will see something akin to the kind of custom 403 wilderness describes.

lucy24

10:04 pm on Apr 11, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You may have been prevented from viewing this site because your web browser details or internet address look very similar to those of known or suspected email address harvesters or malicious website 'crawlers'.

In the event that you feel this denial is unreasonable and would like a possible solution

Hee. The equivalent part of my 403 page says
If you clicked a link on this page and got bounced right back here, it means the server thinks you are a robot. If you’ve got another browser, try it. If that doesn’t help ...

or (environmentally conditioned redirect page)
You’ve accidentally replicated the behavior of an undesirable robot, so we have to take this brief detour.
{ links here }
If you land on this page a lot, ...

Clearly we have different target audiences.
some legitimate visitors wouldn't use a CONTACT link (on every web page) if it were a matter of life and death

I'm always puzzled by the number of humans who go as far as the Contact page, and then don't do anything about it-- not even sending a direct email, which is presented as one option. If they didn't want to contact the site in some way, what were they there for?

keyplyr

10:22 pm on Apr 11, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



My custom 403 error page looks something like this. Behind the scenes I gather some info using a couple of PHP scripts.

Forbidden
Permission to access this server is denied.

Possible Reasons:
* You are in violation of copyright.
* You are hiding your browser or user agent.
* You are using a tool or method not allowed.
* Your host/ISP has been banned for bad behavior.


Your IP Address: 12.345.67.89 has been logged.

Just a FYI - I would never include a link to any file on my server from a 403 error page.

wilderness

10:49 pm on Apr 11, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Just a FYI - I would never include a link to any file on my server from a 403 error page.


keyplyr,
It has offered, and continues to offer, an effective solution for my overbearing restrictions.
It's simply a 2k GIF.

As with every other method, we do what's most beneficial or detrimental to our own sites.

keyplyr

1:21 am on Apr 12, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Well you've determined this agent should be blocked from your files, then you give them a link right into your files. What's wrong with this picture?

wilderness

5:17 am on Apr 12, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Well you've determined this agent should be blocked from your files, then you give them a link right into your files. What's wrong with this picture?


Gross exaggeration.
The visitor is denied access to the websites, but is allowed access to 403's, 404's, 410's, robots.txt and this contact.GIF.

It's simply another file added to the same method you use.
So what's your beef!
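As a sketch of the arrangement described in this post, a blocked visitor still gets a short list of always-served files and a 403 for everything else. The paths below are hypothetical, since the thread names the kinds of files but not their URLs, and `status_for` is an illustrative name.

```python
# Hypothetical paths for the error pages, robots.txt and the contact image.
ALWAYS_ALLOWED = {
    "/robots.txt",
    "/403.html",
    "/404.html",
    "/410.html",
    "/contact.gif",
}

def status_for(path, visitor_is_blocked):
    """Return the status a visitor would get, assuming the path exists."""
    if visitor_is_blocked and path not in ALWAYS_ALLOWED:
        return 403
    return 200

print(status_for("/contact.gif", True))        # 200
print(status_for("/widgets/index.html", True)) # 403
```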

keyplyr

6:29 am on Apr 12, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I don't show them where I keep the goods :)

lucy24

7:12 am on Apr 12, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Who cares if they see the 403 page, or the Contact page, or the Legal page, or for that matter if they know that other pages exist (which they can easily find out by other means)? What matters is that they're not getting into the page with fifty linked images, or the one whose HTML alone is the better part of a megabyte, or the one bearing my own cherished words that I don't want someone else swiping, or ...

I think a lot of people forget what a 403 means to a human. For my first ten-plus years on the internet, I perceived 404 vs. 403 as purely "no page" vs. "no directory index". Literally: that was the only time I ever saw a 403 page. Ordinary humans don't know from scrapers and malign Ukrainians. They just know exploration and wandering around URLs and oh, oops, I guess I'm not supposed to be here, but where the heck in the path
example.edu/dir1/dir2/~dir3/dir4/faculty/dir5/dir6/personal/dir7/dir8/123456.html
do I find the next page? (keyplyr, I believe you've got some connection to academia. You know I'm not making this up.)

keyplyr

8:55 am on Apr 12, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Lucy, I agree some edu sites were built poorly in the past (many remaining this way), with pages seemingly added layer upon layer without oversight; however, most major schools now employ web dev staff. Too bad the library systems choose to remain in the Netscape yesteryear.

Expanding on my earlier opinion about 403s... mine is lean and to the point. It is built for unwanted robots and malicious tools. I capture lots of data behind the scenes; however, without revealing security strategies, I simply do not give away paths for future exploits.

Now if in fact the offender is human, I feel I give them just enough information to understand why they arrived where they did. More than that would be futile since it is they that are at fault (this is in consideration that I serve a very informative custom 404 page complete with full site navigation and site search utility which effectively keeps misguided users on my site.)