Here's an example of the browser UA it's been using recently:
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; WOW64; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648; .NET CLR 3.5.21022)
It did not request robots.txt, just regular HTML pages. It got 403-kicked-to-the-curb because, although it spoofed the User-Agent, it didn't get any of the other MSIE 7 request headers right.
Bad form, MSN.
Jim
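For anyone wanting to try the same kind of check, here is a rough sketch of the idea: if the User-Agent claims MSIE but the request lacks headers a real browser would normally send, treat it as spoofed. The header list (`ASSUMED_BROWSER_HEADERS`) and the function name `looks_spoofed` are my own assumptions for illustration, not a verified MSIE 7 fingerprint and not necessarily the rule set used above.

```python
import re

# Assumption: headers a genuine browser request is expected to carry.
# This is an illustrative list, not a verified MSIE 7 fingerprint.
ASSUMED_BROWSER_HEADERS = ("accept", "accept-language", "accept-encoding")

# Matches the "MSIE 7.0" style token inside a User-Agent string.
MSIE_UA = re.compile(r"\bMSIE \d+\.\d+")


def looks_spoofed(headers):
    """Return True when the UA claims MSIE but assumed browser headers are missing."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    ua = normalized.get("user-agent", "")
    if not MSIE_UA.search(ua):
        return False  # Not claiming to be MSIE; this check does not apply.
    # Any missing or empty expected header counts as a mismatch.
    return any(not normalized.get(h) for h in ASSUMED_BROWSER_HEADERS)
```

A stricter version could also compare header order or values, but that needs captured traffic from the real browser to be trustworthy.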
These hits do not always carry a referer although at least some of the hit groups begin with the common (and $%& annoying) live test referer q=#*$!.
It is hitting one site very badly, by which I mean clumsily: it requests the associated frames on every call, which a browser wouldn't normally do but a bot might IF it were trying to pass as a browser.
Cannot comment on robots.txt as these results are from "bad behaviour" security logs.
They are getting through because I'm lenient - some of my clients' customers come in with similar bad headers via proxies, privacy software etc. and clients get upset if you block their customers. :)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322)
IP: 65.55.107.nnn 65.55.109.0-65.55.110.255
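If you want to test log entries against the explicit range above, a minimal sketch using Python's standard `ipaddress` module might look like this. It covers only 65.55.109.0-65.55.110.255; the 65.55.107.nnn entry is left out because its last octet is not given. The names `WATCHED_NETWORKS` and `in_watched_range` are made up for the example.

```python
import ipaddress

# Only the explicit range from the post; 65.55.107.nnn is omitted
# because its final octet is elided.
_first = ipaddress.ip_address("65.55.109.0")
_last = ipaddress.ip_address("65.55.110.255")

# Convert the start-end range into the smallest set of CIDR networks.
WATCHED_NETWORKS = list(ipaddress.summarize_address_range(_first, _last))


def in_watched_range(ip_text):
    """Return True if the address falls inside the watched range."""
    ip = ipaddress.ip_address(ip_text)
    return any(ip in net for net in WATCHED_NETWORKS)
```

Note that the range does not collapse to a single /23, because 65.55.109.0 is not aligned on a /23 boundary, so it ends up as two /24 networks.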
As far as I'm concerned LWP is permanently banned. At the rate MS are introducing new UAs on new IP ranges, albeit with rDNS claiming (incorrectly) that it's msnbot, they are also heading to be banned.
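On the rDNS point, one common way to test the claim is a forward-confirmed reverse DNS lookup: resolve the IP to a hostname, check the domain, then resolve that hostname back and see whether the original IP is among the results. The sketch below assumes the crawler's hostnames end in `.search.msn.com`; that suffix, and the function name `is_verified_crawler`, are my assumptions, so check the operator's own documentation before relying on it. It only tells you who owns the address, not whether the User-Agent it sends is honest.

```python
import socket

# Assumption: the hostname suffix the crawler's rDNS is expected to use.
ASSUMED_SUFFIX = ".search.msn.com"


def is_verified_crawler(ip, suffix=ASSUMED_SUFFIX,
                        reverse=socket.gethostbyaddr,
                        forward=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check for a single IPv4 address.

    `reverse` and `forward` are injectable so the logic can be
    exercised without touching the network.
    """
    try:
        hostname = reverse(ip)[0]  # PTR lookup: IP -> hostname
        if not hostname.lower().endswith(suffix):
            return False  # rDNS does not claim the expected domain
        addresses = forward(hostname)[2]  # A lookup: hostname -> IPs
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses.
        return False
    # Confirmed only if the forward lookup returns the original IP.
    return ip in addresses
```

Lookups are slow, so in practice you would cache the result per IP rather than resolve on every hit.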
I'm currently thinking about an update to my trap to ban IP blocks selectively - lets 'em onto a customer site, kills 'em on my own sites. Could well be applied to MSN; and Yahoo isn't too far behind. Trouble is, I don't like (or trust) Google either. :(
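The selective ban could be sketched roughly as below: the same banned blocks apply everywhere, but they are only enforced when the request is for one of my own hosts. The site name in `OWN_SITES`, the block list, and the function `should_block` are all hypothetical placeholders for illustration, not the actual trap.

```python
import ipaddress

# Hypothetical: hosts where bans are enforced (my own sites).
OWN_SITES = {"my-own-site.example"}

# Illustrative block list, taken from the explicit range quoted earlier.
BANNED_BLOCKS = [
    ipaddress.ip_network("65.55.109.0/24"),
    ipaddress.ip_network("65.55.110.0/24"),
]


def should_block(host, ip_text):
    """Block banned IP blocks on own sites only; customer sites let them in."""
    if host.lower() not in OWN_SITES:
        return False  # Customer site: lenient, never block here.
    ip = ipaddress.ip_address(ip_text)
    return any(ip in block for block in BANNED_BLOCKS)
```

Keying the decision on the requested host keeps one shared ban list while leaving the enforcement choice to each site.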