User-Agent: Mozilla

Invalid UA - is it valid?

         

dstiles

10:29 pm on Aug 7, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



From what I've seen over the past year or so, any UA beginning with "User-Agent:" is either a bot or, recently, AVG/LinkScanner. In the latter case the full UA is predictable. In the case of a bot it seems arbitrary within the common range of "standard" MSIE browser UAs.

I have recently seen an increasing number of these prefixes and they're being trapped as unwanted bots. However, the behaviour and general UA make-up seem to be valid browsers - actually, browsers behind, perhaps, a badly behaved firewall. Certain expected headers are either missing or bot-like.

As far as I've checked, the IPs are "domestic" rather than datacenter and often are UK (our server is UK-based).

Typical UA is...

User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 1.1.4322; Media Center PC 4.0; .NET CLR 2.0.50727)

Of 69 occurrences so far this month, 24 have included "Media Center PC" in the UA; that may be just coincidence. All have begun "User-Agent:Mozilla/4.0" (no quotes).
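
A minimal sketch of how a tally like this could be produced from logged UA strings. The function name `tally_prefixed` and the sample list are illustrative assumptions, not taken from the logs discussed here; a real run would pull the UA field out of the server's own log format.

```python
from collections import Counter

PREFIX = "user-agent:"  # compared case-insensitively


def tally_prefixed(user_agents):
    """Count UAs that literally begin with "User-Agent:" and, of those,
    how many also mention "Media Center PC"."""
    counts = Counter()
    for ua in user_agents:
        lowered = ua.strip().lower()
        if lowered.startswith(PREFIX):
            counts["prefixed"] += 1
            if "media center pc" in lowered:
                counts["media_center"] += 1
    return counts


# Hypothetical sample, for illustration only.
sample = [
    "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; Media Center PC 4.0)",
    "User-Agent:Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)",
]
result = tally_prefixed(sample)
print(result["prefixed"], result["media_center"])
```

The check accepts the prefix with or without a space after the colon, since both forms appear in this thread.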

Each occurrence is singular, possibly because it's got back a 403 or 405 but possibly because it's some kind of auto-checker. The request comes into the middle of the site, not the home page, and has no referer.

I'm still inclined to believe it's a bot of some kind but I wonder if it might be some kind of bookmark tool or linkchecker addon.

Any thoughts, please?

keyplyr

9:12 am on Aug 8, 2008 (gmt 0)




I concur that most any UA beginning with "User-Agent:" is a bot, scraper or spammer tool. On my (USA) server the behavior is the same as you describe. I mostly see this UA attempting to access pages with input forms, something spammers would do.

I also believe that some download tools offer fields to spoof various UAs, but fail authenticity due to leaving the "User-Agent:" present :)

Other than AVG/LinkScanner, I have no theory whether any are authentic human users that somehow are acquiring this UA by whatever means. The firewall scenario seems plausible.

wilderness

2:59 pm on Aug 8, 2008 (gmt 0)




The only "valid" UA that I'm able to recall (or locate) which uses the leading portion is an RSS image bot, which does NOT crawl randomly but rather chases links which have been provided in their forums.

"User-Agent: BoardReader-Image-Fetcher /1.0 info@boardreader.com"

dstiles

10:31 pm on Aug 8, 2008 (gmt 0)




These hits seem to be only on ordinary pages, mostly home pages. Possibly from google/etc but if so it's lost the referer. And that could be via a banjaxed browser, bot or proxy.

Receptional Andy

10:48 pm on Aug 8, 2008 (gmt 0)



I concur that most any UA beginning with "User-Agent:" is a bot, scraper or spammer tool

I noticed recently that a (rarely used) copy of IE7 on one machine I use now has an invalid UA prefixed with "User-agent:". I don't run any of the usual suspect AV tools or toolbars, and I've yet to track down the cause of it. It isn't a firewall, but some software that has (badly) modified the UA. If I work out why, I'll post back.

dstiles

10:58 pm on Aug 8, 2008 (gmt 0)




Interesting possibility, Andy.

Perhaps a variation on or allied to the Mozilla/4.0 (compatible;) (exact & complete UA) I get so many of? I've always considered them to be some kind of browser hiccup since they occur in the midst of normal UAs for a given IP and tend to be singular. Been going on since pre-MSIE7 I think.

Samizdata

5:19 pm on Aug 9, 2008 (gmt 0)




some software that has (badly) modified the UA

I do lean towards the view that this is not a bot but a badly-programmed browser add-on that corrupts the Windows Registry entry - apart from AVG LinkScanner, the only program I have seen reported doing this is the IE7Pro plugin, a buggy piece of software that includes a web accelerator and a user-agent switcher (amongst other "features").

I don't get enough of them to worry about, but if you are concerned that you are blocking human visitors you might redirect anything starting with "User-Agent" to an interstitial page and see what they try to do from there (and include a hidden link to a spider trap).
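
A rough sketch of that redirect idea as Python WSGI middleware, for anyone who wants to try it. The class name, the `/ua-check` path, and the choice of a 302 are all assumptions for illustration; the interstitial page itself (and any hidden spider-trap link on it) would still have to be built separately.

```python
INTERSTITIAL_PATH = "/ua-check"  # hypothetical interstitial URL


def looks_prefixed(ua):
    """True when the UA value itself starts with the header name."""
    return ua.strip().lower().startswith("user-agent")


class PrefixedUARedirect:
    """WSGI middleware: send requests whose UA starts with "User-Agent"
    to an interstitial page instead of blocking them outright."""

    def __init__(self, app, target=INTERSTITIAL_PATH):
        self.app = app
        self.target = target

    def __call__(self, environ, start_response):
        ua = environ.get("HTTP_USER_AGENT", "")
        path = environ.get("PATH_INFO", "")
        # Skip the interstitial itself, or the visitor would loop forever.
        if looks_prefixed(ua) and path != self.target:
            start_response(
                "302 Found",
                [("Location", self.target), ("Content-Length", "0")],
            )
            return [b""]
        return self.app(environ, start_response)
```

Wrapping an existing WSGI application would look like `application = PrefixedUARedirect(application)`. Logging what the visitor does after landing on the interstitial is what would separate humans from bots here.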

You might also try detecting IE7Pro by ActiveX Object.

...

Receptional Andy

2:47 pm on Aug 12, 2008 (gmt 0)



I think I tracked down the cause of this in my case - it was some software that was utilising the TEmbeddedWB component from bsalsa.com (for embedding IE in other apps, I believe). I checked the source of that component, and it erroneously inserts "User-agent:" in front of the UA in the Windows registry.

Now I'm not sure which component I had that uses this library, but I have a suspicion it might even be the IE Tab plugin for Firefox.

dstiles

4:33 pm on Aug 12, 2008 (gmt 0)




Not bsalsa in my case, or not noticeably so - see original message.

Why on earth would anyone want to simulate IE in Firefox!?! :)

wilderness

4:48 pm on Aug 12, 2008 (gmt 0)




Why on earth would anyone want to simulate IE in Firefox!?! :)

Why would anyone desire AOL (and others) as their internet provider?
Or Amazon (and others) as their web host?
Or install so many browser toolbars?

Some folks are just CLICK happy ;)
The viability of worms/viruses has testified to that.

Receptional Andy

6:53 pm on Aug 12, 2008 (gmt 0)



Why on earth would anyone want to simulate IE in Firefox!?

There are lots of good reasons. For a testing environment it's great - render with IE and Firefox using only one browser instance.

I wouldn't class myself as "click happy" ;)

dstiles

7:42 pm on Aug 12, 2008 (gmt 0)




Click-happy - yeah, but a Firefox user?! I'd have thought they'd have more sense.

And as a testing environment - you mean the Firefox plug-in simulates all of those nasty IE rendering bugs and atrocious CSS? Now that IS dedication to design! :)

Receptional Andy

7:46 pm on Aug 12, 2008 (gmt 0)



the Firefox plug-in simulates all of those nasty IE rendering bugs and atrocious CSS?

We're way off topic, but it's an embedded browser not a simulation. Single click to see the page rendered in IE. I know sites I work with have a lot of IE users. It's a useful plugin :)

dstiles

8:44 pm on Aug 12, 2008 (gmt 0)




Sorry, Andy - and thanks for clearing up the puzzle! :)

Samizdata

9:48 pm on Aug 12, 2008 (gmt 0)




The few occurrences of the "User-Agent:" prefix I have seen lately appeared to be link pre-fetchers.

Apart from the prefix the UA remains unchanged for subsequent human hits, always IE7 on Vista.

Digressing, I notice that IETab claims to allow Firefox to access Windows Update.

I wonder how much user-agent checking Microsoft does.

...

dstiles

10:44 pm on Aug 12, 2008 (gmt 0)




Didn't I read somewhere that MS had a deal with AVG on Vista? Could that be where you're seeing the pre-fetches?

I can't see Vista running MSIE-6 (does it?!) and that's a common one for me. I know I've been blocking it for a long time, not only in the form we're discussing but also without the hyphen, with a space, without the colon etc. The earliest I can locate this specific form is July 2006 when it came in as...

User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)

This is surely pre-Vista in any distributed release quantity. Apart from the "media center" UAs I mentioned at the top, the MSIE-7's seem to be split roughly between NT 5.1 (XP) and NT 6.0.

I have to say that UA is fairly close to one of the AVG versions, possibly a pre-AVG version? But then, a large number of other robots use the same MSIE UA.
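
For the prefix variants mentioned above (without the hyphen, with a space, without the colon), a single case-insensitive pattern might cover them all. This is only a sketch: the exact variants seen in the wild are assumed from the description in this post, and `split_prefix` is a made-up helper name.

```python
import re

# Matches "User-Agent:", "UserAgent:", "User Agent:" and the same forms
# without the colon, in any letter case, plus trailing whitespace.
PREFIX_RE = re.compile(r"^\s*user[- ]?agent\s*:?\s*", re.IGNORECASE)


def split_prefix(ua):
    """Return (had_prefix, remainder) for a raw UA string."""
    match = PREFIX_RE.match(ua)
    if match is None:
        return False, ua
    return True, ua[match.end():]


had_prefix, rest = split_prefix(
    "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
)
print(had_prefix, rest)
```

Keeping the remainder makes it possible to compare the underlying UA against ordinary browser UAs seen from the same IP, rather than only blocking on the prefix.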