
Search Engine Spider and User Agent Identification Forum

    
Is "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)" legit?
KenB

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4058096 posted 4:10 am on Jan 11, 2010 (gmt 0)

Is the UA "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)" ever legitimate?

Looking through my logs, this user agent string seems to be used only by bad bots. Is there ever a case when it would be used by a real user?

 

caribguy

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4058096 posted 7:20 am on Jan 11, 2010 (gmt 0)

RewriteCond %{HTTP_USER_AGENT} SV1\)$ [NC,OR]
[.. other untrusted UA's ..]
RewriteCond %{HTTP:Accept-Language} ^$ [OR]
RewriteCond %{HTTP_REFERER} ^$
RewriteRule (.*) - [F,L]

Seems to do the trick...

KenB

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4058096 posted 1:24 pm on Jan 11, 2010 (gmt 0)

That looks like it would block an awful lot of legitimate users. In our zeal to block bots, we should also strive to limit collateral damage. Upon further investigation, I think there are occasional legitimate users with the UA "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"; they are just few and far between compared to the bad bots.

Pfui

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4058096 posted 6:29 pm on Jan 11, 2010 (gmt 0)

If you think a UA is "only ever used by bad bots," it's prudent to limit its collateral -- and potential -- damage.

Regs' recs in this forum are routinely reasonable and bankable re which UAs/strings are more block-worthy than not. Of course, if you're iffy, you can certainly review your logs over time to see whether certain blocks would be more prudent than not on your site(s).

If the former, you can: 403 the UA; 403 the suspicious IP(s); rewrite either/both to a custom error page with a graphic or otherwise obfuscated address (a good happy medium); 301 the worst to 127.0.0.1; etc.
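A minimal .htaccess sketch of the options listed above, assuming Apache with mod_rewrite enabled and rules placed in the site root. The /errors/blocked.html path and the 192.0.2. prefix are hypothetical placeholders; the 124.115.1. prefix and the UA string come from the log excerpt quoted later in this post.

```apache
RewriteEngine On

# Hypothetical custom error page served with every 403 response.
ErrorDocument 403 /errors/blocked.html

# Let the error page itself through; otherwise the blocks below would forbid it too.
RewriteRule ^errors/blocked\.html$ - [L]

# Option 1: 403 the exact UA string (anchored at both ends, so longer IE6 strings do not match).
RewriteCond %{HTTP_USER_AGENT} "^Mozilla/4\.0 \(compatible; MSIE 6\.0; Windows NT 5\.1\)$"
RewriteRule .* - [F]

# Option 2: 403 a suspicious IP range.
RewriteCond %{REMOTE_ADDR} ^124\.115\.1\.
RewriteRule .* - [F]

# Option 3: 301 the worst offenders to their own loopback address.
# 192.0.2. is a documentation-only placeholder, not an address from this thread.
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\.
RewriteRule .* http://127.0.0.1/ [R=301,L]
```

The three options are shown together only for illustration; in practice you would pick whichever fits the offender.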

Depending on who-what-where, I prefer a belt-and-suspenders approach. Which means the following Chinese botnet mini-assault on a single site yesterday was 403'd (in more ways than one):

124.115.1.*
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
01/10 17:47:32
01/10 19:42:14

58.61.164.14*
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
01/10 17:30:56
01/10 17:31:09

124.115.1.*
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
01/10 17:18:58
01/10 18:03:41

58.61.164.14*
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
01/10 17:10:09
01/10 17:49:26

58.61.164.13*
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
01/10 17:03:37
01/10 17:42:40

58.61.164.4*
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
01/10 16:33:53
01/10 17:08:20

58.61.164.3*
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
01/10 16:32:17
01/10 18:05:24

For me, "occasional legitimate users" simply cannot have unfettered access using an approx. 9 y.o., unpatched (& w/ bots, probably faked anyway) UA that's a botnet/Zombie fave.

KenB

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4058096 posted 8:08 pm on Jan 11, 2010 (gmt 0)

I looked through a day's worth of logs for this UA and while more times than not a given group of hits by a single IP address appeared to be bots, there were numerous instances of the hits appearing to be by real humans. The question is whether or not I'm willing to block a couple dozen real users per day over this issue. Maybe I'll wait another six months or so and then block this UA.

dstiles

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



 
Msg#: 4058096 posted 11:46 pm on Jan 11, 2010 (gmt 0)

I get "real" hits from this UA.

Check the other headers, especially the accepts. There are several combinations that are always (as far as I can tell) baddies.

jdMorgan

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4058096 posted 12:21 am on Jan 12, 2010 (gmt 0)

Add a "Connection: close" header to the "Accept" headers as a strong indicator...
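One way to read this suggestion, sketched as .htaccess rules. It rests on an assumption that is not stated in the thread: that a real IE6 install normally asks for a persistent connection (Connection: Keep-Alive), while many scripted clients send Connection: close. Check that against your own logs before relying on it.

```apache
RewriteEngine On

# The UA ends in "SV1)" ...
RewriteCond %{HTTP_USER_AGENT} SV1\)$
# ... AND the client asked for the connection to be closed after this request.
RewriteCond %{HTTP:Connection} ^close$ [NC]
RewriteRule .* - [F]
```

Connection is a hop-by-hop header, so a proxy between the visitor and the server may replace it. That is one reason to treat it as an indicator to combine with others rather than as proof on its own.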

Jim

caribguy

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4058096 posted 1:02 am on Jan 12, 2010 (gmt 0)

@OP: Not blocking all SV1), only when the UA is potentially bad AND (Accept-Language OR Referer) is missing.

About 4-5 'real' users a week still seem to be enjoying a plain vanilla, never-updated version of IE6 on PCs that do not have the .NET extensions installed...
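The AND/OR grouping described here depends on how mod_rewrite chains conditions: consecutive RewriteCond lines flagged [OR] form one group, and groups are ANDed together. So the last condition in the UA list must not carry [OR], or every condition collapses into one big OR. A sketch, where ^ExampleBadBot is a hypothetical stand-in for the elided UA list:

```apache
RewriteEngine On

# Group 1: the UA looks untrusted.
RewriteCond %{HTTP_USER_AGENT} SV1\)$ [NC,OR]
# Hypothetical second pattern. No [OR] flag here, which closes group 1.
RewriteCond %{HTTP_USER_AGENT} ^ExampleBadBot [NC]
# Group 2, ANDed with group 1: Accept-Language OR Referer is missing.
RewriteCond %{HTTP:Accept-Language} ^$ [OR]
RewriteCond %{HTTP_REFERER} ^$
RewriteRule .* - [F]
```

Read as logic, this is (UA1 OR UA2) AND (no Accept-Language OR no Referer).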

KenB

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4058096 posted 1:33 am on Jan 12, 2010 (gmt 0)

@OP: Not blocking all SV1), only when the UA is potentially bad AND (Accept-Language OR Referer) is missing.

Sounds like a smarter plan. Add a missing Accept header to the list.

jdMorgan

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4058096 posted 1:58 am on Jan 12, 2010 (gmt 0)

A missing Referer is a pretty unreliable signal; any direct URL type-in request, and many requests that pass through a caching proxy (e.g. *all* AOL and EarthLink users), will be missing the HTTP Referer header. So lack of a Referer header is a "third-level/grey-area" factor, useful only as the final deciding factor after all the others.

Accept
Accept-Language
Accept-Encoding
Connection
... all of these are more useful.
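A sketch of how the Accept-family headers in this list might be combined with the UA test, assuming Apache with mod_rewrite. It forbids the request only when the suspect UA arrives without one of the headers a browser normally sends; Referer is deliberately left out, per the reasoning above. The exact combination is an illustration, not a tested rule set.

```apache
RewriteEngine On

# Only consider requests carrying the suspect UA.
RewriteCond %{HTTP_USER_AGENT} SV1\)$
# AND at least one of the usual browser headers is absent.
RewriteCond %{HTTP_ACCEPT} ^$ [OR]
RewriteCond %{HTTP:Accept-Language} ^$ [OR]
RewriteCond %{HTTP:Accept-Encoding} ^$
RewriteRule .* - [F]
```

Logging matches for a while before switching on [F] would show how many real visitors this combination catches on a given site.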

Jim

KenB

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4058096 posted 2:39 am on Jan 12, 2010 (gmt 0)

Please explain the connection thingy more. I'm totally unfamiliar with this.

