Does the element "Mozilla" ever occur non-initially in legitimate human UAs?
I've found one possibility: Kik/22.214.171.124 (Android 2.3.6) Mozilla/5.0 (Linux; U; Android 2.3.6; en-us; SCH-R820 Build/GINGERBREAD) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1
I can't speak to its legitimacy but it definitely seems to be human.
Contrariwise I meet a lot of robots that identify themselves as "Mozilla blahblah" in quotes-- i.e. non-initial Mozilla. And a lot of this kind of thing:
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) ; .NET CLR 1.0.3705) -- that is, one UA nested inside another. (Most often MSIE 6, which is handy because those are automatically redirected in any case.)
Setting aside the image-search possibility, is it safe to slap down a global block on .Mozilla ?
lucy, there used to be some useful links within each forum's library. The Apache forum offered a very useful one on regex. They've apparently all been removed and replaced with standard lines in all-libraries :(
The only exceptions in my logs today were:
ia archiver & msnbot-media
everything else begins with.
BTW I've a leftover SetEnvIf, from I'm not sure when, that matches "^moz"; note the lowercase.
I may not have worded the question right. As a Regular Expression, .Mozilla means "the literal string 'Mozilla' preceded by at least one character of any kind" -- as opposed to ^Mozilla, which means it's the first element in the string.
So I'm not looking for !^Mozilla: things that don't begin "Mozilla" (a group that contains most robots, all of Opera, and an untold number of telephones).
I'm looking for .Mozilla: things that say "Mozilla" after they have said something else. That includes, but is not limited to, UAs that say "Mozilla" twice, since at least one of the two occurrences has to be non-initial.
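To make the three cases concrete, here is a minimal Python sketch of the distinction, not an Apache config. The function name classify is made up for illustration, and the plain Firefox-style and Opera strings are invented samples; the Kik and nested MSIE strings are the ones quoted earlier in the thread.

```python
import re

# ^Mozilla : "Mozilla" is the very first element of the string.
INITIAL = re.compile(r"^Mozilla")
# .Mozilla : "Mozilla" with at least one character in front of it,
# i.e. a non-initial occurrence somewhere in the string.
NON_INITIAL = re.compile(r".Mozilla")


def classify(ua):
    """Return (starts_with_mozilla, has_non_initial_mozilla) for a UA string."""
    return (bool(INITIAL.search(ua)), bool(NON_INITIAL.search(ua)))


# UA strings quoted in the thread.
kik = ("Kik/22.214.171.124 (Android 2.3.6) Mozilla/5.0 (Linux; U; Android 2.3.6; "
       "en-us; SCH-R820 Build/GINGERBREAD) AppleWebKit/533.1 (KHTML, like Gecko) "
       "Version/4.0 Mobile Safari/533.1")
nested = ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; Mozilla/4.0 "
          "(compatible; MSIE 6.0; Windows NT 5.1; SV1) ; .NET CLR 1.0.3705)")
# Invented samples, for contrast only.
plain = "Mozilla/5.0 (Windows NT 6.1; rv:10.0) Gecko/20100101 Firefox/10.0"
opera = "Opera/9.80 (Windows NT 6.1; U; en) Presto/2.10.229 Version/11.61"
quoted = '"Mozilla/4.0 (compatible; MSIE 6.0)"'  # UA sent with literal quotes

for ua in (kik, nested, plain, opera, quoted):
    print(classify(ua), ua[:40])
```

A block on .Mozilla would catch every string whose second value is True: the Kik string, the nested MSIE string, and the quoted one. The plain and Opera samples would pass untouched.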
Mobile UAs are variable, and although some of the later ones obey the rules, many do not. For example, many begin with Opera or Samsung or similar. If you know they are mobiles it's probably safe to let them in.
NOTE: Checking other headers is not a good way of validating mobile devices. It depends on what they are and how they are connecting. In particular mobiles via proxies (even legit proxies) can come in with some really odd header combinations. I've had to relax header-checking quite a bit for known mobile UAs, especially those using proxies.
Multiple Mozillas in a single UA can often mean "I've just installed a really amazing tool in my browser and it has no idea how to create a proper user-agent string." I see this a lot with toolbar extensions, including G's. I suspect MS may also screw it up when updating from (e.g.) IE7 to IE8. As a result I tend to go by other headers, using double Mozillas only as a tie-breaker.
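A rough Python sketch of the tie-breaker idea, under loud assumptions: header_score stands in for whatever the other header checks produce (positive for "looks human", negative for "looks like a robot", zero for undecided), and the function names are hypothetical. The post does not say which way the tie is broken; this sketch assumes a double Mozilla tips an undecided request toward a block.

```python
def mozilla_count(ua):
    """Count how many times the literal string "Mozilla" appears in the UA."""
    return ua.count("Mozilla")


def verdict(header_score, ua):
    """Decide on other headers first; consult the UA only when they are undecided.

    header_score is a hypothetical summary of the other header checks:
    > 0 looks human, < 0 looks like a robot, 0 undecided.
    """
    if header_score > 0:
        return "allow"
    if header_score < 0:
        return "block"
    # Tie: assume (not stated in the thread) that a repeated Mozilla counts against.
    return "block" if mozilla_count(ua) > 1 else "allow"


nested = ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; Mozilla/4.0 "
          "(compatible; MSIE 6.0; Windows NT 5.1; SV1) ; .NET CLR 1.0.3705)")

print(mozilla_count(nested))   # number of "Mozilla" occurrences
print(verdict(1, nested))      # good headers win despite the double Mozilla
print(verdict(0, nested))      # undecided headers: double Mozilla decides
```

Keeping the count out of the main decision means a human with a badly behaved toolbar is not turned away as long as the rest of the request looks normal.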