Java in User Agent

Is it safe to block?

         

smallcompany

4:20 am on Sep 5, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If I block it in this way:

RewriteCond %{HTTP_USER_AGENT} Java [NC,OR]

Are there any UAs with “java” in their string that are worth giving access?

I'm blocking it because of

User Agent = Java/1.6.0_05

and I just want to be sure it can't get around the block by simply changing the version or whatever.

incrediBILL

7:27 pm on Sep 5, 2008 (gmt 0)

That's how I block it, but be aware that some lazy or ignorant programmers writing RSS readers didn't bother to set the user agent to identify their tool.

My theory is that if enough of us block it, someone will eventually fix the tool.

keyplyr

6:34 am on Sep 7, 2008 (gmt 0)

IMO almost all Java requests are bad. smallcompany, I also use mod_rewrite to filter Java requests with a few allowed conditions.

Just today I caught Network Solutions attempting to rip through my sites:


205.178.191.39 - - [07/Sep/2008:00:42:58 -0400] "GET / HTTP/1.0" 403 247 "-" "Java/1.5.0_11"
205.178.191.39 - - [07/Sep/2008:00:42:58 -0400] "GET / HTTP/1.0" 403 247 "-" "Java/1.5.0_11"
205.178.191.39 - - [07/Sep/2008:00:42:59 -0400] "GET / HTTP/1.0" 403 247 "-" "Java/1.5.0_11"

keyplyr

7:38 am on Sep 7, 2008 (gmt 0)

Are there any UAs with "java" in their string that are worth giving access?

AFAIK, there are no UAs containing "java" other than Java/*.*; however, some companies that are generally considered beneficial do use this tool for file retrieval. Best to keep a whitelist of IP addresses as conditions on your rule.
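
A minimal sketch of a rule with an IP whitelist condition, assuming mod_rewrite in .htaccess. The address 192.0.2.10 is a hypothetical placeholder from the documentation range, not a known good visitor; the real entries would have to come from your own logs.

```apache
RewriteEngine On
# Match user agents of the form "Java/1.6.0_05"
RewriteCond %{HTTP_USER_AGENT} ^Java/ [NC]
# Hypothetical whitelisted address: requests from it skip the block
RewriteCond %{REMOTE_ADDR} !^192\.0\.2\.10$
# Everything else that matched gets a 403 Forbidden
RewriteRule ^ - [F]
```

Conditions are ANDed by default, so each additional whitelisted address is one more negated REMOTE_ADDR line before the rule.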

Samizdata

3:35 pm on Sep 7, 2008 (gmt 0)

I have one site that serves Java content.

When a browser requests the Java content it changes the user-agent, often to one starting with "Java" but not always - here are a few examples I found in a quick check of recent logs:

Genuine IE6
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)
Mozilla/4.0 (Windows XP 5.1) Java/1.5.0_15

Genuine Firefox
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1
Mozilla/4.0 (Windows XP 5.1) Java/1.4.2_09

Genuine Google Chrome
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.2.149.27 Safari/525.13
Mozilla/4.0 (Windows XP 5.1) Java/1.6.0_10-rc

---

The site also caters for handheld devices and I found this one:

Huawei/1.0/U120/B000 Browser/Obigo-Browser/Q04A MMS/Obigo-MMS/Q04A SyncML/HW-SyncML/1.0 Java/QVM/4.1 Profile/MIDP-2.0 Configuration/CLDC-1.1

That was not a request for Java content, but appears to be the standard user-agent.

...

dstiles

8:18 pm on Sep 7, 2008 (gmt 0)

Does the UA change ONLY because you serve up java or could it change for other reasons / ad hoc? Presumably if a site does not serve up java the UA will never show the Java substring?

I found a similar "mobile" UA yesterday...

LG/KU990-Orange/v10f Browser/Obigo-Q05A/3.6 MMS/LG-MMS-V1.0/1.2 Java/ASVM/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1

The UAs for some of these devices are abysmal. I've let this one through for now even though it has java in the UA but it's on suspicion alert.

Does anyone know WHY it (presumably) has Java capability?

Samizdata

8:55 pm on Sep 7, 2008 (gmt 0)

Does the UA change ONLY because you serve up java

Yes - the desktop browsers use the normal user-agent for almost* everything else.

Obviously if you have no Java content you will never see them.

*Some other multimedia (ShockWave, RealPlayer) can also produce UA changes.

The UAs for some of these devices are abysmal

I notice that the common factor here is the Obigo browser - I wouldn't say it was very common.

Does anyone know WHY it (presumably) has Java capability?

Many cellphones play elementary Java games. Support for other multimedia is patchy.

...

jdMorgan

9:31 pm on Sep 7, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Many mobile applications, including mobile browsers such as Opera Mini, are written in Java. The "Profile/MIDP-2.0 Configuration/CLDC-1.1" is a very good indication that this is a mobile device, as is the "LG" brand and KU990-Orange model name.

According to a quick Web search, Obigo is the standard browser for LG phones.

Best to start-anchor "Java/" blocking patterns, and not block UA strings which only contain "Java" without having further cause to do so.

Jim
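
A minimal sketch of the start-anchored version of the rule, assuming mod_rewrite in .htaccess and a plain 403 response as the block action (both are assumptions, not something stated in the thread):

```apache
RewriteEngine On
# ^ anchors the match to the start of the UA string, so "Java/1.6.0_05"
# and "Java/1.5.0_11" match regardless of version number
RewriteCond %{HTTP_USER_AGENT} ^Java/ [NC]
RewriteRule ^ - [F]
```

With the anchor in place, strings that merely contain "Java/" further along, such as "Mozilla/4.0 (Windows XP 5.1) Java/1.5.0_15" or the Obigo mobile UAs quoted above, are not matched.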

Samizdata

10:17 pm on Sep 7, 2008 (gmt 0)

I once compiled a list of mobile user-agents that visited my site - over 750 unique UAs.

There were only three that included "Java" and all were Obigo browsers.

But I feel pain every time I block a human, and always value Jim's advice.

...

smallcompany

4:26 am on Sep 8, 2008 (gmt 0)

I think Jim sounds right. Starting it with "Java/" should cover only those that caused me to start this post.

Now, are there any good ones that start like this?

some companies that are generally considered beneficial do use this tool for file retrieval

Which companies? Would anyone know the ratio between bad and good?

Thanks

keyplyr

8:27 am on Sep 8, 2008 (gmt 0)

Sorry to say, you'll just have to watch your logs and find out. One man's good bot is another's intruder.

I batch several similar UAs in one rule, then include a whitelist of allowed IP addresses, but I don't remember which IPs go with which UAs, sorry.

dstiles

1:22 pm on Sep 8, 2008 (gmt 0)

Samizdata - thanks for the info.

On start-anchoring java/: there seems to be a tendency for some libww UAs to have random characters at the beginning, so an anchored pattern such as ^libww (whatever) would not trap them. I suspect this may extend to other common UAs, although I have only noticed the trend with standard Mozilla MSIE UAs so far. I'm inclined to treat java the same way as nutch: kill by default, whitelist known goodies.
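
A rough sketch of that kill-by-default approach, assuming mod_rewrite in .htaccess. The choice of "Profile/MIDP" as the whitelist marker is an assumption drawn from the two mobile UAs quoted earlier in the thread, not a tested list of known good agents:

```apache
RewriteEngine On
# Unanchored: catches "Java" anywhere, including UAs with junk prepended
RewriteCond %{HTTP_USER_AGENT} Java [NC]
# Assumed exception: let through handsets that advertise a MIDP profile
RewriteCond %{HTTP_USER_AGENT} !Profile/MIDP
RewriteRule ^ - [F]
```

Note that this would also block the "Mozilla/4.0 (Windows XP 5.1) Java/..." requests that browsers make for applet content, so it only suits sites that serve no Java.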