Java/1.4.1_02
81.5.***.25 - - [07/Nov/2003:04:30:46 -0800] "GET /robots.txt HTTP/1.1" 200 1524 "-" "Wotbox/alpha0.5.1 (bot'at'wot***.com; http://www.wot***.com) Java/1.4.1_02"
81.5.***.25 - - [07/Nov/2003:04:30:52 -0800] "GET / HTTP/1.1" 200 20624 "-" "Wotbox/alpha0.5.1 (bot'at'wot***.com; http://www.wot***.com) Java/1.4.1_02"
81.5.***.25 - - [07/Nov/2003:04:30:54 -0800] "GET / HTTP/1.1" 200 20624 "-" "Wotbox/alpha0.5.1 (bot'at'wot***.com; http://www.wot***.com) Java/1.4.1_02"
The site says they were forced to change their name from Wotbot to Wotbox.
I suppose any bot can be mainstreamed, but I'm thinking Java has some bad history?
In my .htaccess it reads:
RewriteCond %{HTTP_USER_AGENT} Java1 [NC,OR]
Somehow that no longer seems adequate.
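One possible reason, offered as a guess rather than a diagnosis: the pattern Java1 has no slash, so it cannot match the Java/1.4.1_02 token that ends the user-agent string in the log lines above. A minimal sketch of a condition that tolerates an optional slash follows. The RewriteRule line and the 403 response are assumptions, since the rest of the ruleset isn't shown, and the [OR] flag is dropped only because this sketch stands alone.

```apache
# Sketch only: matches "Java1.x" as well as "Java/1.x" anywhere in the UA string.
RewriteCond %{HTTP_USER_AGENT} Java/?1 [NC]
# Assumed action: answer 403 Forbidden. The original ruleset's rule is not shown.
RewriteRule .* - [F]
```

Because the pattern is unanchored, it would also catch Wotbox itself, whose user-agent carries the Java token at the end.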
Thanks.
Pendanticist.
[edited by: jdMorgan at 1:57 pm (utc) on Nov. 7, 2003]
[edit reason] Neutered URLs [/edit]
Java is just a 'generic' user-agent. It usually just reports the version of Java used to code the Web interface library used by various 'bots, some good and some bad. As a result, Java is one of those UAs that you have to be careful with, in some cases allowing it if it comes from a 'good' IP address.
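As a rough sketch of that "allow it from a good IP" idea, something along these lines could work in .htaccess. The address 192.0.2.10 is a placeholder from the documentation range, not a real crawler address, and the 403 response is an assumed choice of action.

```apache
# Sketch only: refuse Java-based agents unless they come from a trusted address.
RewriteCond %{HTTP_USER_AGENT} Java/?[0-9] [NC]
# Placeholder address (documentation range); substitute the IP you trust.
RewriteCond %{REMOTE_ADDR} !^192\.0\.2\.10$
# Both conditions must hold (RewriteCond lines are ANDed by default).
RewriteRule .* - [F]
```

Requests with a Java token in the user-agent get a 403 unless they come from the listed address. Everything else falls through to whatever rules follow.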
At least these guys are identifying themselves properly, and more thoroughly than half the big-name 'bots out there! I'll withhold judgement as long as they obey robots.txt, although the double-fetch of your index.html within two seconds isn't exactly wonderful.
Jim