Forum Moderators: DixonJones & mademetop

Java/1.5.0_06

     
12:05 am on Jun 30, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:Nov 2, 2005
posts:505
votes: 0


forgive me if this isn't in the correct area... but what the heck is this thing?

the following flavors currently pound my site for unknown reasons and i find little to no information about them. they are indexing / looking at the most peculiar of things too...
Java/1.5.0_06
java/1.4.1_04
java/1.4.2_06
java/1.5.0_06

i've tried relentlessly to ban them from the site, but they always come back. so far the only method of banning that seems to work on them is

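# Tag requests by User-Agent (case-insensitive); dots are escaped so they match literally
SetEnvIfNoCase User-Agent "java/1\.4\.1_04" bad_bot
SetEnvIfNoCase User-Agent "java/1\.4\.2_06" bad_bot
SetEnvIfNoCase User-Agent "java/1\.5\.0_06" bad_bot
SetEnvIfNoCase User-Agent "snapbot/1\.0" bad_bot
# Allow everyone, then deny any request tagged bad_bot
Order Allow,Deny
Allow from all
Deny from env=bad_bot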

i have also had reports from legitimate users that they now receive 403 forbidden errors, but i still get hits and people browsing the site fine (just as i do)... could the bits i posted above be keeping legit users from accessing my site?

8:27 pm on July 1, 2006 (gmt 0)

Administrator from US 

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 25, 2005
posts:14663
votes: 99


Just block "java/" and skip the version numbers.
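
A rough sketch of what that could look like in .htaccess, reusing the bad_bot variable and the Order/Allow/Deny lines from the first post. The leading ^ is an assumption: the agents listed in this thread all start with "Java/", so anchoring there avoids tagging agents that merely contain the string somewhere in the middle. Drop the ^ to match it anywhere.

```apache
# Tag any request whose User-Agent starts with "java/" (case-insensitive),
# whatever version number follows.
SetEnvIfNoCase User-Agent "^java/" bad_bot

# Allow everyone, then deny any request tagged bad_bot.
Order Allow,Deny
Allow from all
Deny from env=bad_bot
```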

It's just what it says it is: something written in Java accessing your site, such as an RSS reader if you have a blog. So in fact you could be stopping some RSS reader written in Java from working.

On one site I let my RSS feed be accessed by all; it's just snippets and it's unprotected. But I block anything from pulling pages from my website, so they'll need to click through to my site to read the rest of the page.
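
A minimal sketch of that kind of setup, assuming Apache 2.2-style access directives like the ones earlier in the thread and a feed served from a single file. The name feed.xml is hypothetical; substitute whatever the feed is actually called.

```apache
# Tag Java-based clients (pattern is an assumption, adjust to taste).
SetEnvIfNoCase User-Agent "^java/" bad_bot

# Site-wide rule: allow everyone except requests tagged bad_bot.
Order Allow,Deny
Allow from all
Deny from env=bad_bot

# Exception: the feed file (hypothetical name) stays open to all clients,
# including the tagged ones.
<Files "feed.xml">
    Order Allow,Deny
    Allow from all
</Files>
```

The access rules inside the <Files> section should take the place of the site-wide ones for that one file, so a tagged client can still fetch the feed but gets a 403 on every other page.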

2:44 am on July 4, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:Nov 2, 2005
posts:505
votes: 0


It's just what it says it is: something written in Java accessing your site, such as an RSS reader if you have a blog. So in fact you could be stopping some RSS reader written in Java from working.

i don't have any of those. is there any way to determine who/what is grabbing my pages? all i have from them is their ip, which is different every time.

On one site I let my RSS feed be accessed by all; it's just snippets and it's unprotected. But I block anything from pulling pages from my website, so they'll need to click through to my site to read the rest of the page.

how could i do that as well?

6:05 am on July 4, 2006 (gmt 0)

Administrator from US 

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 25, 2005
posts:14663
votes: 99


I'm confused: do you or don't you have an RSS/XML feed or a blog?

5:17 pm on July 4, 2006 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member jab_creations is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Aug 26, 2004
posts:3168
votes: 22


This bot hits my site every month and I've finally just blocked it. It crawls the same 30 files every time (and doesn't get anything), so I don't see the point. No URL, no robots.txt, no files that would have email addresses. It's as dumb as a bot is going to get.

Just block it via your .htaccess or, if you use a scripting language such as PHP, use the following...

[webmasterworld.com...]

- John

6:34 pm on July 4, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts:2038
votes: 1


The "Java" User-agent (UA) can be from anywhere, by anyone, for anything. A lot of us bot-watchers (hanging out in the Search Engine Spider Identification [webmasterworld.com] forum) think anything Java-related is block-worthy for those reasons alone. FWIW --

Java/1.4.2_01 user agent
What is it?
[webmasterworld.com...]

Strange hits from random. bots?
Logs show multiple 20 hit sessions with java browser
[webmasterworld.com...]

(How to block it...)
How to ban (compatible ; type requests
Note space between compatible and semicolon
[webmasterworld.com...]

8:12 pm on July 4, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 7, 2003
posts:1179
votes: 0


it's a site ripper and has been around for quite a few years, starting off as version 1.1xxx and getting up to the current version of 1.5xxx

does not matter if it's a blog or a site, it will rip it, and at lightning speed, like 10 - 15 pages a second. I have had it banned for years

 
