


Java/1.5.0_06

     

jake66

12:05 am on Jun 30, 2006 (gmt 0)

5+ Year Member



forgive me if this is in the incorrect area... but what the heck is this thing?

the following flavors currently pound my site for unknown reasons and i find little to no information about them. they are indexing / looking at the most peculiar of things too...
Java/1.5.0_06
java/1.4.1_04
java/1.4.2_06
java/1.5.0_06

i've tried relentlessly to ban them from the site, but they always come back. so far the only method of banning that seems to work on them is

SetEnvIfNoCase User-Agent "java/1.4.1_04" bad_bot
SetEnvIfNoCase User-Agent "java/1.4.2_06" bad_bot
SetEnvIfNoCase User-Agent "java/1.5.0_06" bad_bot
SetEnvIfNoCase User-Agent "snapbot/1.0" bad_bot
Order Allow,Deny
Allow from all
Deny from env=bad_bot

i have also had reports from legitimate users that they now receive 403 forbidden errors, but i still get hits and people browsing the site fine (just as i do)... could the bits i posted above be keeping legit users from accessing my site?

incrediBILL

8:27 pm on Jul 1, 2006 (gmt 0)

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Just block "java/" and skip the version numbers.
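A minimal sketch of that suggestion, reusing the Order/Allow/Deny layout from the first post. The `^` anchor is an assumption on my part: it only matches agents whose string begins with "Java/", so drop it if you want to match "java/" anywhere in the User-Agent.

```apache
# One case-insensitive pattern covers Java/1.4.x, Java/1.5.x and any later version.
# The ^ anchor is an assumption, not part of the advice above.
SetEnvIfNoCase User-Agent "^java/" bad_bot
SetEnvIfNoCase User-Agent "snapbot/1.0" bad_bot

Order Allow,Deny
Allow from all
Deny from env=bad_bot
```

Because the pattern is a regular expression matched against the whole User-Agent header, a single line replaces the per-version rules.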

It's just what it says it is: something written in Java accessing your site, such as an RSS reader if you have a blog, so in fact you could be stopping some RSS reader written in Java from working.

On one site I let my RSS feed be accessed by all; it's just snippets and it's unprotected. But I block anything from pulling pages from my website, so readers need to click through to my site to read the rest of the page.
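One way this kind of setup might look in .htaccess, as a rough sketch only. The feed path `/rss.xml` is hypothetical, so substitute whatever your feed URL really is. The idea is to flag Java agents first, then clear the flag again when the request is for the feed.

```apache
# Flag anything identifying itself as Java (pattern is an assumption).
SetEnvIfNoCase User-Agent "^java/" bad_bot

# Clear the flag for the feed itself so feed readers still work.
# /rss.xml is a hypothetical path used only for illustration.
SetEnvIf Request_URI "^/rss\.xml$" !bad_bot

Order Allow,Deny
Allow from all
Deny from env=bad_bot
```

The SetEnvIf lines are evaluated in order, which is why the exemption has to come after the line that sets bad_bot.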

jake66

2:44 am on Jul 4, 2006 (gmt 0)

5+ Year Member



It's just what it says it is: something written in Java accessing your site, such as an RSS reader if you have a blog, so in fact you could be stopping some RSS reader written in Java from working.

i don't have any of those. is there any way to determine who/what is grabbing my pages? all i have from them is their ip, which is different every time.

On one site I let my RSS feed be accessed by all; it's just snippets and it's unprotected. But I block anything from pulling pages from my website, so readers need to click through to my site to read the rest of the page.

how could i do that as well?

incrediBILL

6:05 am on Jul 4, 2006 (gmt 0)

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



I'm confused: do you or don't you have an RSS / XML feed or a blog?

JAB Creations

5:17 pm on Jul 4, 2006 (gmt 0)

WebmasterWorld Senior Member jab_creations is a WebmasterWorld Top Contributor of All Time 10+ Year Member



This bot hits my site every month and I've finally just blocked it. It crawls the same 30 files every time (and doesn't get anything) so I don't see the point. No url, no robots.txt, no files that would have email addresses. It's as dumb as a bot is going to get.

Just block it via your .htaccess or if you use a scripting language such as PHP use the following...

[webmasterworld.com...]
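For the .htaccess route, a minimal sketch using mod_rewrite instead of SetEnvIf. This assumes mod_rewrite is enabled and that your host allows it in .htaccess; the `^java/` pattern is my own choice for illustration and is not taken from the linked thread.

```apache
# Return 403 Forbidden to any request whose User-Agent starts with "Java/".
# Assumes mod_rewrite is available; [NC] makes the match case-insensitive.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^java/ [NC]
RewriteRule .* - [F]
```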

- John

Pfui

6:34 pm on Jul 4, 2006 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



The "Java" User-agent (UA) can be from anywhere, by anyone, for anything. A lot of us bot-watchers (hanging out in the Search Engine Spider Identification [webmasterworld.com] forum) think anything Java-related is block-worthy for those reasons alone. FWIW --

Java/1.4.2_01 user agent
What is it?
[webmasterworld.com...]

Strange hits from random. bots?
Logs show multiple 20 hit sessions with java browser
[webmasterworld.com...]

(How to block it...)
How to ban (compatible ; type requests
Note space between compatible and semicolon
[webmasterworld.com...]

ncw164x

8:12 pm on Jul 4, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



it's a site ripper and has been around for quite a few years, starting off as version 1.1xxx and getting up to the current version of 1.5xxx

it does not matter if it's a blog or a site, it will rip it, and at lightning speed, like 10 - 15 pages a second. I have had it banned for years

 
