Website Analytics - Tracking and Logging Forum

    
Java/1.5.0_06
jake66

5+ Year Member



 
Msg#: 4331 posted 12:05 am on Jun 30, 2006 (gmt 0)

forgive me if this is in the wrong area... but what the heck is this thing?

the following flavors currently pound my site for unknown reasons and i find little to no information about them. they are indexing / looking at the most peculiar of things too...
Java/1.5.0_06
java/1.4.1_04
java/1.4.2_06
java/1.5.0_06

i've tried relentlessly to ban them from the site, but they always come back. so far the only method of banning that seems to work on them is:
SetEnvIfNoCase User-Agent "java/1.4.1_04" bad_bot
SetEnvIfNoCase User-Agent "java/1.4.2_06" bad_bot
SetEnvIfNoCase User-Agent "java/1.5.0_06" bad_bot
SetEnvIfNoCase User-Agent "snapbot/1.0" bad_bot
Order Allow,Deny
Allow from all
Deny from env=bad_bot

i have also had reports from legitimate users that they now receive 403 forbidden errors, but i still get hits and people browsing the site fine (just as i do)... could the bits i posted above be keeping legit users from accessing my site?

 

incrediBILL

WebmasterWorld Administrator, Top Contributor of All Time, 5+ Year Member, Top Contributor of the Month



 
Msg#: 4331 posted 8:27 pm on Jul 1, 2006 (gmt 0)

Just block "java/" and skip the version numbers.

It's just what it says it is: something written in Java accessing your site, such as an RSS reader if you have a blog. So in fact you could be stopping some RSS reader written in Java from working.
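
A minimal sketch of that suggestion, reusing the bad_bot variable and the access directives from the first post. One pattern covers every version string. The leading ^ anchor is an added assumption, so that only agents whose string starts with "java/" are flagged rather than any agent that merely contains it:

```apache
# Flag any User-Agent that begins with "java/", whatever the version.
# SetEnvIfNoCase matches case-insensitively, so "Java/1.5.0_06" is caught too.
SetEnvIfNoCase User-Agent "^java/" bad_bot

Order Allow,Deny
Allow from all
Deny from env=bad_bot
```

This replaces the three version-specific java lines. Any other agents being banned, such as snapbot/1.0, would keep their own SetEnvIfNoCase lines.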

On one site I let my RSS feed be accessed by all; it's just snippets and it's unprotected. But I block anything from pulling pages from my website, so they'll need to click through to my site to read the rest of the page.
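
One way this could look in .htaccess, as a rough sketch rather than the exact setup described above. It only covers Java agents, and the feed path /rss.xml is a hypothetical placeholder to swap for wherever the feed really lives. mod_setenvif removes a variable when its name is prefixed with "!", and the directives are processed in the order they appear, so the second line clears the flag for feed requests only:

```apache
# Flag Java clients by User-Agent, case-insensitively.
SetEnvIfNoCase User-Agent "^java/" bad_bot

# Hypothetical feed path: clear the flag for requests to /rss.xml only,
# so feed readers written in Java can still fetch the snippets.
SetEnvIf Request_URI "^/rss\.xml$" !bad_bot

Order Allow,Deny
Allow from all
Deny from env=bad_bot
```

With this in place a Java client requesting /rss.xml is let through, while the same client requesting any other page gets a 403.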

jake66

5+ Year Member



 
Msg#: 4331 posted 2:44 am on Jul 4, 2006 (gmt 0)

It's just what it says it is: something written in Java accessing your site, such as an RSS reader if you have a blog. So in fact you could be stopping some RSS reader written in Java from working.

i don't have any of those. is there any way to determine who/what is grabbing my pages? all i have from them is their ip, which is different every time.

On one site I let my RSS feed be accessed by all; it's just snippets and it's unprotected. But I block anything from pulling pages from my website, so they'll need to click through to my site to read the rest of the page.

how could i do that as well?

incrediBILL

WebmasterWorld Administrator, Top Contributor of All Time, 5+ Year Member, Top Contributor of the Month



 
Msg#: 4331 posted 6:05 am on Jul 4, 2006 (gmt 0)

I'm confused: do you or don't you have an RSS/XML feed or a blog?

JAB Creations

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4331 posted 5:17 pm on Jul 4, 2006 (gmt 0)

This bot hits my site every month and I've finally just blocked it. It crawls the same 30 files every time (and doesn't get anything), so I don't see the point. No URL, no robots.txt, no files that would have email addresses. It's as dumb as a bot is going to get.

Just block it via your .htaccess or, if you use a scripting language such as PHP, use the following...

[webmasterworld.com...]

- John

Pfui

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4331 posted 6:34 pm on Jul 4, 2006 (gmt 0)

The "Java" User-agent (UA) can be from anywhere, by anyone, for anything. A lot of us bot-watchers (hanging out in the Search Engine Spider Identification [webmasterworld.com] forum) think anything Java-related is block-worthy for those reasons alone. FWIW --

Java/1.4.2_01 user agent
What is it?
[webmasterworld.com...]

Strange hits from random. bots?
Logs show multiple 20 hit sessions with java browser
[webmasterworld.com...]

(How to block it...)
How to ban (compatible ; type requests
Note space between compatible and semicolon
[webmasterworld.com...]

ncw164x

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4331 posted 8:12 pm on Jul 4, 2006 (gmt 0)

It's a site ripper and has been around for quite a few years, starting off as version 1.1xxx and getting up to the current version of 1.5xxx.

It doesn't matter if it's a blog or a site; it will rip it, and at lightning speed, like 10-15 pages a second. I have had it banned for years.
