Jakarta HTTP Client?

         

keyplyr

1:26 am on Jul 23, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month




Been getting a dozen hits on my index page for several days now. What is the Jakarta HTTP Client used for?
This page doesn't tell me much: [jakarta.apache.org...]

**.***.***.* - - [22/Jul/2004:17:28:38 -0700] "GET / HTTP/1.1" 200 10616 "-" "Jakarta HTTP Client/1.0"

jdMorgan

2:20 am on Jul 23, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



keyplyr,

This could be anything... What does it *do* on your site? Does it fetch the images along with your pages, or does it fetch the pages first and come back for images and other resources later -- or never? The first pattern is browser behaviour; the second is 'bot behaviour.

As the Apache Jakarta Project page implies, you could build either a browser or a robot around this library, so it's probably best to judge it by what it does. I'm "Mr. 403" these days on unknown agents, but let the behaviour pattern make the call.
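One rough way to check the behaviour described above is to count, per client address, how many page requests and how many image/stylesheet/script requests a given user-agent makes. The sketch below is only that, a sketch: it assumes the log is in Apache's combined format (as in the sample line earlier in the thread), and the `ASSET_EXTENSIONS` list and the function names are made up for illustration.

```python
import re
from collections import defaultdict

# Combined Log Format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Assumption: these extensions count as "page resources" on your site.
ASSET_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it doesn't parse."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def summarize_agent(lines, agent_substring):
    """Count page vs. asset requests per host for one user-agent substring."""
    summary = defaultdict(lambda: {"pages": 0, "assets": 0})
    for line in lines:
        entry = parse_line(line)
        if entry is None or agent_substring not in entry["agent"]:
            continue
        # Ignore any query string when looking at the file extension.
        path = entry["path"].split("?", 1)[0].lower()
        kind = "assets" if path.endswith(ASSET_EXTENSIONS) else "pages"
        summary[entry["host"]][kind] += 1
    return dict(summary)
```

Feeding it the lines of your access log with `"Jakarta HTTP Client"` as the substring gives one entry per address. A client that shows many page hits and zero asset hits is at least consistent with the 'bot pattern, though a text-only browser or a cache could look the same.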

Jim

keyplyr

4:05 am on Jul 23, 2004 (gmt 0)




Well, that's just it: for several days it has just been using GET to request my index page 4 or 5 times in a row... then it hits again the next day, same thing.

But what else is new? I've got a disallowed Polish IP that has been getting 403'd daily for the same file for over a year!

dcrombie

4:35 am on Aug 3, 2004 (gmt 0)



It's a Java software component that anyone can use for anything, so the same user-agent string can turn up from unrelated programs. If this one is doing anything suspicious, block it by IP address rather than by user-agent.
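To decide which address to block, it can help to see which hosts send this user-agent on more than one day. Here is a minimal sketch, again assuming Apache combined-format logs; the function names and the `min_days` threshold are illustrative, not anything from the thread.

```python
import re
from collections import Counter, defaultdict

# Only the fields needed here: client host, timestamp, and user-agent.
ENTRY = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)


def hits_per_host_per_day(lines, agent_substring):
    """Count requests keyed by (host, day) for one user-agent substring."""
    counts = Counter()
    for line in lines:
        match = ENTRY.match(line.strip())
        if not match or agent_substring not in match["agent"]:
            continue
        # "22/Jul/2004:17:28:38 -0700" -> "22/Jul/2004"
        day = match["time"].split(":", 1)[0]
        counts[(match["host"], day)] += 1
    return counts


def repeat_hosts(counts, min_days=2):
    """Return hosts seen on at least min_days distinct days, sorted."""
    days_seen = defaultdict(set)
    for host, day in counts:
        days_seen[host].add(day)
    return sorted(h for h, days in days_seen.items() if len(days) >= min_days)
```

The addresses returned by `repeat_hosts` are candidates for a deny rule in the server configuration. Blocking on the address leaves other, possibly legitimate, users of the same library unaffected.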

keyplyr

8:28 am on Aug 5, 2004 (gmt 0)




Thanks dcrombie

Blocking for a week didn't seem to discourage them. Maybe I'll just let 'em get what they want ;)