Useragent: Java/1.5.0_05

See this often in my logfile, what is it?

         

LunaC

5:20 pm on Nov 16, 2005 (gmt 0)

10+ Year Member



I've been getting hit with this many times a day, and its behavior worries me. Is it a scraper? Is it anything I should worry about?

It arrives each time with a different IP, grabs all pages on my site, and never looks at robots.txt, images or .css. It never shows a referrer either.

Then later another one arrives from a different IP and it starts all over again.

So, what is it? Should I block this thing with .htaccess or am I just getting paranoid?

nancyb

7:00 pm on Nov 16, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This search [google.com] on WebmasterWorld lists several posts that might be of interest.

This one [webmasterworld.com] and this one [webmasterworld.com] from that list look most interesting.

jdMorgan

9:06 pm on Nov 16, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> It arrives each time with a different IP, grabs all pages on my site, and never looks at robots.txt, images or .css. Never does it show a referrer either.

So it's a scraper.

Actually, "Java/1.5.0_05" is the default user-agent sent by Java's built-in HTTP library, a set of functions that can be used for any purpose. By your description, though, it's being used as a site downloader, a.k.a. a scraper. Your site is not likely to suffer if you block it.

Jim
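
For anyone wanting to try the .htaccess route, here is a minimal sketch of one way to do it. It assumes an Apache server with mod_rewrite enabled and .htaccess overrides allowed, neither of which is confirmed in this thread. It refuses every request whose user-agent starts with "Java/", so it would also turn away any legitimate Java-based client that keeps the default user-agent.

```apache
# Assumption: Apache with mod_rewrite available and AllowOverride permitting it.
RewriteEngine On

# Match any user-agent that begins with "Java/" (e.g. Java/1.5.0_05).
# [NC] makes the comparison case-insensitive.
RewriteCond %{HTTP_USER_AGENT} ^Java/ [NC]

# Serve nothing and answer 403 Forbidden for every matching request.
RewriteRule .* - [F]
```

Because the match is on the user-agent rather than the IP address, it should keep working when the visitor comes back from a new IP, as long as it continues to send the same default user-agent string.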

LunaC

2:06 am on Nov 17, 2005 (gmt 0)

10+ Year Member



Thank you again.