Checkbot/1.71

New bot? I couldn't find it in WW site search

         

berli

2:20 am on Jun 13, 2003 (gmt 0)

10+ Year Member



066-241-084-164.bus.ashlandfiber.net - - [12/Jun/2003:22:05:51 -0400] "GET /foo/ HTTP/1.1" 200 14078 "-" "Checkbot/1.71 LWP/5.64"
066-241-084-164.bus.ashlandfiber.net - - [12/Jun/2003:22:05:52 -0400] "GET /foo/ HTTP/1.0" 200 14042 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0; yie6)"
066-241-084-164.bus.ashlandfiber.net - - [12/Jun/2003:22:11:41 -0400] "GET /foo/ HTTP/1.1" 200 15638 "-" "Checkbot/1.71 LWP/5.64"

I found the above in my logfile. What is Checkbot, and why is it (apparently) used with a human u_a?
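A side note on the hostname in those lines: its first label looks like the client's IPv4 address written with zero-padded octets. That is an assumption about this ISP's reverse-DNS naming, not something the log states. Under that assumption, a minimal Python sketch (the function name `ip_from_hostname` is made up for illustration) recovers the address so it can be compared with numeric IPs from other logs:

```python
import ipaddress
import re


def ip_from_hostname(hostname):
    """Return the IPv4 address encoded in a leading 'ddd-ddd-ddd-ddd' label, or None.

    Assumes the reverse-DNS name starts with the four octets, zero-padded
    and separated by hyphens, as in the log lines above.
    """
    m = re.match(r"^(\d{1,3})-(\d{1,3})-(\d{1,3})-(\d{1,3})\.", hostname)
    if m is None:
        return None
    octets = [int(part) for part in m.groups()]  # int() drops the zero padding
    if any(o > 255 for o in octets):
        return None
    return ipaddress.IPv4Address(".".join(str(o) for o in octets))


addr = ip_from_hostname("066-241-084-164.bus.ashlandfiber.net")
print(addr)
```

If the assumption holds, the result can be tested against a network range with `addr in ipaddress.ip_network("66.241.84.0/24")`.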

wilderness

3:14 am on Jun 13, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



berli,
I didn't search for Checkbot.
LWP is an old one that many here have denied.
It used to show up as LWP Trivial or some such thing.

jdMorgan

4:37 am on Jun 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



berli,
This one showed up here today, in two sessions.

First, it came in on .168 using HTTP/1.1, and got blocked by user-agent, since this was on my "watch list" as a result of your post:

66.241.84.168 - - [14/Jun/2003:06:21:34 -0400] "GET / HTTP/1.1" 403 775 "-" "Checkbot/1.71 LWP/5.64"

Then it switched to "Mozilla/4.0" and HTTP/1.0:
66.241.84.168 - - [14/Jun/2003:06:21:35 -0400] "GET / HTTP/1.0" 200 9216 "-" "Mozilla/4.0"
66.241.84.168 - - [14/Jun/2003:06:21:36 -0400] "GET / HTTP/1.0" 301 226 "-" "Mozilla/4.0"
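
The pattern in these lines, one client address presenting different user-agents within seconds, can be picked out of a log automatically. Below is a rough Python sketch, not anyone's production tool: it assumes the Apache combined log format seen in this thread, and the names `parse_line` and `morphing_hosts` and the 60-second window are arbitrary choices made for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Combined log format: host ident user [time] "request" status size "referer" "agent"
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse_line(line):
    """Return (host, timestamp, user_agent) for one log line, or None if it does not match."""
    m = LOG_RE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return m.group("host"), ts, m.group("agent")


def morphing_hosts(lines, window=timedelta(seconds=60)):
    """Map each host that sent two or more distinct user-agents within `window` to those agents."""
    hits = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            host, ts, agent = parsed
            hits[host].append((ts, agent))

    flagged = {}
    for host, entries in hits.items():
        entries.sort(key=lambda e: e[0])
        for i, (t0, _) in enumerate(entries):
            # Collect every agent seen from this host within `window` after t0.
            agents = {a for t, a in entries[i:] if t - t0 <= window}
            if len(agents) > 1:
                flagged[host] = sorted(agents)
                break
    return flagged


SAMPLE = [
    '66.241.84.168 - - [14/Jun/2003:06:21:34 -0400] "GET / HTTP/1.1" 403 775 "-" "Checkbot/1.71 LWP/5.64"',
    '66.241.84.168 - - [14/Jun/2003:06:21:35 -0400] "GET / HTTP/1.0" 200 9216 "-" "Mozilla/4.0"',
    '66.241.84.168 - - [14/Jun/2003:06:21:36 -0400] "GET / HTTP/1.0" 301 226 "-" "Mozilla/4.0"',
]
print(morphing_hosts(SAMPLE))
```

A hit from this check is only a hint. Several people behind one proxy or NAT address would also show multiple user-agents, so a short window and a manual look at the requests are still needed.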

Later, it came back on a different IP address and tried the same sequence, again changing to a different (genuine, this time) user-agent on the retry, but got blocked both times because my .htaccess had been updated as a result of the first visit. So it then fetched my 403 explanation page twice, which is allowed to all agents, good or bad. (My main 403 page is very short, containing only an "Access Forbidden - click here for more info" text link and a meta-refresh after a few seconds. This keeps "junk" bandwidth down, but real people who may have been blocked unintentionally can click through for more info. It looks like this 'bot followed both the text link and the meta-refresh.)

66.241.84.164 - - [14/Jun/2003:22:23:20 -0400] "GET / HTTP/1.1" 403 775 "-" "Checkbot/1.71 LWP/5.64"
66.241.84.164 - - [14/Jun/2003:22:23:26 -0400] "GET / HTTP/1.0" 403 756 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0; yie6)"
66.241.84.164 - - [14/Jun/2003:22:23:26 -0400] "GET / HTTP/1.0" 403 234 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0; yie6)"
66.241.84.164 - - [14/Jun/2003:22:23:27 -0400] "GET /403explain.html HTTP/1.0" 200 3713 "http://www.example.org" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0; yie6)"
66.241.84.164 - - [14/Jun/2003:22:23:27 -0400] "GET /403explain.html HTTP/1.0" 301 235 "http://www.example.org" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0; yie6)"
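
For readers who want to picture the blocking logic behind these responses, here is a minimal Python sketch. It is not the actual .htaccess, which is not shown in this thread. The watch-list entry, the blocked address prefix, and the 5-second refresh delay are all assumptions made for illustration; only the `/403explain.html` path and the link text come from the post and log lines above.

```python
from html import escape

# Assumptions for illustration only; the real rules are not shown in the thread.
BLOCKED_AGENT_SUBSTRINGS = ("checkbot",)   # hypothetical user-agent watch list
BLOCKED_IP_PREFIXES = ("66.241.84.",)      # hypothetical address block added after the first visit
EXPLAIN_PATH = "/403explain.html"          # explanation page, open to all agents


def status_for(path, user_agent, client_ip):
    """Return the HTTP status this sketch would give a request."""
    if path == EXPLAIN_PATH:
        return 200  # the explanation page is never blocked
    if client_ip.startswith(BLOCKED_IP_PREFIXES):
        return 403
    if any(s in user_agent.lower() for s in BLOCKED_AGENT_SUBSTRINGS):
        return 403
    return 200


def short_403_page(delay_seconds=5):
    """Build a very small 403 body: one text link plus a meta-refresh to the explanation page."""
    url = escape(EXPLAIN_PATH, quote=True)
    return (
        "<html><head>"
        f'<meta http-equiv="refresh" content="{delay_seconds}; url={url}">'
        "<title>403 Forbidden</title></head><body>"
        f'<a href="{url}">Access Forbidden - click here for more info</a>'
        "</body></html>"
    )


print(status_for("/", "Checkbot/1.71 LWP/5.64", "66.241.84.164"))
print(len(short_403_page()))
```

Keeping the 403 body this small is the bandwidth point made above: a blocked robot costs a few hundred bytes, while a person who was blocked by mistake still has a way through to the longer explanation.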

I still don't know who this pest 'bot is, but it morphs too much, and so is not welcome until I find out.

Jim

HandwovenRug

5:01 am on Jun 15, 2003 (gmt 0)

10+ Year Member



It's Hans de Graaff's Checkbot, a link validator.
[degraaff.org...]

Key_Master

5:16 am on Jun 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



WebRing robot; my guess is an email harvester. It masquerades under many User-agents:

http*://66.241.84.164/rw

jdMorgan

5:20 am on Jun 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hmmm...

WebRing used to use Jonzilla, IIRC.

Jim

berli

11:19 pm on Jun 16, 2003 (gmt 0)

10+ Year Member



I would say Webring probably has something to do with it. I got a hit from Webring just about the same time, except that in that case it was identified as such.

Why would Webring need to check a page FOUR times (within seconds of each other) with different IPs and u_a's?

Seems fishy to me.

wkitty42

12:35 am on Jun 17, 2003 (gmt 0)

10+ Year Member



key_master,

i'm on a webring or two and checkbot hit my site a few times last month and so far, only twice this month... i suspect that they are trying to validate valid webrings and webring sites... as long as they don't expect me to run their most up to date code, i have no problems with them... my webring links operate just fine without their advertising... and as an advertising free site, that's the way i like it... time to go snooping and make sure that my webring stuff is still operational OB-)

Key_Master

1:37 am on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I wouldn't touch a web ring with a ten foot pole, which makes me wonder why it hits sites like mine that aren't web ring related, not to mention why it keeps changing User-agents and IPs.