I found the above in my logfile. What is Checkbot, and why is it (apparently) used with a human-looking user-agent?
First, it came in on .168 using HTTP/1.1 and got blocked by user-agent, since Checkbot was on my "watch list" as a result of your post:
66.241.84.168 - - [14/Jun/2003:06:21:34 -0400] "GET / HTTP/1.1" 403 775 "-" "Checkbot/1.71 LWP/5.64"
Then it switched to "Mozilla/4.0", and switched to HTTP/1.0:
66.241.84.168 - - [14/Jun/2003:06:21:35 -0400] "GET / HTTP/1.0" 200 9216 "-" "Mozilla/4.0"
66.241.84.168 - - [14/Jun/2003:06:21:36 -0400] "GET / HTTP/1.0" 301 226 "-" "Mozilla/4.0"
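For anyone who wants to spot this kind of user-agent switching without reading the log by eye, here is a minimal sketch. It assumes the log is in Apache combined format, as in the lines above; the names `LOG_RE`, `agents_by_ip`, and `morphing_ips` are made up for this example and are not part of any tool mentioned in the thread.

```python
import re
from collections import defaultdict

# Combined Log Format:
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def agents_by_ip(lines):
    """Map each client IP to the user-agents it sent, in order of first appearance."""
    seen = defaultdict(list)
    for line in lines:
        m = LOG_RE.match(line.strip())
        if m is None:
            continue  # skip lines that are not in combined log format
        if m["agent"] not in seen[m["ip"]]:
            seen[m["ip"]].append(m["agent"])
    return dict(seen)

def morphing_ips(lines):
    """Return only the IPs that sent more than one distinct user-agent."""
    return {ip: agents for ip, agents in agents_by_ip(lines).items() if len(agents) > 1}

sample = [
    '66.241.84.168 - - [14/Jun/2003:06:21:34 -0400] "GET / HTTP/1.1" 403 775 "-" "Checkbot/1.71 LWP/5.64"',
    '66.241.84.168 - - [14/Jun/2003:06:21:35 -0400] "GET / HTTP/1.0" 200 9216 "-" "Mozilla/4.0"',
    '66.241.84.168 - - [14/Jun/2003:06:21:36 -0400] "GET / HTTP/1.0" 301 226 "-" "Mozilla/4.0"',
]
print(morphing_ips(sample))
# {'66.241.84.168': ['Checkbot/1.71 LWP/5.64', 'Mozilla/4.0']}
```

This only compares user-agents per exact IP address, so a 'bot that also hops addresses between requests would need grouping by address prefix instead.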
Later, it came back on a different IP address and tried the same sequence, again changing to a different (genuine, this time) user-agent on the retry, but it got blocked both times because my .htaccess had been updated as a result of the first visit. It then fetched my 403 explanation page twice, which is allowed to all agents, good or bad. (My main 403 page is very short, containing only an "Access Forbidden - click here for more info" text link and a meta-refresh after a few seconds. This keeps "junk" bandwidth down, but real people who may have been blocked unintentionally can click through for more info. It looks like this 'bot followed both the text link and the meta-refresh.)
66.241.84.164 - - [14/Jun/2003:22:23:20 -0400] "GET / HTTP/1.1" 403 775 "-" "Checkbot/1.71 LWP/5.64"
66.241.84.164 - - [14/Jun/2003:22:23:26 -0400] "GET / HTTP/1.0" 403 756 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0; yie6)"
66.241.84.164 - - [14/Jun/2003:22:23:26 -0400] "GET / HTTP/1.0" 403 234 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0; yie6)"
66.241.84.164 - - [14/Jun/2003:22:23:27 -0400] "GET /403explain.html HTTP/1.0" 200 3713 "http://www.example.org" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0; yie6)"
66.241.84.164 - - [14/Jun/2003:22:23:27 -0400] "GET /403explain.html HTTP/1.0" 301 235 "http://www.example.org" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0; yie6)"
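The access policy described above can be sketched roughly as follows. This is not the actual .htaccess, and the post does not say which rule was added after the first visit; the sketch assumes it was a block on the 66.241.84. address prefix, since the second visit was refused even with a genuine browser user-agent. All names and values here (`WATCHED_AGENT_PREFIXES`, `BLOCKED_IP_PREFIXES`, `ALWAYS_ALLOWED_PATHS`, `status_for`) are hypothetical.

```python
# Hypothetical values: they mirror the post, not the author's real .htaccess.
WATCHED_AGENT_PREFIXES = ("Checkbot",)      # user-agents on the "watch list"
BLOCKED_IP_PREFIXES = ("66.241.84.",)       # assumed rule added after the first visit
ALWAYS_ALLOWED_PATHS = ("/403explain.html",)

def status_for(ip, path, agent):
    """Return 200 if the request would be served, 403 if it would be refused."""
    if path in ALWAYS_ALLOWED_PATHS:
        return 200  # the explanation page is open to every agent, good or bad
    if agent.startswith(WATCHED_AGENT_PREFIXES):
        return 403  # blocked by user-agent
    if ip.startswith(BLOCKED_IP_PREFIXES):
        return 403  # blocked by address, whatever user-agent is claimed
    return 200

msie = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0; yie6)"
print(status_for("66.241.84.164", "/", "Checkbot/1.71 LWP/5.64"))  # 403
print(status_for("66.241.84.164", "/", msie))                      # 403
print(status_for("66.241.84.164", "/403explain.html", msie))       # 200
```

Checking the always-allowed path first is what lets a blocked visitor still reach the explanation page, which matches the 200 response for /403explain.html in the log.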
I still don't know who this pest 'bot is, but it morphs too much, and so is not welcome until I find out.
Jim
i'm on a webring or two and checkbot hit my site a few times last month and so far, only twice this month... i suspect that they are trying to validate valid webrings and webring sites... as long as they don't expect me to run their most up to date code, i have no problems with them... my webring links operate just fine without their advertising... and as an advertising free site, that's the way i like it... time to go snooping and make sure that my webring stuff is still operational OB-)