A visitor identifying itself as "KKman2.0" comes to my site every couple of days and uses different IPs to repeatedly hit my free download links. Those links are disallowed in my robots.txt file, but it does not seem to check robots.txt. It also does not hit all the free download links; it seems to focus on just one. On each visit it hits the same link 10-20 times, with the hits spread out by 40 seconds or so. Then it comes back a few days later and hits the same link another 10-20 times. This goes on and on, and it is definitely not human behavior. I suspect the changing IPs and 40-second delays are meant to disguise malicious intent.
In my log files, these requests are always uniquely identifiable by "KKman2.0" in the entry; see the example log entries below. I manually changed my site information to 'BlahBlah' in the logs for posting purposes. I am not very savvy at reading log files, so a general explanation of how to interpret them would be helpful as well.
Does anyone have any thoughts on what is happening here? Should I ban this entity? Should I do it by IP or User Agent? Would KKman2.0 be considered the UA? Any help is greatly appreciated and thank you in advance.
219.252.44.** - - [16/Jan/2010:15:58:13 -0800] "GET BlahBlah HTTP/1.0" 200 389030 "BlahBlah" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; KKman2.0)"
67.202.42.*** - - [16/Jan/2010:16:01:06 -0800] "GET BlahBlah HTTP/1.0" 200 466247 "BlahBlah" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; KKman2.0)"
212.233.221.** - - [16/Jan/2010:16:02:23 -0800] "GET BlahBlah HTTP/1.0" 200 334368 "BlahBlah" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; KKman2.0)"
190.176.159.*** - - [16/Jan/2010:16:02:43 -0800] "GET BlahBlah HTTP/1.1" 200 245199 "BlahBlah" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; KKman2.0)"
69.114.133.** - - [16/Jan/2010:16:03:40 -0800] "GET BlahBlah HTTP/1.0" 200 466247 "BlahBlah" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; KKman2.0)"
190.176.159.*** - - [16/Jan/2010:16:05:23 -0800] "GET BlahBlah HTTP/1.0" 200 466247 "BlahBlah" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; KKman2.0)"
62.60.136.** - - [16/Jan/2010:16:05:41 -0800] "GET BlahBlah HTTP/1.0" 200 456980 "BlahBlah" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; KKman2.0)"
212.233.221.** - - [16/Jan/2010:16:05:43 -0800] "GET BlahBlah HTTP/1.0" 200 206212 "BlahBlah" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; KKman2.0)"
[edited by: jdMorgan at 5:08 am (utc) on Jan. 19, 2010]
[edit reason] Obscured specific IP addresses. [/edit]
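On the request for help reading the log: the entries above appear to follow Apache's "combined" log format, where each line holds the client address, two identity fields (usually "-"), the timestamp, the quoted request line, the status code, the bytes sent, the quoted referer, and finally the quoted user-agent. So the last quoted string, the one ending in KKman2.0, is the UA. Below is a minimal Python sketch for pulling those fields apart and counting hits per address, assuming that format. The names `parse_line` and `hits_by_host` are made up for illustration, and the sample lines are copied from the post.

```python
import re
from collections import Counter

# Assumed layout (Apache "combined" format):
# host ident authuser [timestamp] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of the log fields, or None if the line does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


def hits_by_host(lines, agent_token="KKman"):
    """Count requests per client address whose user-agent contains agent_token."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and agent_token in entry["agent"]:
            counts[entry["host"]] += 1
    return counts


# Sample entries taken from the post (addresses already obscured).
sample = [
    '219.252.44.** - - [16/Jan/2010:15:58:13 -0800] "GET BlahBlah HTTP/1.0" '
    '200 389030 "BlahBlah" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; KKman2.0)"',
    '212.233.221.** - - [16/Jan/2010:16:02:23 -0800] "GET BlahBlah HTTP/1.0" '
    '200 334368 "BlahBlah" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; KKman2.0)"',
    '212.233.221.** - - [16/Jan/2010:16:05:43 -0800] "GET BlahBlah HTTP/1.0" '
    '200 206212 "BlahBlah" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; KKman2.0)"',
]

if __name__ == "__main__":
    for host, count in hits_by_host(sample).items():
        print(host, count)
```

Run against a full log file (for example `hits_by_host(open("access.log"))`, with the path being whatever your host uses), this gives a quick picture of how many addresses are involved and how often each one returns.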
I tend to shoot first and ask questions later, so I'd block that user-agent "KKman/<anything>" and then watch its reaction. If it can't handle a 403 and keeps coming back, then consider rewriting any request from that UA to a zero-byte file instead of trying to block it. Then ask your host if their server firewalls can be set to block user-agents (I'd suspect that the majority cannot, since it requires actually examining the request headers and takes a lot more firewall 'effort' than firewalling an IP address or address range).
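To make the two options above concrete (answer with a 403, or quietly hand back zero bytes), here is a rough Python sketch written as WSGI middleware. This is an assumption for illustration only: it applies if the site is served by a Python WSGI application, which the thread does not say, and on a typical Apache host the same idea would normally be expressed in server configuration instead. The names `block_bad_agents` and `BLOCKED_AGENTS` are hypothetical.

```python
import re

# Hypothetical pattern: any user-agent containing "KKman", whatever follows it.
BLOCKED_AGENTS = re.compile(r"KKman", re.IGNORECASE)


def block_bad_agents(app, serve_empty=False):
    """Wrap a WSGI app so that matching user-agents never reach it.

    serve_empty=False -> respond with 403 Forbidden.
    serve_empty=True  -> respond with 200 and a zero-byte body instead.
    """
    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "")
        if BLOCKED_AGENTS.search(agent):
            status = "200 OK" if serve_empty else "403 Forbidden"
            headers = [("Content-Type", "text/plain"), ("Content-Length", "0")]
            start_response(status, headers)
            return [b""]
        # Everyone else is passed through to the real application untouched.
        return app(environ, start_response)

    return middleware
```

Either way, the response is decided from the User-Agent header alone, so it keeps working when the visitor changes IP addresses but stops working as soon as it changes its UA string.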
Discussion of bad-bots and user-agents is quite lively over in our "Search Engine Spider and User Agent Identification" forum, and you might wish to ask for opinions on this user-agent over there. Having decided whether you want to block it, the discussion of "how" is appropriate to this forum.
Jim
However, based on your description above, it's likely the whole string is a spoof.
Jim