Today, I got a visit from Cowbot-0.1, which fully identified itself as coming from NHN Corp. According to ZinguGuy here [webmasterworld.com], this company runs the most popular search engine in Korea. The bot checked robots.txt first, then indexed 60 pages at a reasonable rate of one every 6-8 seconds, and never went into disallowed areas.
218.145.25.45 - - [29/Oct/2003:19:40:00 -0600] "GET /robots.txt HTTP/1.0" 200 1189 "-" "Cowbot-0.1 (NHN Corp. / 2-3011-1954 / nhnbot@naver.com)"
People mentioned before that the other UAs probably came from this same company. My guess is that they were beta versions being tested out, and the company is now using its production model. Microsoft's beta version bot hasn't been too swift or well identified either. Or perhaps the others got banned by so many sites that NHN is now giving contact info so people can report the bot if it misbehaves. Obviously, each person has to decide for himself whether to give this bot another chance. For me, I'll continue to allow it as long as it behaves.
('naver' got thru my trap somehow.)
This one...
220.73.165.78 - - [03/Nov/2003:14:44:39 -0800] "GET /robots.txt HTTP/1.0" 200 1524 "-" "Cowbot-0.1 (NHN Corp. / +82-2-3011-1954 / nhnbot@naver.com)"
220.73.165.78 - - [03/Nov/2003:14:44:39 -0800] "GET / HTTP/1.0" 403 480 "-" "Cowbot-0.1 (NHN Corp. / +82-2-3011-1954 / nhnbot@naver.com)"
...brought this one to the site and seems to have waited for him/her to be successful before leaving. Check out the timestamps.
61.78.61.166 - - [03/Nov/2003:14:44:41 -0800] "GET /OLD-Blahblah.html HTTP/1.1" 404 2847 "-" "Cowbot-0.1 (NHN Corp. / +82-2-3011-1954 / nhnbot@naver.com)"
61.78.61.166 - - [03/Nov/2003:14:44:43 -0800] "GET /Blahblah.html HTTP/1.1" 200 13835 "-" "Cowbot-0.1 (NHN Corp. / +82-2-3011-1954 / nhnbot@naver.com)"
And then 'Grasshopper' went thru my site a bit on the fast side.
Nothing like having your site used as a teaching aid, eh!?!
<chuckle/sigh>
Pendanticist.
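For anyone who wants to check the timing in log lines like the ones above without eyeballing them, here is a rough sketch. It assumes the standard Apache combined log format; the names `LOG_PATTERN`, `parse_line` and `gaps_by_agent` are made up for illustration, not from any tool mentioned in this thread.

```python
import re
from datetime import datetime

# Assumed layout: Apache "combined" log format, as in the lines quoted above.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Month abbreviations are parsed per the current locale (English assumed).
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    return entry


def gaps_by_agent(lines):
    """Map each user agent to (ip, path, seconds since that agent's previous hit)."""
    last_seen = {}
    report = {}
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # skip lines that are not in the assumed format
        agent = entry["agent"]
        previous = last_seen.get(agent)
        gap = None if previous is None else (entry["time"] - previous).total_seconds()
        report.setdefault(agent, []).append((entry["ip"], entry["path"], gap))
        last_seen[agent] = entry["time"]
    return report
```

Feeding it the lines of an access log (for example, `gaps_by_agent(open("access_log"))`, where the file name is just a placeholder) groups hits by UA, so a second IP showing up under the same UA a couple of seconds later stands out.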
Are you missing any ORs in your list? I did that once, and it took a while to figure out what the problem was. Since then, I've written a script that sends fake UAs and referers so I can test my .htaccess after making any changes and before uploading it to a production site.
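This isn't the actual script, just a minimal sketch of the idea in Python using only the standard library. The entries in `CASES` (UA strings, referers, and the status each one should get) are made-up placeholders, as are the function names; swap in whatever your own rules are supposed to allow or block.

```python
import urllib.error
import urllib.request

# Hypothetical test cases: (user agent, referer, expected HTTP status).
CASES = [
    ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)", "", 200),
    ("BadBot/1.0", "", 403),
    ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)", "http://spam.example/", 403),
]


def build_request(url, user_agent, referer=""):
    """Build a GET request carrying a spoofed User-Agent and optional Referer."""
    headers = {"User-Agent": user_agent}
    if referer:
        headers["Referer"] = referer
    return urllib.request.Request(url, headers=headers)


def fetch_status(request, timeout=10):
    """Return the HTTP status code, treating 4xx/5xx as results, not errors."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def run_checks(base_url, cases=CASES):
    """Request base_url once per case; return (user_agent, expected, actual) mismatches."""
    failures = []
    for user_agent, referer, expected in cases:
        actual = fetch_status(build_request(base_url, user_agent, referer))
        if actual != expected:
            failures.append((user_agent, expected, actual))
    return failures
```

Point `run_checks` at a test copy of the site (say, `run_checks("http://localhost:8080/")`, which is only an example address) after each .htaccess change. An empty list means every case got the status you expected. Note that `urlopen` follows redirects, so a rule that redirects instead of returning 403 will report the status of the final page.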