I don't know which is worse: a bot that never asks for robots.txt, or one that asks for it and then ignores a full Disallow. Here's yet another of the latter, one that's been abusing my sites for at least a couple of years:
-----
74.112.131.131
Mozilla/5.0 (compatible; Butterfly/1.0; +http://labs.topsy.com/butterfly/) Gecko/2009032608 Firefox/3.0.8
08/19 nn:09:22 /robots.txt
-----
74.112.131.133
Mozilla/5.0 (compatible; Butterfly/1.0; +http://labs.topsy.com/butterfly/) Gecko/2009032608 Firefox/3.0.8
08/19 n1:16:07 /robots.txt
08/19 n1:16:07 /dir/filename.html
08/19 n2:18:56 /robots.txt
08/19 n2:08:33 /dir/filename.html
Previously (May, 2009):
Topsy.com / labs.topsy.com / butterfly.topsy.com
Butterfly/1.0 *and* libwww-perl
[webmasterworld.com...]
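
For anyone who wants to spot this pattern in their own logs, here's a rough Python sketch of the check I'm describing: flag any IP that fetched robots.txt and then went on to request a path that robots.txt disallows. It assumes a full-Disallow robots.txt and assumes you've already parsed your log into (ip, user_agent, path) tuples; the function name `find_violations` and the sample records are made up for illustration, not taken from any real tool.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: a full Disallow for every agent.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def find_violations(requests, robots_txt=ROBOTS_TXT):
    """Return (ip, path) pairs requested after that IP fetched robots.txt,
    where robots.txt disallows the path for the request's user agent.

    `requests` is an iterable of (ip, user_agent, path) in log order.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    saw_robots = set()   # IPs that have already asked for robots.txt
    violations = []
    for ip, user_agent, path in requests:
        if path == "/robots.txt":
            saw_robots.add(ip)
        elif ip in saw_robots and not parser.can_fetch(user_agent, path):
            violations.append((ip, path))
    return violations


# Illustrative records modeled on the excerpt above (timestamps omitted).
UA = ("Mozilla/5.0 (compatible; Butterfly/1.0; "
      "+http://labs.topsy.com/butterfly/) Gecko/2009032608 Firefox/3.0.8")
sample = [
    ("74.112.131.131", UA, "/robots.txt"),
    ("74.112.131.133", UA, "/robots.txt"),
    ("74.112.131.133", UA, "/dir/filename.html"),
]

print(find_violations(sample))
```

This tracks by IP only, so a bot that reads robots.txt from one address and crawls from another (as the two addresses above suggest can happen) would need grouping by user agent or by address range instead.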