Butterfly

Pfui

11:27 pm on Aug 19, 2011 (gmt 0)


I don't know which is worse: a bot that never asks for robots.txt, or one that asks for it and then ignores its full Disallow. Here's yet another of the latter, one that's been abusing my sites for at least a couple of years:

-----
74.112.131.131
Mozilla/5.0 (compatible; Butterfly/1.0; +http://labs.topsy.com/butterfly/) Gecko/2009032608 Firefox/3.0.8

08/19 nn:09:22 /robots.txt

-----
74.112.131.133
Mozilla/5.0 (compatible; Butterfly/1.0; +http://labs.topsy.com/butterfly/) Gecko/2009032608 Firefox/3.0.8

08/19 n1:16:07 /robots.txt
08/19 n1:16:07 /dir/filename.html
08/19 n2:18:56 /robots.txt
08/19 n2:08:33 /dir/filename.html
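
For anyone who wants to flag this pattern in their own logs, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt contents are an assumption (the post only says it carries a full Disallow), and the function name violates_robots is hypothetical. It treats /robots.txt itself as always fetchable and reports any other request that the rules forbid.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: the post only says it contains a full Disallow.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

# User-agent string exactly as it appears in the log excerpt.
BUTTERFLY_UA = (
    "Mozilla/5.0 (compatible; Butterfly/1.0; "
    "+http://labs.topsy.com/butterfly/) Gecko/2009032608 Firefox/3.0.8"
)


def violates_robots(user_agent: str, path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if requesting `path` breaks the given robots.txt rules."""
    # Fetching robots.txt itself is never a violation.
    if path == "/robots.txt":
        return False
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/robots.txt", "/dir/filename.html"):
        print(path, "violation" if violates_robots(BUTTERFLY_UA, path) else "ok")
```

Run against the requests logged above, the /robots.txt fetches pass and the /dir/filename.html fetches are flagged, which is the "asks and then ignores" behaviour described.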

Previously (May, 2009):

Topsy.com / labs.topsy.com / butterfly.topsy.com
Butterfly/1.0 *and* libwww-perl
[webmasterworld.com...]

dstiles

7:30 pm on Aug 20, 2011 (gmt 0)


One that does not ask for robots.txt is better - it saves you a bit of bandwidth. :)

I banned Topsy about 18 months ago and labelled it a "twitter thingy"; I no longer recall exactly what that meant, but presumably a parasite.
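
The thread doesn't say how the ban was implemented, so the following is only a rough sketch of one way to do it in application code. The names should_block, BLOCKED_UA_TOKENS and BLOCKED_IPS are hypothetical; the user-agent tokens and the two IP addresses are taken from the log excerpt in the first post, and the bot may well use other addresses.

```python
# User-agent substrings taken from the logged Butterfly UA (matched case-insensitively).
BLOCKED_UA_TOKENS = ("butterfly/", "topsy.com")

# Only the two addresses seen in the log; any wider range would be a guess.
BLOCKED_IPS = {"74.112.131.131", "74.112.131.133"}


def should_block(remote_ip: str, user_agent: str) -> bool:
    """Return True if the request matches the ban list by UA token or exact IP."""
    ua = user_agent.lower()
    if any(token in ua for token in BLOCKED_UA_TOKENS):
        return True
    return remote_ip in BLOCKED_IPS


if __name__ == "__main__":
    ua = ("Mozilla/5.0 (compatible; Butterfly/1.0; "
          "+http://labs.topsy.com/butterfly/) Gecko/2009032608 Firefox/3.0.8")
    print(should_block("74.112.131.133", ua))
```

Matching on both the UA token and the IP covers the case mentioned in the earlier thread where the same operator also showed up as libwww-perl, at least for the addresses already known.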