I prefer to allow most bots on my sites. It may cost me some bandwidth, but I would rather spend my time making money than limiting access to bots. When I do want to block a bot, I use robots.txt if it is a well-behaved bot and .htaccess if it is a bad one.
Please remember that robots.txt is a voluntary protocol. It will not stop bots that are broken or deliberately programmed to ignore it. If you want to protect data, you should use .htaccess or a similar server-side alternative.
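For anyone who wants a starting point for the .htaccess route, here is a rough sketch. It assumes an Apache server with mod_rewrite enabled, and "BadBot" is only a placeholder name, not a real crawler. Keep in mind that a bot can fake its User-Agent string, so this only catches bots that identify themselves honestly.

```apache
# .htaccess sketch: refuse any request whose User-Agent contains "BadBot".
# "BadBot" is a hypothetical placeholder; substitute the string you see in your logs.
<IfModule mod_rewrite.c>
    RewriteEngine On
    # [NC] makes the match case-insensitive
    RewriteCond %{HTTP_USER_AGENT} BadBot [NC]
    # "-" means no rewrite; [F] answers with 403 Forbidden
    RewriteRule ^ - [F]
</IfModule>
```

Unlike a Disallow line in robots.txt, this is enforced by the server itself, so the bot gets a 403 whether or not it chooses to cooperate.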
I was just checking into haidu, or is it haldu (must change fonts on my browser), and gigabot. Google led me here. I got the registration page and figured I should at least identify myself as a friendly real human.
I might fire up my website again. I had no success with it for a year, and it was a whole lot of work deleting messages posted to my blog, all of them from a .info yadda yadda, you prob'ly know the rest.
The same ones that invaded Yahoo Groups for the past couple of years.
Of course this is one of my areas of interest, so I'll probably be checking back. Unless my new site keeps me too busy making cash, that is. Thanks for putting up with me.