I was wondering if anyone had any suggestions (or a list) of bots that provide little to no value to the web sites they crawl. For example, TurnitinBot. It archives the content of your site, but only so that teachers can compare student papers against it to see whether kids are writing their own essays or copying from the web. A great service for teachers, but the indexing doesn't seem to do anything to promote the site it crawls.
A good place to start is the robots.txt here at WebmasterWorld [webmasterworld.com]. However, it is important not to blindly copy that list: you need to evaluate the usefulness of each bot in the context of your own site. Also, that list doesn't include bots that ignore robots.txt; those you will need to ban by IP address or user agent string.
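For bots that ignore robots.txt, the ban has to happen on the server side. Here is a rough sketch in Python of what a check by user agent string or IP address could look like. The names `BLOCKED_AGENT_SUBSTRINGS`, `BLOCKED_NETWORKS` and `is_blocked` are made up for illustration, the user agent token is just the example from the question, and the IP range is a reserved documentation range used as a placeholder, not a real bot's address. How you hook this into your own server or framework will differ.

```python
import ipaddress

# Hypothetical blocklist: fill in the bots you decide to ban after
# evaluating them for your own site. Matching is case-insensitive.
BLOCKED_AGENT_SUBSTRINGS = ("turnitinbot",)

# Placeholder range reserved for documentation (RFC 5737), NOT a real
# bot network. Replace with ranges you have confirmed from your logs.
BLOCKED_NETWORKS = (ipaddress.ip_network("192.0.2.0/24"),)


def is_blocked(user_agent: str, remote_ip: str) -> bool:
    """Return True if the request matches a banned user agent or IP range."""
    agent = (user_agent or "").lower()
    if any(token in agent for token in BLOCKED_AGENT_SUBSTRINGS):
        return True
    try:
        address = ipaddress.ip_address(remote_ip)
    except ValueError:
        # Unparseable address: only the user agent check applies.
        return False
    return any(address in network for network in BLOCKED_NETWORKS)
```

You would call `is_blocked(...)` with the request's User-Agent header and client address, and return a 403 when it comes back True. Keep in mind the user agent string is trivially spoofed, so the IP check is the more reliable of the two for badly behaved bots.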