incrediBILL - 11:39 pm on Jan 25, 2014 (gmt 0)
The answer of course is YES - block away!
Use a whitelist-style robots.txt so all the rest of the bots that actually honor robots.txt are nicely told to go away.
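A whitelist-style robots.txt means you explicitly allow only the crawlers you want and disallow everyone else. A minimal sketch (the allowed bot names here are just examples, pick your own):

```
User-agent: Googlebot
Disallow:

User-agent: bingbot
Disallow:

User-agent: *
Disallow: /
```

An empty Disallow line means "nothing is disallowed," so the named bots get full access while the catch-all at the bottom tells everyone else to stay out entirely.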
Beyond that, you have to play rough.
These bots are like burglars: you can put locks on the doors to try to keep them out, but eventually they find a new way in and steal all your stuff.
The only way I've found to potentially discourage them is to cloak evil, nasty pages of vile content that are delivered only to those bots, built using every wrong thing you could possibly do to intentionally screw up SEO, AdSense, etc.: AdSense stop words, links to bad neighborhoods, keyword stuffing, profane language, tons of bad links, just all sorts of fun that, republished unfiltered, would trash their site.
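The cloaking itself is just a server-side branch on the request: flagged bots get the poison page, everyone else gets the real one. A minimal sketch in Python, where the bot tokens and page bodies are placeholders, not a real detection method (real detection would also look at IP ranges and crawl behavior, not just the User-Agent string):

```python
# Hypothetical UA substrings for scrapers you've already identified.
BAD_BOT_TOKENS = ("scrapybot", "contentgrabber", "sitecopier")

REAL_PAGE = "<html><body>Normal article content.</body></html>"
POISON_PAGE = "<html><body>stopword stopword keyword keyword keyword</body></html>"

def pick_page(user_agent: str) -> str:
    """Return the poison page for known scraper UAs, the real page otherwise."""
    ua = user_agent.lower()
    if any(token in ua for token in BAD_BOT_TOKENS):
        return POISON_PAGE
    return REAL_PAGE
```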
Basically, you have to do some really bad stuff to even get their attention and even then some of them don't care.
Most importantly, include details about their crawler UA, IP address, etc. in the cloaked content so when you find it you know exactly where it came from.
Technically you've done nothing wrong: robots.txt told them to stay away, so if they picked up pages that caused them harm, it's their own fault. They were told to stay out and ignored the warning.
That's how people get shot when they ignore the NO TRESPASSING signs out in the rural areas where I grew up.
Same basic principle.
Just beware: amateurs playing with this stuff can inadvertently link things back to their own site, and efforts to mess with the scrapers can backfire, leaving you with a bunch of junk you have to disavow in Google. Not recommended for the novice.