Specifically, you'll have to use the Deny/Allow directives, then define which requests pass and all others fail. I suggest starting with robots.txt: disallow all bots, then, also in robots.txt, allow only the ones you want. Then, after sufficient time to gather data on which bots do NOT honor robots.txt, add those to your deny (fail) entries in .htaccess.
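For illustration, a minimal robots.txt along those lines might be (the bot names here are placeholders only; your own "pass" list will differ):

# allow only the bots you want; disallow everyone else
User-agent: Googlebot
Disallow:

User-agent: bingbot
Disallow:

# every other bot: keep out
User-agent: *
Disallow: /

An empty Disallow means "nothing is off limits" for that bot; the catch-all record at the bottom shuts out everyone not listed above it.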
Each webmaster will have a different set of criteria as to which bots "pass" and which "fail".
Some use rewrites, some use SetEnvIf... The forum library, and the forum itself, have many great examples to get you started. In general we don't give out examples; we'd like to see your best-effort code first, then address any errors. But, having said that, there ARE great examples of how to get started in the library.
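Purely as a generic sketch of the rewrite flavor (not pulled from the library, and "badbot" is just a stand-in for whatever UA fragment you've decided should fail):

# mod_rewrite version: 403 any request whose UA contains "badbot", case-insensitive
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} badbot [NC]
RewriteRule .* - [F]

The SetEnvIf flavor is shown a bit further down.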
Stop by UA, or stop by speed of access (behavior), or stop by IP address range, or by country/geo... The Magnificent Obsession of blocking bots and unwanted traffic is more than a hobby; for some it is a "Way of Life". :)
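For the IP-range approach, the bare-bones form looks something like this (the range below is a documentation-only example, not a recommendation of what to block):

# deny one address range outright; everything else falls through to the default allow
Order Deny,Allow
Deny from 192.0.2.0/24

Country/geo blocking is usually just this same pattern fed with much longer lists of ranges, or handled by a module or firewall in front of Apache.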
My stuff is pretty simple. An example from .htaccess:
SetEnvIfNoCase User-Agent "nutch" ban
That line flags "nutch" in any UA string, regardless of case, and my Order Deny,Allow block then sends the request a 403 via the env variable "ban".
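For completeness, the pairing would be roughly this (a sketch of the pattern, not necessarily the exact file):

SetEnvIfNoCase User-Agent "nutch" ban
# anything flagged "ban" above is refused with a 403; all other requests fall through to the default allow
Order Deny,Allow
Deny from env=ban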
Whether "nutch" is in UA strings or not is not the real question... Some "nutch" is okay, most is not, but each webmaster has to make a personal determination as to which is which, and that means studying access logs, looking at bandwidth, determining traffic benefits... a whole list of things.
Again, the LESS STRESSFUL way is to start with "who do I let in" instead of "who do I kick out".
Think about a party in your living room. It's easier to control the party by INVITING only the people you want than by attempting to kick out all the UNINVITED, RUDE, ROWDY, or just SPAMMY attendees. That's what whitelisting accomplishes. Once whitelisting is established, the only ones kicked out after that are gate crashers... and that's a much smaller drain on time and energy.
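In .htaccess terms the party-list approach looks roughly like this (a sketch only; deciding which UAs count as "invited" is exactly the personal determination mentioned above, and anything not on the list, browsers included, gets turned away):

# whitelist pattern: flag the invited guests, deny everyone who isn't flagged
SetEnvIfNoCase User-Agent "(Mozilla|Googlebot|bingbot)" invited
Order Allow,Deny
Allow from env=invited

Note the reversed Order: with Allow,Deny the default is deny, so only requests carrying the "invited" flag get in. "Mozilla" is there because most real browser UA strings contain it.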