I use APF on RHEL...
there are others.
APF is donationware, loads of people at ev1 use it, and in my experience it works well and can be extended easily. It has a whitelist capability for the good guys. BFD is an associated shell script that might also be helpful.
I feed APF with IPs from a script on one of my sites that is susceptible to rippers. More than x pages in a minute and they're blocked server-wide.
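For anyone wanting to try the "more than x pages in a minute" idea, here is a rough sketch in Python, not my actual script. The class name, the thresholds and the helper are all made up for illustration. I believe `apf -d <ip> <comment>` is the deny command, but treat that as an assumption and check it against your own APF version before wiring it up.

```python
from collections import defaultdict, deque


class RipperDetector:
    """Flag an IP that requests more than max_pages within window seconds."""

    def __init__(self, max_pages=60, window=60.0):
        self.max_pages = max_pages      # hypothetical threshold ("x pages")
        self.window = window            # length of the sliding window in seconds
        self.hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def record(self, ip, now):
        """Record one page request; return True if the IP is over the limit."""
        recent = self.hits[ip]
        recent.append(now)
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        return len(recent) > self.max_pages


def apf_deny_command(ip, reason):
    """Build the argument list for subprocess.run (assumes `apf -d IP COMMENT`)."""
    return ["apf", "-d", ip, reason]
```

Call `record(ip, time.time())` on every page view and, when it returns True, pass `apf_deny_command(ip, "ripper")` to `subprocess.run` from something running with root rights.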
It's dead easy to use a honeypot URL and feed the IPs to APF. Use robots.txt to block Google et al from /honeydir/honeypage.html and you will have the rogue bots bang to rights without lifting a finger, because only a bot that ignores robots.txt will ever request that page. Have a script auto-generate a new honeypot URL and update robots.txt to keep the bad bots guessing.
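The honeypot rotation could look something like this. It is only a sketch: the function names are invented, and writing the result to your real robots.txt and linking the trap page from your templates is left out.

```python
import secrets


def new_honeypot_path(prefix="/honeydir"):
    """Return a fresh, unguessable trap URL path under the given prefix."""
    return f"{prefix}/{secrets.token_hex(8)}.html"


def robots_txt(trap_paths):
    """Render robots.txt rules telling compliant crawlers to skip the traps."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: {path}" for path in trap_paths]
    return "\n".join(lines) + "\n"


def is_trap_hit(request_path, trap_paths):
    """True if a request landed on one of the currently active trap pages."""
    return request_path in trap_paths
```

Well-behaved crawlers cache robots.txt for a while, so it is probably safest to keep an old trap path in the Disallow list, and stop banning on it, for a day or so after you rotate rather than swapping it out instantly.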
I use APF with IPs from a geo IP database to ban port 25 from various countries. Works brilliantly for email spam reduction, and took a big load off scanning for spam.
Have a script release the IPs after a period (unless they came from the honeypot). A short period is OK because the next time they hit the trigger they get blocked right off with little overhead to you. You can increment the block period if you have the script check a db of the last block per IP. 3 strikes and you're out.
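A minimal sketch of the release logic, assuming you keep a record per IP of when it was blocked and how many strikes it has. The base period, the doubling and the function names are my own choices for illustration, not anything built into APF.

```python
BASE_BLOCK = 600   # seconds for a first offence (hypothetical value)
MAX_STRIKES = 3    # "3 strikes and you're out"


def block_duration(strikes, from_honeypot=False):
    """Return how many seconds to block for, or None for a permanent block."""
    if from_honeypot or strikes >= MAX_STRIKES:
        return None
    # Double the period with each repeat offence.
    return BASE_BLOCK * 2 ** (strikes - 1)


def due_for_release(blocks, now):
    """blocks maps ip -> (blocked_at, duration); return IPs whose block expired."""
    return sorted(
        ip
        for ip, (blocked_at, duration) in blocks.items()
        if duration is not None and now - blocked_at >= duration
    )
```

Run `due_for_release` from cron every few minutes and unblock each IP it returns. I think `apf -u <ip>` is the command that removes an entry, but that is from memory, so check it on your install.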
You can have a script scan your logs for suspicious behaviour that falls above 'normal' but below the 'block' threshold and email you a report, so you can manually add selected IPs to APF from an admin page or the command line. (BFD will do this)
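If you would rather roll your own report than use BFD, something along these lines would pick out the in-between IPs. It assumes Apache-style access log lines with the client IP first and a bracketed timestamp; the `watch` and `block` numbers are placeholders you would tune for your own traffic.

```python
import re
from collections import Counter

# Matches the client IP and the bracketed timestamp at the start of a log line.
LOG_LINE = re.compile(r"^(\S+) \S+ \S+ \[([^\]]+)\]")


def suspicious_ips(lines, watch=30, block=60):
    """Return {ip: peak hits per minute} for IPs in the watch..block band."""
    per_minute = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        ip, stamp = match.groups()
        # "10/Oct/2005:13:55:36 -0700" -> "10/Oct/2005:13:55" (minute bucket)
        per_minute[(ip, stamp[:17])] += 1

    peak = {}
    for (ip, _minute), count in per_minute.items():
        peak[ip] = max(peak.get(ip, 0), count)
    # Keep only IPs busier than normal but not busy enough to be auto-blocked.
    return {ip: n for ip, n in peak.items() if watch <= n < block}
```

Feed it the open log file and mail yourself the resulting dict, then decide by hand which of those IPs deserve a block.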
.htaccess and httpd.conf are too slow if the list is long. The firewall doesn't have that problem: my block lists get to be a few thousand IPs and I can't see any difference in server load or script time if I turn the firewall off, so it's not a big issue. But I don't have your numbers of visitors :-(
I've had this stuff running on my server, but I'm not any kind of expert, and for sure there are smart people here who can sort it for you. Or maybe talk to Ryan, who wrote APF.
Whatever, please... there *has* to be a better solution than banning Google. So here's to you finding it, cos I need the search function back, and despite some of the gung-ho optimism I don't really believe WebmasterWorld can live without Google.