Gorufu, littleman, Air, SugarKane? Do you guys see any errors or better ways to do this? Anybody got a bot to add before I stick this in every site I manage?
Feel free to use this on your own site and start blocking bots too.
(the top part is left out)
<Files .htaccess>
deny from all
</Files>
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkWalker [OR]
RewriteCond %{HTTP_USER_AGENT} ^sitecheck.internetseer.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^ia_archiver [OR]
RewriteCond %{HTTP_USER_AGENT} ^DIIbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^psbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector
RewriteRule ^.* - [F]
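# Block requests referred from this site. A RewriteRule pattern is matched
# against the URL path, never the referer, so the test belongs in the RewriteCond.
RewriteCond %{HTTP_REFERER} ^http://www\.iaea\.org$
RewriteRule ^.* - [F]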
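Before rolling a list like this out to every site, it can help to run the patterns against a few User-Agent strings offline. Here is a minimal sketch in Python, assuming its `re` module treats these simple patterns the same way mod_rewrite does (true for anchors, `.*` and character classes, but not guaranteed for anything fancier). The `PATTERNS` subset, the `is_blocked` helper and the sample user agents are made up for illustration.

```python
import re

# A subset of the RewriteCond patterns above, copied as-is.
PATTERNS = [
    r"^EmailSiphon",
    r"^Mozilla.*NEWT",
    r"^[Ww]eb[Bb]andit",
    r"^Wget",
    r"^ia_archiver",
]


def is_blocked(user_agent: str) -> bool:
    """Return True if any pattern matches, mirroring the [OR] chain."""
    return any(re.search(pattern, user_agent) for pattern in PATTERNS)


if __name__ == "__main__":
    # Hypothetical sample user agents, not taken from real logs.
    samples = [
        "Wget/1.8.2",
        "webbandit/2.1",
        "Mozilla/4.0 (compatible; MSIE 6.0)",
        "my Wget clone",
    ]
    for ua in samples:
        print(ua, "->", "403" if is_blocked(ua) else "ok")
```

Note that because every pattern starts with `^`, a bot that puts anything in front of its name slips through, which is worth keeping in mind when adding new entries.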
Couldn't say I notice any at all. The part above this could affect that, though: if I ran everything through the PHP parser I'd expect a hit. Usually I run AddHandlers for SSIs and have never noticed a slowdown.
BTW I pieced this together from snippets others posted here on the board.
RewriteCond %{HTTP_USER_AGENT} ^InternetSeer.com [OR]
Not sure what the difference is but this one is the one that comes by every fifteen minutes as my competition tries to fool me into thinking I have more traffic than I do. Now it's easily filtered as a 403.
Long live mod_rewrite :)
I've been going back and forth between a kind of banbot.cgi that reads a banned.txt file and just drawing a line in the sand and doing the full-on mod_rewrite at the top level, so it trickles down to the subdomains I host.
What I've been toying with is a combination of my banned.txt file automatically updating my .htaccess file - using grep to insert/add/delete lines depending on what is in banned.txt. It's pretty easy to update my banned.txt file either by hand or with a little interface program I wrote - but I'm 'grappling with grep' to insert my lines in the correct place in the .htaccess file. I'm in the dark with grep. Grep vexes me. Grep makes my stomach hurt.
Has anyone else considered this, or is it too much work? I thought it would give me some flexibility, and kill two birds with one stone. In fact, at 2 am I think it's a brilliant idea. Then again, I don't get out much.
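One possible way around the grep trouble: grep only searches, it doesn't insert lines, so a small script that rebuilds the whole bot block may be easier. Below is a rough sketch in Python, not a finished tool. It assumes banned.txt holds one user-agent prefix per line, and that you add two marker comments (`# BEGIN banned bots` and `# END banned bots`) to .htaccess by hand once, somewhere after `RewriteEngine on`. The marker text and all function names are invented for this example.

```python
import re
from pathlib import Path

# Marker comments are an assumption: add them to .htaccess once by hand.
BEGIN = "# BEGIN banned bots"
END = "# END banned bots"


def read_agents(banned_text):
    """One user-agent prefix per line; blank lines and # comments are skipped."""
    stripped = (line.strip() for line in banned_text.splitlines())
    return [line for line in stripped if line and not line.startswith("#")]


def build_block(agents):
    """Turn the prefixes into RewriteCond lines followed by a single rule."""
    lines = [BEGIN]
    for i, agent in enumerate(agents):
        # Every condition except the last one needs [OR].
        flag = " [OR]" if i < len(agents) - 1 else ""
        # re.escape makes dots and other metacharacters match literally.
        lines.append("RewriteCond %{HTTP_USER_AGENT} ^" + re.escape(agent) + flag)
    if agents:
        lines.append("RewriteRule ^.* - [F]")
    lines.append(END)
    return "\n".join(lines)


def update_htaccess(htaccess_text, agents):
    """Replace everything between the markers; leave the rest untouched."""
    start = htaccess_text.find(BEGIN)
    end = htaccess_text.find(END)
    if start == -1 or end == -1 or end < start:
        raise ValueError("marker comments not found in .htaccess")
    return htaccess_text[:start] + build_block(agents) + htaccess_text[end + len(END):]


if __name__ == "__main__":
    banned = Path("banned.txt")
    htaccess = Path(".htaccess")
    if banned.exists() and htaccess.exists():
        agents = read_agents(banned.read_text())
        htaccess.write_text(update_htaccess(htaccess.read_text(), agents))
```

Regenerating the whole block each time means adds and deletes in banned.txt are handled the same way, and nothing outside the markers gets touched. Keeping a backup copy of .htaccess before the first run would be sensible, since a malformed file can take the site down with a 500.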