Gorufu, littleman, Air, SugarKane? Do you guys see any errors or better ways to do this... anybody got a bot to add... before I stick this in every site I manage?
Feel free to use this on your own site and start blocking bots too.
(the top part is left out)
<Files .htaccess>
deny from all
</Files>
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkWalker [OR]
RewriteCond %{HTTP_USER_AGENT} ^sitecheck.internetseer.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^ia_archiver [OR]
RewriteCond %{HTTP_USER_AGENT} ^DIIbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^psbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector
RewriteRule ^.* - [F]
RewriteCond %{HTTP_REFERER} ^http://www.iaea.org$
RewriteRule !^http://[^/.]\.your-site.com.* - [F]
What alternative is there if my server doesn't have mod_rewrite installed?
You could use mod_access and mod_setenvif, which are compiled and loaded into the server by default. They should be available unless you or your hosting company removed them.
Deny [httpd.apache.org] is used to restrict access to the server based on hostname, IP address, or environment variables. Hostname and IP won't work here, so we need a way to set environment variables depending on the User-Agent. SetEnvIf [httpd.apache.org] allows us to do just that. Preferably we would like the matching to be case insensitive. Luckily the Apache developers provided a directive for exactly that: SetEnvIfNoCase [httpd.apache.org].
Now we need to put those pieces together.
SetEnvIfNoCase User-Agent EmailSiphon AC_FORBIDDEN
SetEnvIfNoCase User-Agent EmailWolf AC_FORBIDDEN
SetEnvIfNoCase User-Agent Crescent AC_FORBIDDEN
SetEnvIfNoCase User-Agent LinkWalker AC_FORBIDDEN
SetEnvIfNoCase User-Agent EmailCollector AC_FORBIDDEN
Order Allow,Deny
Allow from all
Deny from env=AC_FORBIDDEN
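For anyone on a newer server: in Apache 2.4 the Order/Allow/Deny directives are only provided by the compatibility module mod_access_compat, and access control is normally written with Require instead. Here is a rough sketch of the same idea in that syntax. It assumes Apache 2.4 or later with mod_setenvif and mod_authz_core loaded, and, if used in .htaccess, that the host allows these directives there. I've only listed two of the bots to keep it short.

```apache
# Assumption: Apache 2.4+ (mod_setenvif + mod_authz_core).
# Flag unwanted user agents, case-insensitively, as before.
SetEnvIfNoCase User-Agent EmailSiphon AC_FORBIDDEN
SetEnvIfNoCase User-Agent EmailWolf AC_FORBIDDEN

<RequireAll>
    # Let everyone in ...
    Require all granted
    # ... except requests that carry the AC_FORBIDDEN variable.
    Require not env AC_FORBIDDEN
</RequireAll>
```

The variable name AC_FORBIDDEN is just the one used above; any name works as long as both directives agree on it.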
As with the regular expression in the RewriteCond directive, you could just use one SetEnvIfNoCase [httpd.apache.org] like this:
SetEnvIfNoCase User-Agent EmailSiphon|EmailWolf|Crescent|LinkWalker|EmailCollector AC_FORBIDDEN
Order Allow,Deny
Allow from all
Deny from env=AC_FORBIDDEN
where everything from SetEnvIfNoCase to AC_FORBIDDEN would need to be on a single line.
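For comparison, the mod_rewrite counterpart of that single line might look like the sketch below. It is only an illustration using the same five bot names; the [NC] flag gives the case-insensitive matching that SetEnvIfNoCase provides. Note that the ^ anchor (kept from the first post) only matches agents that start with one of the names, whereas the SetEnvIfNoCase pattern above matches the name anywhere in the User-Agent string.

```apache
RewriteEngine on
# One condition with alternation instead of one RewriteCond per bot.
# [NC] = case-insensitive match.
RewriteCond %{HTTP_USER_AGENT} ^(EmailSiphon|EmailWolf|Crescent|LinkWalker|EmailCollector) [NC]
# "-" leaves the URL alone, [F] answers 403 Forbidden.
RewriteRule .* - [F]
```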
Andreas
Can anyone tell me what "RewriteCond: bad flag delimiters" means (other than the obvious)? As soon as I plug in the following to my .htaccess, I'm getting 500 errors, and "RewriteCond: bad flag delimiters" shows up in the error_log.
RewriteCond %{HTTP_USER_AGENT} ^Mozilla* [OR]
RewriteCond %{HTTP_USER_agent} .*almaden.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^Anarchie [OR]
RewriteCond %{HTTP_USER_agent} ^ASPSeek [OR]
RewriteCond %{HTTP_USER_agent} ^attach [OR]
RewriteCond %{HTTP_USER_agent} ^autoemailspider [OR]
RewriteCond %{HTTP_USER_agent} ^BackWeb [OR]
RewriteCond %{HTTP_USER_agent} ^Bandit [OR]
RewriteCond %{HTTP_USER_agent} ^BatchFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]
RewriteCond %{HTTP_USER_agent} ^Buddy [OR]
RewriteCond %{HTTP_USER_agent} ^bumblebee [OR]
RewriteCond %{HTTP_USER_agent} ^CherryPicker [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_agent} ^CICC [OR]
RewriteCond %{HTTP_USER_agent} ^Collector [OR]
RewriteCond %{HTTP_USER_agent} ^Copier [OR]
RewriteCond %{HTTP_USER_agent} ^Crescent [OR]
RewriteCond %{HTTP_USER_agent} ^DA [OR]
RewriteCond %{HTTP_USER_agent} ^DIIbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_agent} ^DISCo\Pump [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_agent} ^Download\ Wonder [OR]
RewriteCond %{HTTP_USER_agent} ^Downloader [OR]
RewriteCond %{HTTP_USER_agent} ^Drip [OR]
RewriteCond %{HTTP_USER_agent} ^DSurf15a [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_agent} ^EasyDL/2.99 [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_agent} ^EmailCollector [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_agent} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_agent} ^FileHound [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_agent} ^GetSmart [OR]
RewriteCond %{HTTP_USER_agent} ^gigabaz [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go\!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_agent} ^gotit [OR]
RewriteCond %{HTTP_USER_agent} ^Grabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_agent} ^grub-client [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} ^HTTrack [OR]
RewriteCond %{HTTP_USER_agent} ^httpdown [OR]
RewriteCond %{HTTP_USER_AGENT} .*httrack.* [NC,OR]
RewriteCond %{HTTP_USER_agent} ^ia_archiver [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
RewriteCond %{HTTP_USER_agent} ^Indy*Library [OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_agent} ^InternetLinkagent [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_agent} ^InternetSeer.com [OR]
RewriteCond %{HTTP_USER_agent} ^Iria [OR]
RewriteCond %{HTTP_USER_agent} ^JBH*agent [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_agent} ^JustView [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_agent} ^LexiBot [OR]
RewriteCond %{HTTP_USER_agent} ^lftp [OR]
RewriteCond %{HTTP_USER_agent} ^Link*Sleuth [OR]
RewriteCond %{HTTP_USER_agent} ^likse [OR]
RewriteCond %{HTTP_USER_agent} ^LinkWalker [OR]
RewriteCond %{HTTP_USER_agent} ^Mag-Net [OR]
RewriteCond %{HTTP_USER_agent} ^Magnet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_agent} ^Memo [OR]
RewriteCond %{HTTP_USER_agent} ^Microsoft.URL [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_agent} ^Mirror [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_agent} ^Mozilla.*NEWT [OR]
RewriteCond %{HTTP_USER_agent} ^Mozilla*MSIECrawler [OR]
RewriteCond %{HTTP_USER_AGENT} ^MS\ FrontPage* [OR]
RewriteCond %{HTTP_USER_agent} ^MSIECrawler [OR]
RewriteCond %{HTTP_USER_agent} ^MSProxy [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetMechanic [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
RewriteCond %{HTTP_USER_agent} ^NICErsPRO [OR]
RewriteCond %{HTTP_USER_agent} ^Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_agent} ^Openfind [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_agent} ^Ping [OR]
RewriteCond %{HTTP_USER_agent} ^PingALink [OR]
RewriteCond %{HTTP_USER_agent} ^Pockey [OR]
RewriteCond %{HTTP_USER_agent} ^psbot [OR]
RewriteCond %{HTTP_USER_agent} ^Pump [OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_agent} ^Reaper [OR]
RewriteCond %{HTTP_USER_agent} ^Recorder [OR]
RewriteCond %{HTTP_USER_AGENT} ^QRVA [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Scooter [OR]
RewriteCond %{HTTP_USER_agent} ^Seeker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Siphon [OR]
RewriteCond %{HTTP_USER_agent} ^sitecheck.internetseer.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_agent} ^SlySearch [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_agent} ^Snake [OR]
RewriteCond %{HTTP_USER_agent} ^SpaceBison [OR]
RewriteCond %{HTTP_USER_agent} ^Stripper [OR]
RewriteCond %{HTTP_USER_agent} ^Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_agent} ^Szukacz [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_agent} ^URLSpiderPro [OR]
RewriteCond %{HTTP_USER_agent} ^Vacuum [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_agent} ^[Ww]eb[Bb]andit [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web Downloader [OR]
RewriteCond %{HTTP_USER_agent} ^WebEMailExtrac.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebHook [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebMiner [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebMirror [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_agent} ^Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_agent} ^Whacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_agent} ^x-Tractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_agent} ^Xenu [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.*$ /robots.php [L]
So I changed the almaden line to:
RewriteCond %{HTTP_USER_AGENT} almaden [OR]
I also determined that some of the problem was with:
RewriteCond %{HTTP_USER_AGENT} ^Web Downloader [OR]
I hadn't escaped the space, so mod_rewrite took "Downloader" as the flags argument, and flags have to be wrapped in [ ].
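A minimal sketch of the two usual ways to get a space into a RewriteCond pattern: escape it with a backslash, as the other lines in the list already do, or put the whole pattern in double quotes. Both conditions below match the same thing and are shown together only for comparison. The closing RewriteRule is borrowed from the first post so the fragment is complete; it is not the rule from the long list.

```apache
# Option 1: escape the space with a backslash.
RewriteCond %{HTTP_USER_AGENT} ^Web\ Downloader [OR]
# Option 2: quote the whole pattern (equivalent to option 1).
# This is the last condition, so it carries no [OR].
RewriteCond %{HTTP_USER_AGENT} "^Web Downloader"
RewriteRule .* - [F]
```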
This appears to have resolved the problems.
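One more thing that may be worth checking in the long list: a few patterns use * the way a shell wildcard works. In a regular expression * only repeats the character before it, so ^Indy*Library matches "IndLibrary" or "IndyyLibrary" but not "Indy Library". Likewise ^Mozilla* matches anything beginning with "Mozill", which covers nearly every ordinary browser, so that line probably wants removing or tightening. The sketch below shows what those lines could look like with .* instead. What was actually intended is a guess on my part, and the closing rule is the one from the first post.

```apache
# "*" repeats the previous character; ".*" means "any run of characters".
RewriteCond %{HTTP_USER_AGENT} ^Indy.*Library [OR]
RewriteCond %{HTTP_USER_AGENT} ^JBH.*agent [OR]
RewriteCond %{HTTP_USER_AGENT} ^Link.*Sleuth [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*MSIECrawler [OR]
# A trailing "*" is not needed: without "$" the pattern is already a prefix match.
RewriteCond %{HTTP_USER_AGENT} ^MS\ FrontPage
RewriteRule .* - [F]
```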