I want to use htaccess to ban by user agent certain rogues who use fairly distinctive browsers. I'm not sure what the exact syntax is for different agents, or which characters need to be "escaped". I have a couple of examples from my log below. If someone can show the syntax needed, it would be helpful. Also, is it necessary to list the whole agent string? Could, for instance, anyone using an Opera browser of any type be banned?
"Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)"
[could this agent be banned just by Gecko/20030624 Netscape/7.1 (ax)]?
"Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"
What would be the syntax to use on this for instance?
RewriteEngine On
RewriteCond %(HTTP_USER_AGENT) ^Mozilla/5.0 Windows; U; Win 9x 4.90; en-US; rv:1.4 Gecko/200030624 Netscape/7.1 (ax)
RewriteRule ^.*$ - [F]
Thanks for the great resource.
Many of the characters in your user-agent string have special meanings to the regular-expressions parser used by mod_rewrite. Therefore, they must be escaped by preceding them with a backslash:
^Mozilla/5\.0\ \(Windows;\ U;\ Win\ 9x\ 4\.90;\ en-US;\ rv:1\.4\)\ Gecko/20030624\ Netscape/7\.1\ \(ax\)$
Jim
As posted above, using both a start and end anchor ("^" and "$") and omitting the [NC] (no case) flag, the match would have to be exact, letter for letter, and the case of each letter would have to match.
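To illustrate the difference, here is a sketch of the looser alternatives the question asks about: a pattern without anchors matches anywhere in the string, and [NC] ignores case. The patterns are assumptions based on the log lines quoted in the question, so check them against your own logs first. Note also that the server variable is written with curly braces, %{HTTP_USER_AGENT}.

```apache
RewriteEngine On
# Unanchored: matches any user-agent containing "Netscape/7.1 (ax)".
RewriteCond %{HTTP_USER_AGENT} Netscape/7\.1\ \(ax\) [OR]
# Unanchored and case-insensitive: matches any user-agent containing
# "opera" in any letter case, whatever the version.
RewriteCond %{HTTP_USER_AGENT} opera [NC]
# "-" leaves the URL unchanged; [F] answers with 403 Forbidden.
RewriteRule .* - [F]
```

The broader the pattern, the more legitimate visitors it catches, so the exact-match form is the safer choice when a single agent string is the target.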
I'd suggest you block by user-agent *and* IP address range, in order to minimize collateral damage, since Netscape 7 is one of the top second-tier browsers. In other words, please don't block me from your site, just because I'm browsing using Netscape tomorrow!
You can add RewriteConds to limit the damage to certain class A, B, or C IP address ranges; make them as wide a range as necessary, but no wider. Also, even with an IP restriction to limit collateral damage, be aware that you *are* likely to whack a few innocent bystanders, so I'd recommend "being polite" on whatever page they end up at.
# limit rule to 256 IP addresses starting at 192.168.0.0
RewriteCond %{REMOTE_ADDR} ^192\.168\.0\. [OR]
# limit rule to 65536 IP addresses beginning at 90.0.0.0
RewriteCond %{REMOTE_ADDR} ^90\.0\. [OR]
# limit rule to 16,777,216 IP addresses beginning at 10.0.0.0
RewriteCond %{REMOTE_ADDR} ^10\.
# Note No [OR] on previous RewriteCond, so this is an "AND"
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/5\.0\ \(Windows;\ U;\ Win\ 9x\ 4\.90;\ en-US;\ rv:1\.4\)\ Gecko/20030624\ Netscape/7\.1\ \(ax\)$
RewriteRule .* - [F]
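As for "being polite", one way to do it is sketched below, assuming a custom error page named /403.html (a hypothetical name; any path works). The first RewriteCond keeps the error page itself reachable, otherwise the blocked visitor would be refused the very page that explains the refusal. The address range and the shortened user-agent pattern are placeholders.

```apache
# Serve a friendly explanation instead of the default 403 page.
# /403.html is an assumed file name.
ErrorDocument 403 /403.html

RewriteEngine On
# Do not apply the block to the error page itself.
RewriteCond %{REQUEST_URI} !^/403\.html$
# Placeholder address range and user-agent fragment.
RewriteCond %{REMOTE_ADDR} ^192\.168\.0\.
RewriteCond %{HTTP_USER_AGENT} Netscape/7\.1\ \(ax\)
RewriteRule .* - [F]
```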
Jim
SetEnvIf User-Agent "Indy.Library" getout
SetEnvIf Request_URI "^(/403.*\.html|/robots\.txt)$" public
<Files *>
Order Deny,Allow
Deny from env=getout
Allow from env=public
</Files>
The first line sets a variable called "getout" if the user-agent is Indy.
The second line sets a variable called "public" if the request is for a custom error page called "403.html" or for robots.txt, both of which should be universally accessible.
The remaining code section blocks Indy unless it is requesting either of those two files.
The names of the variables are arbitrary, as is the name of the custom error page -- use any names you like, as long as they're consistent.
Be aware that "User-Agent" is hyphenated and "Request_URI" uses an underscore. This is how they are shown in the Apache SetEnvIf documentation, and I've never had the time or inclination to experiment to see if it mattered.
If this method isn't allowed on your server, it may be time for a better host...
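On Apache 2.4 and later, Order, Deny and Allow are provided only by the compatibility module mod_access_compat. A rough equivalent using the newer Require directives might look like the sketch below. It reuses the variable names "getout" and "public" from above and has not been tested against any particular server, so treat it as a starting point.

```apache
SetEnvIf User-Agent "Indy\.Library" getout
SetEnvIf Request_URI "^(/403.*\.html|/robots\.txt)$" public

<RequireAny>
    # Always allow the error page and robots.txt.
    Require env public
    # Otherwise allow everyone except flagged agents.
    <RequireAll>
        Require all granted
        Require not env getout
    </RequireAll>
</RequireAny>
```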
Actually, you didn't say how you "knew" that mod_rewrite wasn't enabled, so I took your word for it. I should point out that you may need to precede your mod_rewrite code with:
Options +FollowSymLinks
RewriteEngine on
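One quick way to find out whether mod_rewrite is active at all is sketched here with a made-up test URL (/rewrite-test is an assumption; pick any path that does not exist on your site):

```apache
Options +FollowSymLinks
RewriteEngine on
# Requesting /rewrite-test should return 403 Forbidden if mod_rewrite
# is loaded and .htaccess overrides are permitted.
RewriteRule ^rewrite-test$ - [F]
```

A 404 for that URL suggests the .htaccess file is being ignored. A 500 Internal Server Error usually means one of the directives is not allowed or the module is not loaded, and the server error log will say which.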
Jim
I can only assume that mod_rewrite is not enabled because the script I wrote is not working. No one will respond to my e-mails asking about it, so I don't really know for sure. In the meantime I will try your latest suggestion and at the same time shop around for another host that better meets my needs. Below is part of the script I'm using. Maybe it's screwed up, which is why it's not working. Hard to tell, since the dopes that run the server won't talk to me.
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} snykeBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ZyBorg [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Scooter [NC,OR]
RewriteCond %{HTTP_USER_AGENT} slurp [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^MSFrontPage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Indy.Library [NC] [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector
RewriteRule ^.*$ [****.com...] [L]
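# As posted: flags split into two bracketed groups, full URL as the target
RewriteCond %{HTTP_USER_AGENT} Indy.Library [NC] [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector
RewriteRule ^.*$ http://****.com/_.htm [L]
# Changed: flags combined in one bracketed group, local path as the target
RewriteCond %{HTTP_USER_AGENT} Indy.Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector
RewriteRule .* /_.htm [L]
# Alternative to the RewriteRule line above: return 403 Forbidden
RewriteRule .* - [F]
Jim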
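The long list of conditions can also be collapsed into fewer lines with alternation, which leaves fewer flag groups to get wrong. This is a sketch using the agent names from the list above; the grouping is not from the original script, and it makes the EmailCollector match case-insensitive where the original was case-sensitive, so test before use.

```apache
RewriteEngine on
# Names that may appear anywhere in the user-agent, any letter case.
RewriteCond %{HTTP_USER_AGENT} (snykeBot|ZyBorg|Scooter|slurp|Indy\.Library) [NC,OR]
# Names that must appear at the start of the user-agent, any letter case.
RewriteCond %{HTTP_USER_AGENT} ^(Mozilla.*NEWT|MSFrontPage|WebBandit|EmailCollector) [NC]
# "-" leaves the URL unchanged; [F] answers with 403 Forbidden.
RewriteRule .* - [F]
```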