Forum Moderators: phranque
----
SetEnvIf Request_URI "(403\.php|robots\.txt)$" allowit
Order Deny,Allow
deny from 63.****.xx.xx
Allow from env=allowit
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^Alexibot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.*$ [whateva.htm...] [L]
----
questions:
- IPs (order deny,allow) are actually sent to my 403,
is it better to send bad bots RewriteRule outside my site (e.g.:whateva.htm) as above or to send to 403?
- will this merging of htaccess and mod_rewrite causes heavy load on my server CPU? (i have lot of IPs and a long list of bots).
- do you see any mistake on my mod_rewrite part?
thanks so much
tito
This is the exact order and method I use on several sites:
SetEnvIf Request_URI "(403\.php|robots\.txt)$" allowit
#
Order Deny,Allow
Deny from 63.****.xx.xx
Allow from env=allowit
#
RewriteEngine on
#
# Avoid re-redirecting requests for 403 error page or robots.txt
RewriteRule ^(403\.php|robots\.txt)$ - [L]
#
# Block unwelcome user agents
RewriteCond %{HTTP_USER_AGENT} ^Alexibot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule .* - [F]
- IPs blocked by Order Deny,Allow are actually sent to my 403;
is it better to send bad bots via RewriteRule outside my site (e.g. whateva.htm) as above, or to send them to the 403?
- will this merging of .htaccess and mod_rewrite cause a heavy load on my server CPU? (I have a lot of IPs and a long list of bots.)
If you notice a large load when your site is hit by several unwelcome guests simultaneously, then replace your 403.php page with a simple ~700-byte 403.html page, and put only a "Forbidden" message and a simple text link on that page. The text link should lead to a second 403-error "help" page with more information in case you block someone accidentally. *That* page could be php-driven without any load problems, since 'bots won't usually follow the link from your 403 page. If you do that, you'll need to include the 403-help page in the list of pages allowed universal access, i.e. "(403\.html|403help\.php|robots\.txt)$"
The point of all this is to make your 403 error page as small and simple as possible in order to minimize server load and wasted bandwidth. The link to the help page is to provide assistance to people you block accidentally (it *will* happen eventually). You can give them an e-mail form or phone number to contact you, or just tell them that all 403 errors are reviewed daily, and please try again tomorrow - your choice.
The line of code I added (the RewriteRule ^(403\.php|robots\.txt)$ - [L] line) is intended to stop a second 403 error when an unwelcome guest uses a Deny-from-blocked IP address *and* a mod_rewrite-blocked user-agent. Without this line, you will get a server error: the server will fail when it attempts to serve the 403 page to the visitor from the blocked IP address, because his user-agent is also blocked, so the server won't be able to serve the 403 page unless you stop mod_rewrite processing at this point using "- [L]". Allowing any IP and any user-agent to access your 403 page(s) is necessary. Allowing any IP or user-agent to access robots.txt is also the correct behaviour in almost all cases. If you use a bad-bot script, its name should also be included in the list (in both places in the code above), i.e. "(403\.html|403help\.php|bad-bot\.php|robots\.txt)$"
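Putting the pieces above together, a minimal sketch of the complete ruleset might look like the following. The filenames 403.html, 403help.php, and bad-bot.php come from the description above; the Deny IP is a placeholder from the documentation address range, and the ErrorDocument line is an assumption about how the custom 403 page is wired up:

```apache
# Serve our lightweight custom 403 page (assumed setup)
ErrorDocument 403 /403.html
#
# Let everyone fetch the error pages, the bad-bot script, and robots.txt
SetEnvIf Request_URI "(403\.html|403help\.php|bad-bot\.php|robots\.txt)$" allowit
#
Order Deny,Allow
# placeholder address; list your real blocked IPs here
Deny from 192.0.2.1
Allow from env=allowit
#
RewriteEngine on
#
# Stop mod_rewrite here for the universally-allowed files, so a visitor
# with both a blocked IP and a blocked user-agent still gets the 403 page
RewriteRule ^(403\.html|403help\.php|bad-bot\.php|robots\.txt)$ - [L]
#
# Return 403 Forbidden to unwelcome user-agents
RewriteCond %{HTTP_USER_AGENT} ^Alexibot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule .* - [F]
```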
Jim
As per your suggestion my 403 is now very light, just a touch of PHP stating the visitor IP (no more hide and seek :). Its size is 1 KB and there's even an email (at)domain.net which I use for catching spam.
About the RewriteRule ^(403\.php|robots\.txt)$ - [L]:
I thought there might be something like that, but I wasn't sure where; the "stop mod_rewrite processing" [L] I had no idea about, thanks.
Every unwelcome visitor is now going to my 1 KB 403. Thanks again Jim for your great help.
tito
What did I miss? Thanks.
tito
That RewriteCond looks OK to me. So the problem is elsewhere. Either the RewriteRule is incorrect, you have omitted an [OR] somewhere else in your list, or some other problem exists. Omitting an [OR] is a common problem. If you do accidentally omit any [OR] in the list, it will usually disable the entire list, because mod_rewrite will default to its implicit [AND] function. Since a user-agent cannot have two values simultaneously, the RewriteRule will never be invoked.
Example:
Correct:
RewriteCond %{HTTP_USER_AGENT} ^badguy1 [OR]
RewriteCond %{HTTP_USER_AGENT} ^badguy2 [OR]
RewriteCond %{HTTP_USER_AGENT} ^badguy3 [OR]
RewriteCond %{HTTP_USER_AGENT} ^badguy4
RewriteRule !^custom403\.html$ - [F]
Incorrect:
RewriteCond %{HTTP_USER_AGENT} ^badguy1 [OR]
RewriteCond %{HTTP_USER_AGENT} ^badguy2 <- Missing [OR], defaults to "AND"
RewriteCond %{HTTP_USER_AGENT} ^badguy3 [OR]
RewriteCond %{HTTP_USER_AGENT} ^badguy4
RewriteRule !^custom403\.html$ - [F]
However, the incorrect example requires that the user-agent start with ("badguy1" OR "badguy2") AND ("badguy3" OR "badguy4"). This is clearly an impossibility, since the user-agent can only have one value at a time, so the RewriteRule will never be invoked.
Another possible problem is that perhaps your last RewriteCond ("badguy4" in the examples above) has an [OR] on it, which will also cause the ruleset to malfunction.
Jim
I've found these in the middle of my list:
RewriteCond %{HTTP_USER_AGENT} netcraft [NC]
RewriteCond %{REQUEST_URI} /sensepost\.exe [NC]
and changed [NC] to [OR] on both of them; the last RewriteCond has no [OR], as it was before.
Apparently .htaccess is now working; I will check its behaviour on the next requests.
Thanks for your help
tito
The rewrite rules concerning these requests in my .htaccess are as follows:
RewriteCond %{REQUEST_URI} ^/default\.(ida|idq) [NC,OR]
RewriteCond %{REQUEST_URI} /(admin|cmd|httpodbc|nsiislog|root|shell)\.(dll|exe) [NC,OR]
I have checked my .htaccess and no [OR] is missing anywhere in my list, except on the last rewrite, which has to be without it.
Please, do you believe I should remove the [NC] on the above rewrites and leave only the [OR]?
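For what it's worth, mod_rewrite accepts several flags in one bracket, separated by commas, so case-insensitive matching and OR chaining don't have to exclude each other; a sketch using the two conditions above:

```apache
# [NC,OR] keeps the case-insensitive match and still ORs
# each condition with the next one
RewriteCond %{REQUEST_URI} ^/default\.(ida|idq) [NC,OR]
RewriteCond %{REQUEST_URI} /(admin|cmd|httpodbc|nsiislog|root|shell)\.(dll|exe) [NC,OR]
```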
www.domain1.net 216.****.xx.xx - - [12/Feb/2004:10:46:07 -0500] "GET /default.ida?XXX..XXXX%u909...78%u0000%u00=a HTTP/1.0" 302 621 "-" "-"
and also:
ns1.domain1.net 213.xxx.xxx.xxx - - [11/Feb/2004:15:35:41 -0500] "GET /scripts/nsiislog.dll" 302 - "-" "-"
www.domain1.net 213.xxx.xxx.xxx - - [11/Feb/2004:15:35:41 -0500] "GET /scripts/nsiislog.dll" 302 - "-" "-"
www.domain2.net 213.xxx.xxx.xxx - - [11/Feb/2004:15:35:41 -0500] "GET /scripts/nsiislog.dll" 302 - "-" "-"
www.domain3.org 213.xxx.xxx.xxx - - [11/Feb/2004:15:35:41 -0500] "GET /scripts/nsiislog.dll" 302 - "-" "-"
So the 302 is redirecting, but I can't see where; there's nothing else in my logs. Any idea? Thanks for your help. tito.
I'd suggest you use the Server headers checker [webmasterworld.com] to find out where these requests are redirected to.
It may be that your hosting company is redirecting these requests before your .htaccess file is processed; in that case, your .htaccess code can have no effect on them.
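Another quick way to see where a 302 points is to inspect the response headers directly, e.g. with curl; the URL below is a placeholder standing in for one of the requests from the logs above:

```shell
# -s: silent, -I: HEAD request only; the output includes the status line
# and the Location: header, which shows where the 302 redirects to
curl -sI "http://www.example.net/scripts/nsiislog.dll"
```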
Jim