Forum Moderators: phranque

htaccess ban + mod_rewrite

making them work together

tito

4:44 pm on Feb 5, 2004 (gmt 0)

10+ Year Member



hello,
so finally, with great help from Jim, I have my .htaccess ban working properly.
I'd also like to add mod_rewrite blocking for bad bots, and thought I could merge things like this:

----
SetEnvIf Request_URI "(403\.php|robots\.txt)$" allowit

Order Deny,Allow
Deny from 63.****.xx.xx
Allow from env=allowit

RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^Alexibot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.*$ [whateva.htm...] [L]
----

questions:
- IPs (Order Deny,Allow) are currently sent to my 403;
is it better for the bad-bot RewriteRule to send them outside my site (e.g. whateva.htm), as above, or to the 403?

- will merging the .htaccess bans and mod_rewrite cause heavy load on my server CPU? (I have a lot of IPs and a long list of bots.)

- do you see any mistake in my mod_rewrite part?

thanks so much
tito

jdMorgan

5:13 am on Feb 6, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



tito,

This is the exact order and method I use on several sites:


SetEnvIf Request_URI "(403\.php|robots\.txt)$" allowit
#
Order Deny,Allow
Deny from 63.****.xx.xx
Allow from env=allowit
#
RewriteEngine on
#
# Avoid re-redirecting requests for 403 error page or robots.txt
RewriteRule ^(403\.php|robots\.txt)$ - [L]
#
# Block unwelcome user agents
RewriteCond %{HTTP_USER_AGENT} ^Alexibot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule .* - [F]

- IPs (Order Deny,Allow) are currently sent to my 403;
is it better for the bad-bot RewriteRule to send them outside my site (e.g. whateva.htm), as above, or to the 403?

In the code above, the unwelcome user-agents are simply given a 403 response; they are *not* going to follow an external redirect anyway, so why bother? Just 403 them and forget them (simple is good).

- will merging the .htaccess bans and mod_rewrite cause heavy load on my server CPU? (I have a lot of IPs and a long list of bots.)

I have one site where the .htaccess code is 28kB in size. There is no noticeable slowdown. The only question I would have in your case is how much impact using a PHP-driven 403 page will have.

If you notice a large load when your site is hit by several unwelcome guests simultaneously, then replace your 403.php page with a simple ~700-byte 403.html page, and put only a "Forbidden" message and a simple text link on that page. The text link should lead to a second 403-error "help" page with more information in case you block someone accidentally. *That* page could be php-driven without any load problems, since 'bots won't usually follow the link from your 403 page. If you do that, you'll need to include the 403-help page in the list of pages allowed universal access, i.e. "(403\.html|403help\.php|robots\.txt)$"

The point of all this is to make your 403 error page as small and simple as possible in order to minimize server load and wasted bandwidth. The link to the help page is to provide assistance to people you block accidentally (it *will* happen eventually). You can give them an e-mail form or phone number to contact you, or just tell them that all 403 errors are reviewed daily, and please try again tomorrow - your choice.

The line of code I added (the "RewriteRule ^(403\.php|robots\.txt)$ - [L]" line) is intended to stop a second 403 error when an unwelcome guest uses both a Deny-from-blocked IP address *and* a mod_rewrite-blocked user-agent. Without this line, you will get a server error, because the server will fail when it attempts to serve the 403 page to the visitor from the blocked IP address; since his user-agent is also blocked, the server won't be able to serve the 403 page unless you stop mod_rewrite processing at this point using "- [L]".

Allowing any IP and any user-agent to access your 403 page(s) is necessary. Allowing any IP or user-agent to access robots.txt is also the correct behaviour in almost all cases. If you use a bad-bot script, its name should also be included in the list (in both places in the code above), i.e. "(403\.html|403help\.php|bad-bot\.php|robots\.txt)$"
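[Editor's sketch: putting the suggestions above together, the access-control section might look like this. The 403help.php and bad-bot.php names are just the placeholders used in this thread, and the ErrorDocument path is an assumption about where the static error page lives.]

```apache
# Serve a small static 403 page to keep load and bandwidth down
ErrorDocument 403 /403.html

# Let anyone fetch the error pages, the bad-bot script, and robots.txt
SetEnvIf Request_URI "(403\.html|403help\.php|bad-bot\.php|robots\.txt)$" allowit

Order Deny,Allow
Deny from 63.****.xx.xx
Allow from env=allowit

RewriteEngine on

# Avoid re-redirecting requests for the universally-allowed pages
RewriteRule ^(403\.html|403help\.php|bad-bot\.php|robots\.txt)$ - [L]

# Block unwelcome user agents with a plain 403 Forbidden
RewriteCond %{HTTP_USER_AGENT} ^Alexibot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule .* - [F]
```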

Jim

tito

10:05 pm on Feb 6, 2004 (gmt 0)

10+ Year Member



Thank you Jim, I have now tested the final cut of my .htaccess and it works perfectly. I have also optimized the bots statement, and the size is now 16.5kB; next I'm going to add the exploit rewrites as well.

As per your suggestion my 403 is now very light: just a touch of PHP stating the visitor's IP (no more hide and seek :), the size is 1kB, and there's even an email (at)domain.net address which I use for catching spam.

About the RewriteRule ^(403\.php|robots\.txt)$ - [L]:
I thought there might be something like that, but I wasn't sure where; the "stop mod_rewrite processing with [L]" part I had no idea about, thanks.

Every unwelcome visitor is now going to my 1kB 403. Thanks again Jim for your great help.

tito

tito

12:32 pm on Feb 7, 2004 (gmt 0)

10+ Year Member



Please Jim, I have placed this rewrite:
RewriteCond %{REQUEST_URI} /(admin|cmd|httpodbc|nsiislog|root|shell)\.(dll|exe) [NC,OR]
which is supposed to also block .dll requests (Nimda, I guess), but I still get a 302 response on such requests, like this one:
www.domain.net 200.14.****.xx - - [07/Feb/2004:07:02:13 -0500] "GET /scripts/nsiislog.dll" 302 - "-" "-"

What did I miss? Thanks.

tito

jdMorgan

11:52 pm on Feb 8, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



tito,

That RewriteCond looks OK to me. So the problem is elsewhere. Either the RewriteRule is incorrect, you have omitted an [OR] somewhere else in your list, or some other problem exists. Omitting an [OR] is a common problem. If you do accidentally omit any [OR] in the list, it will usually disable the entire list, because mod_rewrite will default to its implicit [AND] function. Since a user-agent cannot have two values simultaneously, the RewriteRule will never be invoked.

Example:

Correct:


RewriteCond %{HTTP_USER_AGENT} ^badguy1 [OR]
RewriteCond %{HTTP_USER_AGENT} ^badguy2 [OR]
RewriteCond %{HTTP_USER_AGENT} ^badguy3 [OR]
RewriteCond %{HTTP_USER_AGENT} ^badguy4
RewriteRule !^custom403\.html$ - [F]

Incorrect:


RewriteCond %{HTTP_USER_AGENT} ^badguy1 [OR]
RewriteCond %{HTTP_USER_AGENT} ^badguy2 <- Missing [OR], defaults to "AND"
RewriteCond %{HTTP_USER_AGENT} ^badguy3 [OR]
RewriteCond %{HTTP_USER_AGENT} ^badguy4
RewriteRule !^custom403\.html$ - [F]

The correct example requires that the user-agent starts with "badguy1" OR "badguy2" OR "badguy3" OR "badguy4", so the RewriteRule will be invoked if the user-agent starts with any one of those character strings.

However, the incorrect example requires that the user-agent start with ("badguy1" OR "badguy2") AND ("badguy3" OR "badguy4") -- This is clearly an impossibility, since the user-agent can only have one value at a time, so the RewriteRule will never be invoked.

Another possible problem is that perhaps your last RewriteCond ("badguy4" in the examples above) has an [OR] on it, which will also cause the ruleset to malfunction.
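[Editor's sketch of that last failure mode, using the same hypothetical badguy names as above. The exact misbehaviour varies by Apache version, but the fix is always simply to drop the trailing flag:]

```apache
# Broken: the [OR] on the final condition leaves the OR-chain
# dangling, so the ruleset does not evaluate as intended.
RewriteCond %{HTTP_USER_AGENT} ^badguy1 [OR]
RewriteCond %{HTTP_USER_AGENT} ^badguy2 [OR]
RewriteRule !^custom403\.html$ - [F]

# Fixed: the last condition carries no [OR]
RewriteCond %{HTTP_USER_AGENT} ^badguy1 [OR]
RewriteCond %{HTTP_USER_AGENT} ^badguy2
RewriteRule !^custom403\.html$ - [F]
```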

Jim

tito

10:41 am on Feb 9, 2004 (gmt 0)

10+ Year Member



Hi Jim,

I've found these in the middle of my list:
RewriteCond %{HTTP_USER_AGENT} netcraft [NC]
RewriteCond %{REQUEST_URI} /sensepost\.exe [NC]
and changed [NC] to [OR] on both of them; the last RewriteCond has no [OR], as before.

Apparently the .htaccess is now working; I will check its behaviour on the next requests.

Thanks for your help
tito

tito

12:11 pm on Feb 12, 2004 (gmt 0)

10+ Year Member



Hello Jim,
as of today I'm still getting a 302 response on "GET /default.ida?X.." and "GET /scripts/nsiislog.dll"

the rewrite concerning these requests in my .htaccess is as follows:
RewriteCond %{REQUEST_URI} ^/default\.(ida|idq) [NC,OR]
RewriteCond %{REQUEST_URI} /(admin|cmd|httpodbc|nsiislog|root|shell)\.(dll|exe) [NC,OR]

I have checked my .htaccess and no [OR] is missing anywhere in my list, except on the last rewrite, which has to be without it.
Please, do you believe I should remove the [NC] on the above rewrites and leave only the [OR]?
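[Editor's note: RewriteCond lines only take effect through the RewriteRule that eventually follows them. A complete stanza for these exploit patterns might look like this sketch, with the final condition left without [OR] as discussed above:]

```apache
# Return 403 Forbidden for common IIS-exploit probes
# (Nimda / Code Red style requests); patterns from the thread above.
RewriteCond %{REQUEST_URI} ^/default\.(ida|idq) [NC,OR]
RewriteCond %{REQUEST_URI} /(admin|cmd|httpodbc|nsiislog|root|shell)\.(dll|exe) [NC]
RewriteRule .* - [F]
```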

jdMorgan

6:35 am on Feb 13, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That won't make any difference. You need to find out why you are getting a 302 redirect.
Look at your server error log file for more information.
Since you did not say where the 302 response redirected to, I can't offer any more than that.

Jim

tito

7:37 pm on Feb 14, 2004 (gmt 0)

10+ Year Member



I checked my logs to find where the 302 response redirected to; all I can see is:

www.domain1.net 216.****.xx.xx - - [12/Feb/2004:10:46:07 -0500] "GET /default.ida?XXX..XXXX%u909...78%u0000%u00=a HTTP/1.0" 302 621 "-" "-"

and also:

ns1.domain1.net 213.xxx.xxx.xxx - - [11/Feb/2004:15:35:41 -0500] "GET /scripts/nsiislog.dll" 302 - "-" "-"
www.domain1.net 213.xxx.xxx.xxx - - [11/Feb/2004:15:35:41 -0500] "GET /scripts/nsiislog.dll" 302 - "-" "-"
www.domain2.net 213.xxx.xxx.xxx - - [11/Feb/2004:15:35:41 -0500] "GET /scripts/nsiislog.dll" 302 - "-" "-"
www.domain3.org 213.xxx.xxx.xxx - - [11/Feb/2004:15:35:41 -0500] "GET /scripts/nsiislog.dll" 302 - "-" "-"

so the 302 is redirecting, but I can't see where; there is nothing else in my logs. Any idea? Thanks for your help. tito.

jdMorgan

8:56 pm on Feb 15, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



tito,

I'd suggest you use the Server headers checker [webmasterworld.com] to find out where these requests are redirected to.

It may be that your hosting company is redirecting these requests before your .htaccess file is processed; therefore, your .htaccess code can have no effect on these requests.
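[Editor's sketch: if you have access to the main server configuration, a rewrite trace can show exactly which rule issues the redirect. These directives are valid only in server or virtual-host context, not in .htaccess; the RewriteLog form is the Apache 1.3/2.0/2.2 syntax of this thread's era, and the log path assumes a writable logs/ directory.]

```apache
# httpd.conf (server or virtual-host context only; not valid in .htaccess)
RewriteLog      "logs/rewrite.log"
RewriteLogLevel 3

# On Apache 2.4+ the equivalent would instead be:
# LogLevel alert rewrite:trace3
```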

Jim