I'm trying to ban sites by domain name, since there have been a lot of referrer spammers recently.
I have, for example, the rule:
RewriteCond %{HTTP_REFERER} ^http://(www\.)?.*stuff.*\.com/.*$ [NC]
RewriteRule ^.*$ - [F,L]
which should ban any referring site whose domain contains the word "stuff", for example:
www.stuff.com
www.whatkindofstuff.com
www.some-other-stuff.com
and so on.
However, it is not working, so I am sure I did not set up a proper pattern-match rule. Would anyone care to advise?
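For anyone who wants to try a referrer pattern offline before putting it on a live server, here is a minimal Python sketch. It assumes Python's re module behaves like Apache's regex engine for these simple patterns, and that [NC] corresponds to re.IGNORECASE. The TIGHTER pattern and the sample URLs are hypothetical illustrations, not something posted in this thread, and the sketch does not diagnose why the rule above failed on the poster's server.

```python
import re

# The pattern from the RewriteCond above; [NC] is modelled with re.IGNORECASE.
ORIGINAL = re.compile(r"^http://(www\.)?.*stuff.*\.com/.*$", re.IGNORECASE)

# Hypothetical tighter variant (an assumption, not from the thread):
# it only looks at the host part, accepts http or https, and does not
# require a trailing slash after ".com".
TIGHTER = re.compile(r"^https?://[^/]*stuff[^/]*\.com(/|$)", re.IGNORECASE)


def matches(pattern, referer):
    """Return True if the referrer string would satisfy the condition."""
    return pattern.search(referer) is not None


if __name__ == "__main__":
    samples = [
        "http://www.stuff.com/",
        "http://www.whatkindofstuff.com/page.html",
        "http://www.some-other-stuff.com/",
        "http://www.stuff.com",               # no trailing slash
        "https://www.stuff.com/",             # https scheme
        "http://example.org/?q=stuff.com/",   # "stuff" only in the query
    ]
    for referer in samples:
        print(referer, matches(ORIGINAL, referer), matches(TIGHTER, referer))
```

In this sketch the original pattern does match the three example domains when the referrer ends in a slash or a path, but it misses a referrer with no trailing slash or with https, and it also matches a URL that merely mentions "stuff.com/" in its query string.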
[edited by: jatar_k at 5:06 am (utc) on May 20, 2003]
In the meantime I am happy with the uncompressed formmail ban ruleset. It goes by matching the filename and extension, whereas the compressed version doesn't care about extensions.
This time I'll test it in my experimental directory so as to not ban myself or others.
Wiz
My problem, unique to my website, was that only the long rule-set blocked requests for variations of Form and/or Mail scripts while still allowing HTML pages with similar names to pass through and be displayed (the desired action). The compressed version also blocked requests for pages of mine such as formmailwarning.html, because the name contains the strings "form" and "mail" and the rule does not check the filename extension before blocking.
I researched the problem and after much testing I have come up with a solution that causes the compressed version to behave the same as the long version.
Balam's FormMail Request RewriteRules (157 bytes):
RewriteCond %{REQUEST_URI} (.?mail.?form¦form¦(GM)?form.?.?mail¦.?mail)(2¦to)?\.?(asp¦cgi¦exe¦php¦pl¦pm)?$ [NC]
RewriteRule .* /path_to/trap_script.pl [L]
jbMorgan's compressed rules (75 bytes):
RewriteRule (form.*mail¦mail.*(form¦to¦2)) /path_to/trap_script.pl [NC,L]
jbMorgan's compressed rules with Wizcrafts modification (121 bytes):
RewriteCond %{REQUEST_URI} !.*\.(html¦js¦css)
RewriteRule (form.*mail¦mail.*(form¦to¦2)) /path_to/trap_script.pl [NC,L]
If you carefully compare the three rule-sets you can see that
1) The first set blocks requests by matching filenames together with known dangerous file extensions.
2) The second one blocks by prefix match only, with no concern for what the extension is.
3) The third rule-set allows filenames with the desired extensions to pass, but blocks prefix matches on any other extensions.
While the ? at the end of the first rule-set means that the extensions are optional, it works for me as desired nonetheless. The third, compressed rule-set accomplishes the same result with a savings of 36 bytes. If you do not have any filenames on your server that include "form" or "mail," you can leave off the RewriteCond line, as per rule-set #2, and it will block bad guys just fine, with a savings of an additional 46 bytes (82 bytes total saved).
As usual, anybody copying and pasting this code should replace the broken pipe characters with solid pipes from their own keyboard.
Submitted IMHO, after testing on my server with files that exist, and otherwise, by Wiz
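To compare the three rule-sets side by side without touching a server, the following is a rough Python sketch. It assumes Python's re module treats these patterns the same way Apache does, uses solid pipes in place of the broken ones, and models [NC] with re.IGNORECASE. The function names and sample URIs are made up for illustration.

```python
import re

NC = re.IGNORECASE  # stands in for the [NC] flag

# Rule-set 1: the long RewriteCond pattern, tested against REQUEST_URI.
LONG = re.compile(
    r"(.?mail.?form|form|(GM)?form.?.?mail|.?mail)"
    r"(2|to)?\.?(asp|cgi|exe|php|pl|pm)?$",
    NC,
)

# Rule-sets 2 and 3: the compressed RewriteRule pattern. It is an
# unanchored substring match, so a leading slash makes no difference here.
COMPRESSED = re.compile(r"(form.*mail|mail.*(form|to|2))", NC)

# Rule-set 3 only: the negated RewriteCond. It has no [NC] flag in the
# post, so it is modelled as case-sensitive.
EXEMPT = re.compile(r".*\.(html|js|css)")


def blocked_long(uri):
    return LONG.search(uri) is not None


def blocked_compressed(uri):
    return COMPRESSED.search(uri) is not None


def blocked_modified(uri):
    # The rule fires only when the URI is NOT exempt and the name matches.
    return EXEMPT.search(uri) is None and COMPRESSED.search(uri) is not None


if __name__ == "__main__":
    for uri in ("/cgi-bin/formmail.pl", "/formmailwarning.html", "/contact.html"):
        print(uri, blocked_long(uri), blocked_compressed(uri), blocked_modified(uri))
```

With these sample URIs, all three rule-sets trap /cgi-bin/formmail.pl, only the plain compressed rule traps /formmailwarning.html, and none of them touches /contact.html.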
<Files *>
order allow,deny
allow from all
allow from env=allowit
deny from env=ban
deny from 12.219.232.74
deny from 24.53.200.12
<snip>
</Files>
However, after running a lot of tests I realized that the order in my <Files> block was wrong: the allowit rules were not taking effect at all. Here is how I corrected that problem:
<Files *>
order deny,allow
deny from env=ban
deny from 12.219.232.74
deny from 24.53.200.12
(snip)
allow from env=allowit
# allow from all # apparently not needed
</Files>
With this sequencing I am able to ban unwanted visitors, e.g. SetEnvIf Remote_Addr ^216\.229\.194\.253$ ban, while allowing good guys, like myself or Wannabrowser, to continue to access my website even if they hit a banned script (on purpose). The allowit rule for Wannabrowser is: SetEnvIf Remote_Addr ^206\.194\.114\.2$ allowit. By placing "allow from env=allowit" at the bottom of the "deny from" rule-set, we can ensure that friends are not inadvertently banned. Note that with this order (order deny,allow), the deny rules are processed first, then the allow rules kick in. Anything not specifically banned (or that is allowed by "allowit") is allowed through this rule-set; "allowit" overrides "ban" for the same IP address.
Lastly, if you have a custom 403 page and a robots.txt that you want banned visitors to be able to see, you have to add this allowit line:
SetEnvIf Request_URI (/includes/403\.html¦/robots\.txt)$ allowit
Do NOT put a ^ in front of the pattern. It will prevent your custom 403 page from being accessed and cause an error message to appear in the generic 403 message that is displayed instead. That was my experience, and removing the ^ fixed it.
Wiz
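The post does not say exactly where the ^ was placed or what the requested URI looked like, so the following Python sketch shows only one possible reading: without a leading ^ the pattern is a suffix match, while an anchored version accepts only those exact paths from the site root. The anchored pattern and the /mysite/ prefix are assumptions for illustration, and solid pipes are used in place of the broken ones.

```python
import re

# The posted pattern: anchored at the end only, so it matches any URI
# that ENDS with one of the two paths.
SUFFIX = re.compile(r"(/includes/403\.html|/robots\.txt)$")

# Hypothetical anchored variant (an assumption about what "a ^ in front"
# might have looked like): the whole URI must equal one of the two paths.
ANCHORED = re.compile(r"^(/includes/403\.html|/robots\.txt)$")


def sets_allowit(pattern, request_uri):
    """Return True if SetEnvIf would set allowit for this URI."""
    return pattern.search(request_uri) is not None


if __name__ == "__main__":
    for uri in ("/robots.txt", "/includes/403.html", "/mysite/includes/403.html"):
        print(uri, sets_allowit(SUFFIX, uri), sets_allowit(ANCHORED, uri))
```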
Note that this rule-set only applies to Files, not folders.
One neat thing about Order is that only the setting itself (allow,deny or deny,allow) is important. You can list your Deny from and Allow from directives in any order; they will be processed as dictated by the Order directive, not by the order in which they appear. So, it is unnecessary to re-arrange your denies and allows just because you change your Order directive. This can make the code block easier to maintain and document, too.
Part of the confusion is caused by the Order name itself. It really indicates the precedence or priority between Allow and Deny directives and has nothing to do with the listing order of denies and allows.
Jim
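The precedence described above can be modelled in a few lines. This is a simplified Python sketch of how the Order directive combines the results of the Allow and Deny lists as documented for Apache's allow/deny access control; it is not Apache code. The function names are hypothetical, and the two blocks are reduced to the ban and allowit variables only (the literal IP denies are left out).

```python
def access_allowed(order, allow_matches, deny_matches):
    """Combine Allow/Deny results according to the Order setting.

    allow_matches / deny_matches say whether ANY Allow or Deny directive
    matched the request; their listing order in the file plays no part.
    """
    order = order.replace(" ", "").lower()
    if order == "allow,deny":
        # Deny has the last word, and the default is to deny.
        return allow_matches and not deny_matches
    if order == "deny,allow":
        # Allow has the last word, and the default is to allow.
        return allow_matches or not deny_matches
    raise ValueError("unsupported Order value: " + order)


def first_block(env):
    """order allow,deny with 'allow from all' and 'deny from env=ban'."""
    return access_allowed("allow,deny", True, "ban" in env)


def corrected_block(env):
    """order deny,allow with 'deny from env=ban', 'allow from env=allowit'."""
    return access_allowed("deny,allow", "allowit" in env, "ban" in env)


if __name__ == "__main__":
    for env in (set(), {"ban"}, {"ban", "allowit"}):
        print(sorted(env), first_block(env), corrected_block(env))
```

Under this model a visitor carrying both ban and allowit is refused by the first block and admitted by the corrected one, which is consistent with the allowit rules having no effect until the Order setting was changed.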
it is unnecessary to re-arrange your denies and allows just because you change your Order directive.
Thanks for that info Jim. I will keep on listing things in logical order anyway, because I am used to doing things that way.
Oh, BTW, I added email notification to my trap script. It includes the IP address, the time banned, the filename and path request that triggered the script, the User Agent and the Method (GET or POST) used. I also tried to add an http_referer field but it causes 500 server errors when I include it in the Perl script, so it is out.
Wiz
That's strange... Here's the syntax I'm using to write a log file that includes HTTP_REFERER. If you've got something different, maybe something like this might help.
$reqmthd = $ENV{'REQUEST_METHOD'};
$reqhost = $ENV{'HTTP_HOST'};
$requri = $ENV{'REQUEST_URI'};
$referer = $ENV{'HTTP_REFERER'};
$usragnt = $ENV{'HTTP_USER_AGENT'};
<snip>
print HTMLOG ("<br><b>$remaddr</b> banned $date $reqmthd $reqhost$requri \"$referer\" \"$usragnt\"\n");
Putting $referer = $ENV{'HTTP_REFERER'}; in the script and calling it for inclusion in the email, while allowing the email to be sent, still gives a 500 server error upon exiting. Here is my code for the email function, borrowed from one posted by another forumite a while ago:
$remreq = $ENV{REQUEST_URI};
$remaddr = $ENV{REMOTE_ADDR};
$usragnt = $ENV{HTTP_USER_AGENT};
$remmeth = $ENV{REQUEST_METHOD};
$remhost = $ENV{HTTP_HOST};
$referer = $ENV{'HTTP_REFERER'};
$date = scalar localtime(time);
open(MAIL, "¦/usr/sbin/sendmail -t") ¦¦ die "Content-type: text/text\n\nCan't open /usr/sbin/sendmail!";
print MAIL "To: xxx\@xxx\.xxx\n";
print MAIL "From: xxx\@xxx\.xxx\n";
print MAIL "Subject: You caught another one!\n\n"; # the extra \n adds the blank line that separates mail headers from the body
print MAIL "The ip address: $remaddr was banned on $date \n";
print MAIL "The file requested was: $remreq\n";
print MAIL "The method used was: $remmeth\n";
print MAIL "The intruder's user agent was: $usragnt\n";
print MAIL "The remote host was: $remhost\n";
print MAIL "The referrer was: $referer\n";
# The above line's referer variable causes a 500 server error
close(MAIL);
exit;
The results are emailed to me with a blank referrer variable, and I get a 500 server error on screen.
Wiz
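The snippet alone does not show what causes the 500 error, so the following is only a sketch of two general precautions, written in Python rather than Perl: treat HTTP_REFERER as optional, since it is absent whenever the client sends no Referer header, and make sure a CGI script prints a response header before it exits. The function names, addresses and sample values are hypothetical, and whether either point applies to the script above is an assumption.

```python
from email.message import EmailMessage


def build_ban_notice(env, when):
    """Build the notification mail from a CGI-style environment mapping."""
    # HTTP_REFERER is missing when no Referer header was sent, so fall
    # back to a placeholder instead of assuming it is set.
    referer = env.get("HTTP_REFERER") or "(none sent)"
    msg = EmailMessage()
    msg["To"] = "admin@example.com"    # placeholder address (assumption)
    msg["From"] = "trap@example.com"   # placeholder address (assumption)
    msg["Subject"] = "You caught another one!"
    # EmailMessage inserts the blank line between headers and body itself.
    msg.set_content("\n".join([
        f"The ip address: {env.get('REMOTE_ADDR', 'unknown')} was banned on {when}",
        f"The file requested was: {env.get('REQUEST_URI', 'unknown')}",
        f"The method used was: {env.get('REQUEST_METHOD', 'unknown')}",
        f"The intruder's user agent was: {env.get('HTTP_USER_AGENT', 'unknown')}",
        f"The referrer was: {referer}",
    ]))
    return msg


def cgi_response(text):
    """Return a minimal CGI reply: header line, blank line, then the body."""
    return "Content-Type: text/plain\r\n\r\n" + text


if __name__ == "__main__":
    sample_env = {"REMOTE_ADDR": "192.0.2.1", "REQUEST_URI": "/cgi-bin/formmail.pl"}
    print(build_ban_notice(sample_env, "Tue May 20 05:06:00 2003").as_string())
    print(cgi_response("Access denied.\n"))
```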