I'm trying to ban sites by domain name, since there have been lots of referrer spammers recently.
I have, for example, the rule:
RewriteCond %{HTTP_REFERER} ^http://(www\.)?.*stuff.*\.com/.*$ [NC]
RewriteRule ^.*$ - [F,L]
which should ban any sites containing the word "stuff"
www.stuff.com
www.whatkindofstuff.com
www.some-other-stuff.com
and so on.
However, it is not working, so I am sure I did not set up a proper pattern-match rule. Anyone care to advise?
[edited by: jatar_k at 5:06 am (utc) on May 20, 2003]
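For what it's worth, the pattern as posted does match all three example hosts as long as the referrer starts with http:// and has a slash after .com, so it may be worth checking what the logged referrers actually look like. The Python sketch below is only a test harness, not Apache itself. It assumes Python's re module behaves like mod_rewrite's regex engine for a pattern this simple; the LOOSER pattern and the sample referrers are hypothetical and not from the thread.

```python
import re

# Pattern from the post; re.IGNORECASE stands in for the [NC] flag.
ORIGINAL = re.compile(r"^http://(www\.)?.*stuff.*\.com/.*$", re.IGNORECASE)

# Hypothetical alternative (an assumption, not from the thread): allows https,
# needs no trailing slash, and only looks at the host part of the referrer.
LOOSER = re.compile(r"^https?://[^/]*stuff[^/]*\.com(/|$)", re.IGNORECASE)


def matches(pattern, referer):
    """Return True if the pattern matches the referrer string."""
    return pattern.search(referer) is not None


if __name__ == "__main__":
    samples = [
        "http://www.stuff.com/",
        "http://www.whatkindofstuff.com/page.html",
        "http://www.some-other-stuff.com/",
        "http://www.stuff.com",    # no trailing slash
        "https://www.stuff.com/",  # different scheme
    ]
    for referer in samples:
        print(referer, matches(ORIGINAL, referer), matches(LOOSER, referer))
```

If the pattern checks out against real referrers from the log, the problem is more likely elsewhere, for example RewriteEngine not being switched on or overrides not being allowed for .htaccess.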
# Block libwww-perl except from AltaVista, Inktomi, and IA Archiver
RewriteCond %{HTTP_USER_AGENT} ^libwww-perl/[0-9] [NC]
RewriteCond %{REMOTE_ADDR} !^209\.73\.(1[6-8][0-9]|19[01])\.
RewriteCond %{REMOTE_ADDR} !^209\.131\.(3[2-9]|[45][0-9]|6[0-3])\.
RewriteCond %{REMOTE_ADDR} !^209\.237\.23[2-5]\.
RewriteRule !^err403\.htm$ - [F]
# Block Java and Python URLlib except from Google
RewriteCond %{HTTP_USER_AGENT} ^(Python.urllib|Java/?[1-9]\.[0-9]) [NC]
RewriteCond %{REMOTE_ADDR} !^216\.239\.(3[2-9]|[45][0-9]|6[0-3])\.
Can anyone tell me why the first hit gets a 200 and the second a 404? And what do I need to do to correct it so both are 404?
65.49.178.17 - - [17/Sep/2003:10:40:52 -0400] "GET /xxx.htm HTTP/1.1" 200 14724 "-" "xxxxxxxxx_xxxxxxxx/0.1 libwww-perl/5.65"
65.49.178.17 - - [17/Sep/2003:10:34:02 -0400] "GET /xxxxxx/- HTTP/1.1" 404 7550 "-" "xxxxxxxxx_xxxxxxxx/0.1 libwww-perl/5.65"
thanks
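As a side check on the address exclusions in the rules above, here is a small Python sketch that lists which third octets each REMOTE_ADDR pattern actually covers. It assumes the broken-bar characters in the post stand for the regex alternation bar, and the helper name third_octets is made up for this example.

```python
import re

# Exclusion patterns from the post, with | as the alternation bar.
RANGES = {
    "209.73": re.compile(r"^209\.73\.(1[6-8][0-9]|19[01])\."),
    "209.131": re.compile(r"^209\.131\.(3[2-9]|[45][0-9]|6[0-3])\."),
    "209.237": re.compile(r"^209\.237\.23[2-5]\."),
}


def third_octets(prefix):
    """Return every third-octet value (0-255) that the pattern for prefix accepts."""
    pattern = RANGES[prefix]
    return [octet for octet in range(256) if pattern.match(f"{prefix}.{octet}.1")]


if __name__ == "__main__":
    for prefix in RANGES:
        octets = third_octets(prefix)
        print(prefix, octets[0], "to", octets[-1])
```

Under those assumptions the three patterns cover 209.73.160 to 209.73.191, 209.131.32 to 209.131.63, and 209.237.232 to 209.237.235.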
I'd guess that there are no restrictions imposed on 65.49.178.17. The 404 was due to the fact that the document being retrieved was /xxxxxx/-, which is a malformed address. The 200 was due to the fact that /xxx.htm existed, and there were no access restrictions on 65.49.178.17.
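I'm guessing you want both requests to return a 403 status code (Forbidden). In that case, I'd change this line:
RewriteCond %{HTTP_USER_AGENT} ^libwww-perl/[0-9] [NC]
to this:
RewriteCond %{HTTP_USER_AGENT} libwww-perl/[0-9] [NC]
The leading ^ anchors the match to the start of the user-agent string, and in your log entries libwww-perl appears after other text, so the anchored pattern never matches.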
If you really do want to send back a 404, you'd have to modify the RewriteRule to use the R flag with a status code of 404. Since you want to block unwanted users, though, my guess was that you actually wanted to send back a 403.
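To see the effect of the ^ anchor, here is a quick Python sketch that runs both patterns against the user agent from the log entries. It is only an illustration: it assumes Python's re module treats these simple patterns the same way mod_rewrite does, with re.IGNORECASE standing in for [NC].

```python
import re

# User agent exactly as it appears in the two log entries.
USER_AGENT = "xxxxxxxxx_xxxxxxxx/0.1 libwww-perl/5.65"

# Original condition: anchored to the start of the string.
ANCHORED = re.compile(r"^libwww-perl/[0-9]", re.IGNORECASE)
# Suggested condition: matches anywhere in the string.
UNANCHORED = re.compile(r"libwww-perl/[0-9]", re.IGNORECASE)


def is_blocked(pattern, user_agent):
    """Return True if the condition pattern matches the user agent."""
    return pattern.search(user_agent) is not None


if __name__ == "__main__":
    print("anchored:", is_blocked(ANCHORED, USER_AGENT))
    print("unanchored:", is_blocked(UNANCHORED, USER_AGENT))
```

A user agent that begins with libwww-perl/ is caught by both versions, so dropping the anchor only widens the match.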
I looked and looked at the code and it just didn't sink in and I completely missed the "-". Could be all the problems with my site/host over the last weeks and I'm brain dead from looking for problems - or - it could be I'm just as blind as a bat.
Either case, thank you so much for the clear explanation! :)
In my .htaccess file I have applied all of the suggestions found throughout this thread and everything is working fine.
One of the thorns in my side has been FormMail Phishers, so I have taken the now-famous Trap.pl, customized it and renamed it "formmail.pl," allowed that file in my RewriteRule and it works fine to ban Phishers. Most Phishers come at me about 8 to 10 times in a row, with various spellings, extensions, and directory names, but have always been caught in my trap when they type in "formmail.pl"
However, while reading yesterday's web log I found a FormMail phisher that evaded my trap by only looking for variations of this exact spelling: cgi-bin/FormMail.pl (and .cgi). My ban-bad-bots trap is named formmail.pl and was not triggered because it is all lowercase, but he did get 403s from my RewriteCond for
form.?mail [nc,or]
I tried adding this line to my .htaccess, but it does not redirect the request to formmail.pl:
RedirectMatch permanant cgi-bin/FormMail\.pl cgi-bin/formmail.pl
Can anybody help me straighten out the error so I can forward requests for "FormMail.pl" to "formmail.pl"? If I figure it out first I will post the working code line later.
TIA, Wiz
Wizcrafts: You should replace permanant with permanent. You could also use 301 instead.
Here is the applicable RewriteCond and RewriteRule affecting FormMail:
RedirectMatch 301 cgi-bin/FormMail\.pl cgi-bin/formmail.pl
Options +FollowSymLinks
RewriteEngine On
RewriteCond %{REQUEST_URI} formmail\.(cgi|php)$ [NC]
RewriteRule !^(includes/403\.html|cgi-bin/MKCounter\.cgi|robots\.txt|contact-info\.html|kissthis\.html|cgi-bin/contact-info\.cgi|cgi-bin/contact-list\.pl|cgi-bin/banbadbots\.cgi|cgi-bin/formmail\.pl|cgi-bin/FormMail\.pl|bait/honeypot\.html|bait/\w*\.html|bait/contact-info\.cgi) - [F]
I figured it out myself!
When a request comes for a file in my cgi-bin and I tried to redirect that to cgi-bin/formmail.pl, I was actually telling the searcher to look in cgi-bin/cgi-bin/ for formmail.pl. I got it to work by dropping the cgi-bin/ in the destination file!
Wiz
[edited by: Wizcrafts at 7:59 pm (utc) on Sep. 18, 2003]
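The doubled cgi-bin/cgi-bin/ path is what ordinary relative-URL resolution produces. The Python sketch below shows this with the standard library's urljoin. It assumes the redirect target reaches the client unchanged and that the client resolves it against the URL it originally requested; example.com is a placeholder host, not the real site.

```python
from urllib.parse import urljoin

# Hypothetical request URL; only the path matters for this illustration.
REQUEST_URL = "http://example.com/cgi-bin/FormMail.pl"


def resolve(target):
    """Resolve a redirect target the way a client resolves a relative reference."""
    return urljoin(REQUEST_URL, target)


if __name__ == "__main__":
    for target in ("cgi-bin/formmail.pl", "formmail.pl", "/cgi-bin/formmail.pl"):
        print(target, "->", resolve(target))
```

A target with a leading slash resolves from the site root, so it does not depend on which directory the request came from.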
From [httpd.apache.org ]:
RedirectMatch
Syntax: RedirectMatch [status] regex URL
I was going to say that you should change your third input to something that starts with http://.
Added: Never mind. I got confused between URL and URI.
[edited by: closed at 8:08 pm (utc) on Sep. 18, 2003]
Options +FollowSymLinks
RewriteEngine On
RewriteCond %{REQUEST_URI} (.?mail.?form|form|(GM)?form.?.?mail|.?mail)(2|to)?\.?(asp|cgi|exe|php|pl|pm)?$ [NC]
RewriteRule .* /path-to/bad-bot-script.pl [L]
If your bad-bot script bans them anyway, it seems odd to have an additional level of banning. It would be more efficient to just dump them directly in the trap, getting them banned instantly.
My condition for formmail catches a few more than the one you posted; it's documented by balam here (msg #6): [webmasterworld.com...]
/claus
[edited by: claus at 8:05 pm (utc) on Sep. 18, 2003]
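Before deploying a condition this broad, it may help to see which request URIs it catches. The Python sketch below tests the REQUEST_URI pattern from the post above against a few sample paths. It assumes the broken-bar characters stand for the regex alternation bar and that Python's re module is close enough to mod_rewrite for this pattern; the sample URIs are made up.

```python
import re

# REQUEST_URI pattern from the post; re.IGNORECASE stands in for [NC].
FORM_MAIL = re.compile(
    r"(.?mail.?form|form|(GM)?form.?.?mail|.?mail)(2|to)?\.?(asp|cgi|exe|php|pl|pm)?$",
    re.IGNORECASE,
)


def is_caught(uri):
    """Return True if the request URI would satisfy the condition."""
    return FORM_MAIL.search(uri) is not None


if __name__ == "__main__":
    samples = [
        "/cgi-bin/FormMail.pl",
        "/cgi-bin/formmail.cgi",
        "/cgi-bin/mailto.exe",
        "/index.html",
        "/platform",       # ends in "form", so it is caught as well
        "/formmail.html",  # .html is not in the extension list
    ]
    for uri in samples:
        print(uri, is_caught(uri))
```

Since the extension is optional, any URI that simply ends in "form" or "mail" is caught too, so legitimate pages with such names would need to be excluded or renamed.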
RedirectMatch cgi-bin/FormMail.pl formmail.pl
This stuff can drive you nuts trying to get exact paths and syntax. I read the Apache docs and still had to figure it out by trial and error (heavy on the error side).
Thanks Claus, I'll try to implement that.
Wiz
Does anybody know how to edit the trap script to only add an IP address once, forever, no matter how many times they land on the ban script?
Here is the banning script section in question:
# trap.pl: upload in ASCII mode and CHMOD 755.
# This is the only variable that needs to be modified. Replace it with the absolute path to your root directory.
$rootdir = "$ENV{DOCUMENT_ROOT}";
# Grab the IP of the bad bot
$visitor_ip = $ENV{'REMOTE_ADDR'};
$visitor_ip =~ s/\./\\\./gi;
# Open .htaccess file
open(HTACCESS,"".$rootdir."/\.htaccess") || die $!;
@htaccess = <HTACCESS>;
close(HTACCESS);
# Write banned IP to .htaccess file
open(HTACCESS,">".$rootdir."/\.htaccess") || die $!;
print HTACCESS "SetEnvIf Remote_Addr \^".$visitor_ip."\$ ban\n";
foreach $deny_ip (@htaccess) {
print HTACCESS $deny_ip;
}
close(HTACCESS);
I think the multiple listings occur because I have allowed access to formmail.pl and my trap script in my master rewrite rule. Maybe I can now remove that allowance since the formmail rule is totally separate. I'll see.
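One way to get a single entry per address is to check whether the ban line is already in the file before writing it. The sketch below shows that logic in Python rather than Perl, purely as an illustration; it is not the trap script itself, and the function names ban_line and add_ban are made up for the example.

```python
def ban_line(ip):
    """Build the SetEnvIf line for an address, escaping dots as the trap script does."""
    escaped = ip.replace(".", "\\.")
    return "SetEnvIf Remote_Addr ^" + escaped + "$ ban\n"


def add_ban(lines, ip):
    """Return the .htaccess lines with the ban prepended, unless it is already there."""
    line = ban_line(ip)
    if line in lines:
        return lines          # already banned: leave the file contents unchanged
    return [line] + lines     # new address: put the ban at the top, as trap.pl does


if __name__ == "__main__":
    htaccess = ["RewriteEngine On\n"]
    htaccess = add_ban(htaccess, "65.49.178.17")
    htaccess = add_ban(htaccess, "65.49.178.17")  # second hit adds nothing
    print("".join(htaccess), end="")
```

In the Perl script, the same check could probably be done by searching @htaccess for the new line (for example with grep) between reading the file and reopening it for writing.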