I found this script to block all FormMail requests.
RewriteCond %{REQUEST_URI} ^/(cgi\-bin/|cgi\-local/)\FormMail.(cgi|php|pl) [NC,OR]
RewriteCond %{REQUEST_URI} ^/(cgi\-bin/|cgi\-local/)\FormMail [NC,OR]
RewriteCond %{REQUEST_URI} ^/FormMail.(cgi|php|pl) [NC,OR]
RewriteCond %{REQUEST_URI} ^/FormMail [NC,OR]
RewriteCond %{REQUEST_URI} (mail.?form|form|form.?mail|mail|mailto)\.(cgi|exe|pl)$ [NC]
RewriteRule .* /cgi-bin/Formmail_trap.pl [L]
1) Is there anything missing?
2) Is it necessary to escape the "-" sign, as in cgi\-bin?
(Formmail_trap.pl is the famous trap.pl:
Ban malicious visitors with this Perl script [webmasterworld.com])
I can't - and don't - take credit for all the above "code" (let's hear it for community efforts!), but I think I was the one who introduced the "cgi\-bin" part (that is, escaping the dash with a backslash).
It has always been a habit of mine to escape all dashes when I mean a dash so that those who follow me will know that I mean an actual dash and not a range, even if, as in this case, there is no logical range. (A range from "i" to "b"? Huh?! That ain't logical...)
In your first & second RewriteCond, I don't think you need the backslash that appears before "FormMail"...
Since I'm not a Rewrite Wizard, I'd have to test the second RewriteCond you have... I do believe it would catch your call to the trap in your RewriteRule, creating a loop. The (second) RewriteCond is looking for /cgi-bin/FormMail and the trap's URL is /cgi-bin/Formmail_trap.pl...
If I'm wrong, someone will gently beat me with the correct answer, and you'll get to sit back and watch the fireworks...
RewriteCond %{REQUEST_URI} ^/FormMail [NC,OR]
RewriteCond %{REQUEST_URI} ^/FormMail\.(cgi|pl|php) [NC,OR]
RewriteCond %{REQUEST_URI} ^/cgi(\-local|\-bin)/FormMail [NC,OR]
RewriteCond %{REQUEST_URI} ^/cgi(\-local|\-bin)/FormMail\.(cgi|pl|php) [NC,OR]
---
Also beware about using a script to block IP's gathered from "attacks" on formmail...
...I went over my "Last 300 Visitors" and found this:
Host: 80.58.5.x Url: /cgi-bin/formmail.pl Http Code : 403
Date: Feb 02 04:42:34 Http Version: HTTP/1.0" Size in Bytes: 758
Referer: http://www.<SNIP>.com/ Agent: -
Host: 208.176.83.x Url: /cgi-bin/formmail.pl Http Code : 403
Date: Feb 02 04:42:26 Http Version: HTTP/1.0" Size in Bytes: 758
Referer: http://www.<SNIP>.com/ Agent: -
Host: 64.2.137.x Url: /cgi-bin/formmail.pl Http Code : 403
Date: Feb 02 04:42:24 Http Version: HTTP/1.0" Size in Bytes: 758
Referer: http://www.<SNIP>.com/ Agent: -
Host: 217.45.133.x Url: /cgi-bin/formmail.pl Http Code : 403
Date: Feb 02 04:42:23 Http Version: HTTP/1.0" Size in Bytes: 758
Referer: http://www.<SNIP>.com/ Agent: -
Host: 207.61.246.x Url: /cgi-bin/formmail.pl Http Code : 403
Date: Feb 02 04:42:22 Http Version: HTTP/1.0" Size in Bytes: 758
Referer: http://www.<SNIP>.com/ Agent: -
Host: 208.231.0.x Url: /cgi-bin/formmail.pl Http Code : 403
Date: Feb 02 04:42:20 Http Version: HTTP/1.1" Size in Bytes: 770
Referer: http://www.<SNIP>.com/ Agent: -
Host: 213.69.58.x Url: /cgi-bin/formmail.pl Http Code : 403
Date: Feb 02 04:42:09 Http Version: HTTP/1.1" Size in Bytes: 770
Referer: http://www.<SNIP>.com/ Agent: -
Host: 212.69.40.x Url: /cgi-bin/formmail.pl Http Code : 403
Date: Feb 02 04:42:08 Http Version: HTTP/1.0" Size in Bytes: 758
Referer: http://www.<SNIP>.com/ Agent: -
Host: 66.92.152.x Url: /cgi-bin/formmail.pl Http Code : 403
Date: Feb 02 04:42:01 Http Version: HTTP/1.1" Size in Bytes: 770
Referer: http://www.<SNIP>.com/ Agent: -
---
2 Things:
1) The Referer... I don't have any formmail script, so I find it interesting that my own site (domain replaced with <SNIP>) shows up as the referrer for the formmail requests.
2) Look at the times; rapid attack from various sources. I checked the IP's and they are US, Canadian and UK. I don't believe all these people decided to look for formmail.pl on my system within a total elapsed time of 33 seconds. Looks more like slaves infected with some type of trojan and forced to check for the formmail when the infector told them to (in hopes of using my site to send junk mail in rapid succession). Thus if I block them, I could be blocking innocent people who are unaware their computer is infected.
Regards!
---
[Edit] Removed last IP number and replaced with x to "hide" end-user... :)
The multiple sequential requests from different IPs are likely from either worm-infected "zombie" machines, or they are coming in through open proxies. You could try to capture and log {HTTP_VIA} and {HTTP_X_FORWARDED_FOR} with a script if you want to investigate.
Jim
[edited by: jdMorgan at 11:46 pm (utc) on Feb. 3, 2004]
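For anyone who wants to try the logging idea, here is a minimal Python sketch of a CGI-style helper. It assumes the script runs under CGI, where request headers arrive as HTTP_* environment variables (so Via becomes HTTP_VIA and X-Forwarded-For becomes HTTP_X_FORWARDED_FOR). The function names and the log-line layout are made up for illustration. Both headers are supplied by the client or proxy and can be forged, so treat them as hints rather than proof.

```python
import os


def proxy_headers(environ=os.environ):
    """Collect the proxy-related values a CGI script would see.

    Missing values are reported as "-", mirroring common access-log style.
    """
    return {
        "REMOTE_ADDR": environ.get("REMOTE_ADDR", "-"),
        "HTTP_VIA": environ.get("HTTP_VIA", "-"),
        "HTTP_X_FORWARDED_FOR": environ.get("HTTP_X_FORWARDED_FOR", "-"),
    }


def format_log_line(environ=os.environ):
    """Build one log line (hypothetical layout) from the request environment."""
    h = proxy_headers(environ)
    return "ip={REMOTE_ADDR} via={HTTP_VIA} xff={HTTP_X_FORWARDED_FOR}".format(**h)


if __name__ == "__main__":
    # In a real trap script this line would be appended to a log file.
    print(format_log_line())
```

A request that came through a proxy would typically show a non-empty via= or xff= field, while a direct hit from an infected machine would show "-" for both.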
Not aware of:
{HTTP_VIA} and {HTTP_X_FORWARDED_FOR}
I'd definitely be interested; as you already know, I'm working on a PHP file for that, but it would only give limited info (HTTP_REFERER, USER_AGENT, IP, etc.).
Thanks!
ln -s formmail.pl formmail.cgi
ln -s formmail.pl FormMail.cgi
ln -s formmail.pl FormMail.pl
FormMail.cgi -> formmail.pl
FormMail.pl -> formmail.pl
formmail.cgi -> formmail.pl
formmail.pl
That's what I did and it works fine. Of course, I haven't seen the variations that you test for.
I don't remember where I found this script. Perhaps it's yours, balam.
1) But do you think the script will still run correctly if you escape all the special characters?
If you write cgi\-bin, it doesn't mean cgi-bin but cgi + \ + - + bin, which is not correct.
2) You said I make a loop for all request /cgi-bin/Formmail if the RewriteRule is /cgi-bin/Formmail_trap.pl. Why?
3) Like you, decdim, I don't have a FormMail application. You are right: an innocent visitor may be infected with a trojan.
I'm going to send the request to an appropriate page instead.
4) What do you (everyone) think about the last condition?
RewriteCond %{REQUEST_URI} (mail.?form|form|form.?mail|mail|mailto)\.(cgi|exe|pl)$ [NC]
Since the \ (backslash) is an escape character (meaning that when you use a backslash you are saying "I mean the next character in the string literally, and not any special meaning it may have"), you would need to use two backslashes ("cgi\\-bin") if you wanted to match "cgi" followed by a "backslash" followed by a "dash" followed by "bin".
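A quick way to check the escaping rule outside Apache is Python's re module. This is only a sketch: it assumes Python's regex engine treats the backslash the same way Apache's does for patterns this simple, and the helper name `matches` is made up for the example.

```python
import re


def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None


# An escaped dash is just a literal dash, so this matches the normal path.
escaped_dash_hits_plain_path = matches(r"cgi\-bin", "/cgi-bin/formmail.pl")

# Two backslashes mean "a literal backslash", so the normal path no longer matches...
double_backslash_hits_plain_path = matches(r"cgi\\-bin", "/cgi-bin/formmail.pl")

# ...but a path that really contains a backslash before the dash does.
double_backslash_hits_odd_path = matches(r"cgi\\-bin", "/cgi\\-bin/formmail.pl")

print(escaped_dash_hits_plain_path,
      double_backslash_hits_plain_path,
      double_backslash_hits_odd_path)
```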
> You said I make a loop for all request /cgi-bin/Formmail if the RewriteRule is /cgi-bin/Formmail_trap.pl. Why?
Your second RewriteCond...
RewriteCond %{REQUEST_URI} ^/(cgi\-bin/|cgi\-local/)\FormMail [NC,OR]
...is looking for any request that starts off with "/cgi-bin/FormMail". Your trap script starts off with this very string: /cgi-bin/Formmail_trap.pl
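The overlap is easy to confirm with a small Python sketch. Two assumptions: re.IGNORECASE stands in for [NC], and the stray backslash before "FormMail" has been dropped (Python rejects \F as an unknown escape). The guard function is a hypothetical illustration of excluding the trap's own URL, not something taken from the posted rules.

```python
import re

NC = re.IGNORECASE  # stands in for the [NC] flag

# Second RewriteCond pattern, minus the stray backslash before "FormMail".
second_cond = re.compile(r"^/(cgi-bin/|cgi-local/)FormMail", NC)

TRAP_URL = "/cgi-bin/Formmail_trap.pl"


def would_rewrite(uri):
    """True if the second condition matches the requested URI."""
    return second_cond.search(uri) is not None


def would_rewrite_guarded(uri):
    """Same check, but never match the trap script itself (hypothetical guard)."""
    if uri == TRAP_URL:
        return False
    return would_rewrite(uri)


print(would_rewrite("/cgi-bin/formmail.pl"))   # the probe we want to catch
print(would_rewrite(TRAP_URL))                 # the trap's own URL also matches
print(would_rewrite_guarded(TRAP_URL))         # the guard excludes it
```

In mod_rewrite terms, the guard would correspond to an extra negated condition on the trap's URL placed ahead of the OR'ed conditions. Whether the unguarded version actually loops on a given server is something to test rather than assume.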
Jim didn't mention it this time, so I will: Using Regular Expressions [etext.lib.virginia.edu]
Text:
  .            Any single character
  [chars]      Character class: One of chars
  [^chars]     Character class: None of chars
  text1|text2  Alternative: text1 or text2

Quantifiers:
  ?            0 or 1 of the preceding text
  *            0 or N of the preceding text (N > 0)
  +            1 or N of the preceding text (N > 1)

Grouping:
  (text)       Grouping of text
               (either to set the borders of an alternative or
               for making backreferences where the Nth group can
               be used on the RHS of a RewriteRule with $N)

Anchors:
  ^            Start of line anchor
  $            End of line anchor

Escaping:
  \char        escape that particular char
               (for instance to specify the chars ".[]()" etc.)
So "_" and "-" are not among them. May I write:
RewriteCond %{HTTP_USER_AGENT} .*efp@gmx\.net* [OR]
rather than
RewriteCond %{HTTP_USER_AGENT} .*efp\@gmx\.net* [OR]
or
RewriteCond %{HTTP_USER_AGENT} .*Go!Zilla* [OR]
rather than
RewriteCond %{HTTP_USER_AGENT} .*Go\!Zilla* [OR]
or
RewriteCond %{REQUEST_URI} ^/cgi(-local|-bin)/FormMail [NC,OR]
rather than
RewriteCond %{REQUEST_URI} ^/cgi(\-local|\-bin)/FormMail [NC,OR]
You're right that "_" isn't, but "-" sometimes is, sometimes isn't - it depends on the context in which it is used. (Actually, that argument can be made - and is valid - for all the "special" characters.)
The "-" is used to denote a range of characters. Instead of typing "[0123456789]", you can type "[0-9]". "[a-zA-Z0-9]" matches (one of) 62 different characters: The alphabet in lower & uppercase and the numbers from 0 to 9.
As I said before, it's just a personal habit of mine to escape dashes when I mean dashes and not ranges. (It probably creates as much confusion for some as it reduces for others... :) In the future, I'll probably remove them when posting.)
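To see the "sometimes special, sometimes not" behaviour of the dash, here is a short Python sketch. It assumes Python's re module handles character classes the same way Apache's regex engine does for these cases. The variable names are only for illustration.

```python
import re

# Inside [...] an unescaped dash between two characters denotes a range.
alnum = re.compile(r"[a-zA-Z0-9]")
alnum_count = sum(1 for code in range(128) if alnum.fullmatch(chr(code)))

# Escaping the dash inside [...] turns it into a literal: only "a", "-" or "z".
literal_dash_class = re.compile(r"[a\-z]")
m_in_range = re.fullmatch(r"[a-z]", "m") is not None
m_in_literal_class = literal_dash_class.fullmatch("m") is not None
dash_in_literal_class = literal_dash_class.fullmatch("-") is not None

# Outside [...] the dash is always literal, escaped or not.
plain = re.search(r"cgi-bin", "/cgi-bin/") is not None
escaped = re.search(r"cgi\-bin", "/cgi-bin/") is not None

print(alnum_count, m_in_range, m_in_literal_class, dash_in_literal_class, plain, escaped)
```

The count of 62 is 26 lowercase letters, 26 uppercase letters and 10 digits.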
> May I write:
The first version of each of your RewriteConds is fine, with a couple of small caveats...
> RewriteCond %{HTTP_USER_AGENT} .*efp@gmx\.net* [OR]
> RewriteCond %{HTTP_USER_AGENT} .*Go!Zilla* [OR]
Both of these end with an "*". This means, using the first one as an example, that you are looking for "zero or more of any character" followed by "efp@gmx.ne" followed by "zero or more of the letter 't'". This is probably not what you want, and in both cases you'd be better off removing the final asterisk.
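The effect of that trailing asterisk can be checked with a small Python sketch (again assuming Python's re module behaves like Apache's regex engine for these patterns; the variable names are made up):

```python
import re

with_star = re.compile(r".*efp@gmx\.net*")   # pattern as posted
without_star = re.compile(r"efp@gmx\.net")   # trailing "*" and leading ".*" removed


def found(pattern, text):
    """True if the compiled pattern matches anywhere in the text."""
    return pattern.search(text) is not None


# The posted pattern accepts a string with no final "t" at all...
star_matches_truncated = found(with_star, "efp@gmx.ne")
# ...and any number of extra "t" characters.
star_matches_extra_t = found(with_star, "efp@gmx.nettttt")

# Without the asterisk, the full address is required but can sit anywhere.
plain_matches_truncated = found(without_star, "efp@gmx.ne")
plain_matches_inside = found(without_star, "Mozilla (efp@gmx.net) crawler")

print(star_matches_truncated, star_matches_extra_t,
      plain_matches_truncated, plain_matches_inside)
```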
Consider this:
Mozilla/4.0blablabla./Indy Library/v2.01/blablabla
What we want to exclude is Indy Library, whatever the version. For a RewriteRule to return a 403, we have to find that exact name inside a longer string.
In a RewriteCond, is there a difference between:
^Indy\ Library$
or
Indy\ Library
Is there an advantage to writing it one way or the other (is one faster to execute, or more reliable)?
> In a RewriteCond, is there a difference between:
> ^Indy\ Library$
> or
> Indy\ Library
The pattern "Indy\ Library" matches any string containing "Indy Library", including the exact string "Indy Library".
^ and $ are "start" and "end anchors", respectively. See this regular expressions tutorial [etext.lib.virginia.edu].
Specifying exact, fully-anchored patterns makes regular-expressions processing much faster, but in cases like this, you don't have a choice. A good pattern might be:
^Mozilla/.*Indy\ Library/
Jim
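As a rough check of the three patterns against the sample user-agent posted earlier, here is a Python sketch. It assumes Python's re.search behaves like an unanchored RewriteCond match for these patterns, and the dictionary keys are just labels for the example.

```python
import re

# Sample user-agent string from the question above.
ua = "Mozilla/4.0blablabla./Indy Library/v2.01/blablabla"

patterns = {
    "fully anchored": r"^Indy\ Library$",
    "unanchored": r"Indy\ Library",
    "start-anchored with prefix": r"^Mozilla/.*Indy\ Library/",
}

# True where the pattern is found in the sample user-agent.
results = {label: re.search(p, ua) is not None for label, p in patterns.items()}

for label, hit in results.items():
    print(f"{label:28s} {hit}")
```

The fully anchored pattern only matches a user-agent that is exactly "Indy Library", which is why it misses this string while the other two catch it.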
This part is very important for webmasters who find in their logs a USER_AGENT they don't want visiting.
A specific U_A comes in many versions, so listing each one in its own RewriteCond is unwieldy.
Moreover, the number and length of the strings containing "Indy Library" will slow down the server. Consequently, we need to pick out the specific U_A, whatever its version or compatibility string.
In our logs it is easy to extract this agent from a long string and write, as you said, simply:
^Mozilla/.* Indy\ Library \L
But rather than waiting until you have trapped every unwanted U_A, you will want to prevent their visits in advance. A lot of RewriteCond U_A lists can be found on WebmasterWorld. Unfortunately, not all of them are 100% foolproof, because the webmaster doesn't always know the correct syntax. When he writes:
^Indy Library$
He ought to write:
^Mozilla/.* Indy\ Library/
Also with a backslash in "Indy\ Library". But imagine we found the name of this U_A on the net and don't know exactly which characters come before and after it, or whether it stands alone. Since we want to include it in our list before it visits our site, we must find the best way to recognize it.
If we put:
.* Indy\ Library perhaps we are right.
If we put
^Indy\ Library $ we are wrong and Apache will not recognize it.
The question is: how can we write the pattern correctly without knowing the full name of the U_A?
I follow you, Jim, and if I'm not mistaken we can take this USER_AGENT as an example (it's not the real one) and try to find it with a RewriteCond. Suppose the full string containing "Indy Library" is:
Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET
If we write :
^Indy\ Library$ =we don't find it. (rest of the string is ignored)
^Indy\ Library =we don't find it (Indy Library-v.5 compatible I.E 5.5-NET)
Indy\ Library =we find it (Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET)
^.*Indy\ Library$ =we don't find it (Mozilla/v.01/ Indy Library)
^.*Indy\ Library.* =we find it (Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET)
^.*Indy\ Library.*$ =we find it (Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET)
.*Indy\ Library$ =we don't find it (Mozilla/v.01/ Indy Library)
.*Indy\ Library.* =we find it (Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET)
^.*Indy\ Library.* =we find it (Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET)
^Mozilla/.* Indy\ Library- =we find it (Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET) but in an other version the "-" can be change in "/"
.*Indy\ Library* =we find it. But in this case there is no sense in looking for zero or more of the letter "y" (Mozilla/v.01/ Indy Librar) and (y)
You have to use your imagination and make your pattern as broad as possible to catch "Indy Library".
It seems that ^.*Indy\ Library.*$ is a correct way to do it.
Any correction?
> It seems that ^.*Indy\ Library.*$ is a correct way to do.
There is no difference between the regular-expressions
Indy\ Library
and
^.*Indy\ Library.*$
Corrections:
^Indy\ Library$ =we don't find it. (exact match required)
^Indy\ Library =we do find it ("Indy Library-v.5 compatible I.E 5.5-NET" starts with specified pattern)
Indy\ Library =we find it ("Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET" contains specified pattern)
^.*Indy\ Library$ =we do find it ("Mozilla/v.01/ Indy Library" ends with specified pattern)
^.*Indy\ Library.* =we find it ("Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET" contains specified pattern)
^.*Indy\ Library.*$ =we find it ("Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET" contains specified pattern)
.*Indy\ Library$ =we do find it ("Mozilla/v.01/ Indy Library" ends with specified pattern)
.*Indy\ Library.* =we find it ("Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET" contains specified pattern)
^.*Indy\ Library.* =we find it ("Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET" contains specified pattern)
^Mozilla/.* Indy\ Library- =we find it (Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET) but in an other version the "-" can be change in "/" (so leave off the "-")
Ref: [etext.lib.virginia.edu...]
Jim
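The corrections above can be spot-checked with a short Python sketch. It assumes Python's re.search gives the same verdicts as an unanchored RewriteCond match for these patterns. The subject strings are the ones shown in parentheses in the corrections, and the names FULL, TAIL, HEAD and same_verdict are made up for the example.

```python
import re

FULL = "Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET"
TAIL = "Indy Library-v.5 compatible I.E 5.5-NET"
HEAD = "Mozilla/v.01/ Indy Library"

# (pattern, subject string, verdict given in the corrections)
CASES = [
    (r"^Indy\ Library$", FULL, False),
    (r"^Indy\ Library", TAIL, True),
    (r"Indy\ Library", FULL, True),
    (r"^.*Indy\ Library$", HEAD, True),
    (r"^.*Indy\ Library.*$", FULL, True),
    (r".*Indy\ Library$", HEAD, True),
    (r"^Mozilla/.* Indy\ Library-", FULL, True),
]


def found(pattern, subject):
    """True if the pattern matches anywhere in the subject string."""
    return re.search(pattern, subject) is not None


# Any case where the regex engine disagrees with the listed verdict.
mismatches = [(p, s) for p, s, expected in CASES if found(p, s) != expected]


def same_verdict(subject):
    """Check that the bare and the fully wrapped pattern agree on a subject."""
    return found(r"Indy\ Library", subject) == found(r"^.*Indy\ Library.*$", subject)


print("mismatches:", mismatches)
print("bare and wrapped agree:", all(same_verdict(s) for s in (FULL, TAIL, HEAD, "Mozilla/4.0")))
```

Since the bare pattern and the version wrapped in ^.* and .*$ agree on every sample, the shorter form is the easier one to maintain in a long RewriteCond list.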