Forum Moderators: phranque

.htaccess blocking FormMail requests

How to block all FormMail requests

         

Maleville

11:23 pm on Feb 2, 2004 (gmt 0)

10+ Year Member



Hello everybody

I found this set of rules to block all FormMail requests.

RewriteCond %{REQUEST_URI} ^/(cgi\-bin/|cgi\-local/)\FormMail.(cgi|php|pl) [NC,OR]
RewriteCond %{REQUEST_URI} ^/(cgi\-bin/|cgi\-local/)\FormMail [NC,OR]
RewriteCond %{REQUEST_URI} ^/FormMail.(cgi|php|pl) [NC,OR]
RewriteCond %{REQUEST_URI} ^/FormMail [NC,OR]
RewriteCond %{REQUEST_URI} (mail.?form|form|form.?mail|mail|mailto)\.(cgi|exe|pl)$ [NC]
RewriteRule .* /cgi-bin/Formmail_trap.pl [L]

1) Is there anything missing?
2) Is it necessary to escape the "-" sign, as in cgi\-bin?

(Formmail_trap.pl is the famous trap.pl:
Ban malicious visitors with this Perl script [webmasterworld.com])

balam

12:34 am on Feb 3, 2004 (gmt 0)

10+ Year Member



Hi Maleville,

I can't - and don't - take credit for all the above "code" (let's hear it for community efforts!), but I think I was the one who introduced the "cgi\-bin" part (that is, escaping the dash with a backslash).

It has always been a habit of mine to escape all dashes when I mean a dash so that those who follow me will know that I mean an actual dash and not a range, even if, as in this case, there is no logical range. (A range from "i" to "b"? Huh?! That ain't logical...)

In your first & second RewriteCond, I don't think you need the backslash that appears before "FormMail"...

Since I'm not a Rewrite Wizard, I'd have to test the second RewriteCond you have... I do believe it would catch your call to the trap in your RewriteRule, creating a loop. The (second) RewriteCond is looking for /cgi-bin/FormMail and the trap's URL is /cgi-bin/Formmail_trap.pl...

If I'm wrong, someone will gently beat me with the correct answer, and you'll get to sit back and watch the fireworks...

decdim

3:40 am on Feb 3, 2004 (gmt 0)



I agree about the backslash before FormMail in the first and second conditions; here is the snippet from my rewritten .htaccess:

RewriteCond %{REQUEST_URI} ^/FormMail [NC,OR]
RewriteCond %{REQUEST_URI} ^/FormMail\.(cgi|pl|php) [NC,OR]
RewriteCond %{REQUEST_URI} ^/cgi(\-local|\-bin)/FormMail [NC,OR]
RewriteCond %{REQUEST_URI} ^/cgi(\-local|\-bin)/FormMail\.(cgi|pl|php) [NC,OR]

---
Also beware of using a script to block IPs gathered from "attacks" on formmail...
...I went over my "Last 300 Visitors" and found this:

Host: 80.58.5.x Url: /cgi-bin/formmail.pl Http Code : 403
Date: Feb 02 04:42:34 Http Version: HTTP/1.0" Size in Bytes: 758
Referer: http://www.<SNIP>.com/ Agent: -

Host: 208.176.83.x Url: /cgi-bin/formmail.pl Http Code : 403
Date: Feb 02 04:42:26 Http Version: HTTP/1.0" Size in Bytes: 758
Referer: http://www.<SNIP>.com/ Agent: -

Host: 64.2.137.x Url: /cgi-bin/formmail.pl Http Code : 403
Date: Feb 02 04:42:24 Http Version: HTTP/1.0" Size in Bytes: 758
Referer: http://www.<SNIP>.com/ Agent: -

Host: 217.45.133.x Url: /cgi-bin/formmail.pl Http Code : 403
Date: Feb 02 04:42:23 Http Version: HTTP/1.0" Size in Bytes: 758
Referer: http://www.<SNIP>.com/ Agent: -

Host: 207.61.246.x Url: /cgi-bin/formmail.pl Http Code : 403
Date: Feb 02 04:42:22 Http Version: HTTP/1.0" Size in Bytes: 758
Referer: http://www.<SNIP>.com/ Agent: -

Host: 208.231.0.x Url: /cgi-bin/formmail.pl Http Code : 403
Date: Feb 02 04:42:20 Http Version: HTTP/1.1" Size in Bytes: 770
Referer: http://www.<SNIP>.com/ Agent: -

Host: 213.69.58.x Url: /cgi-bin/formmail.pl Http Code : 403
Date: Feb 02 04:42:09 Http Version: HTTP/1.1" Size in Bytes: 770
Referer: http://www.<SNIP>.com/ Agent: -

Host: 212.69.40.x Url: /cgi-bin/formmail.pl Http Code : 403
Date: Feb 02 04:42:08 Http Version: HTTP/1.0" Size in Bytes: 758
Referer: http://www.<SNIP>.com/ Agent: -

Host: 66.92.152.x Url: /cgi-bin/formmail.pl Http Code : 403
Date: Feb 02 04:42:01 Http Version: HTTP/1.1" Size in Bytes: 770
Referer: http://www.<SNIP>.com/ Agent: -

---
2 Things:

1) The Referer... I don't have any formmail script, so I find it interesting that somehow my site (domain replaced with <SNIP>) referred the formmail request.

2) Look at the times; rapid attack from various sources. I checked the IPs and they are US, Canadian and UK. I don't believe all these people decided to look for formmail.pl on my system within a total elapsed time of 33 seconds. Looks more like slaves infected with some type of trojan and forced to check for the formmail when the infector told them to (in hopes of using my site to send junk mail in rapid succession). Thus if I block them, I could be blocking innocent people who are unaware their computer is infected.

Regards!
---
[Edit] Removed last IP number and replaced with x to "hide" end-user... :)

jdMorgan

9:22 pm on Feb 3, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Your own domain did not refer these requests; their script simply copied your domain name and inserted it into the HTTP Referer header.

The multiple sequential requests from different IPs are likely from either worm-infected "zombie" machines, or they are coming in through open proxies. You could try to capture and log {HTTP_VIA} and {HTTP_X_FORWARDED_FOR} with a script if you want to investigate.
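
If you can edit the main server configuration, one possible way to log those two headers is a custom log format. This is only a sketch under stated assumptions: the nickname "proxycheck" and the log file name are made up for the example, and LogFormat and CustomLog are not allowed in .htaccess, so it assumes access to httpd.conf or a virtual host section. (A CGI script sees the same request headers as the environment variables HTTP_VIA and HTTP_X_FORWARDED_FOR.)

```apache
# Server or virtual-host configuration only; not valid inside .htaccess.
# "proxycheck" and the log file name are invented for this sketch.
# %{Via}i and %{X-Forwarded-For}i print the named request headers,
# or "-" when the header is absent.
LogFormat "%h %t \"%r\" %>s \"%{Via}i\" \"%{X-Forwarded-For}i\"" proxycheck
CustomLog logs/proxycheck_log proxycheck
```

A request that arrived through a proxy that identifies itself would then show something other than "-" in the last two fields.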

Jim

[edited by: jdMorgan at 11:46 pm (utc) on Feb. 3, 2004]

decdim

10:01 pm on Feb 3, 2004 (gmt 0)



Can this "script" be posted, or stickymailed to me?

Not aware of:

{HTTP_VIA} and {HTTP_X_FORWARDED_FOR}

I would definitely be interested; as you already know, I'm working on just a PHP file for that, but it would only give limited info (HTTP_REFERER, USER_AGENT, IP, etc.).

Thanks!

BohrMe

1:32 am on Feb 5, 2004 (gmt 0)

10+ Year Member



Why not just name your Formmail_trap.pl trap formmail.pl? Also, if you're on a Unix-based machine, you can create soft links to the other variations. E.g.,

ln -s formmail.pl formmail.cgi
ln -s formmail.pl FormMail.cgi
ln -s formmail.pl FormMail.pl

FormMail.cgi -> formmail.pl
FormMail.pl -> formmail.pl
formmail.cgi -> formmail.pl
formmail.pl

That's what I did and it works fine. Of course, I haven't seen the variations that you test for.

Maleville

7:00 pm on Feb 5, 2004 (gmt 0)

10+ Year Member



Thanks, all, for the answers.

I don't remember where I found this script. Perhaps it's yours, balam.

1) But don't you think, if you escape all special characters, that the script will run correctly?
If you write cgi\-bin it doesn't mean cgi-bin but cgi + \ + - + bin which is not correct.

2) You said I create a loop for all requests to /cgi-bin/Formmail if the RewriteRule target is /cgi-bin/Formmail_trap.pl. Why?

3) Like you, decdim, I don't have a FormMail application. You are right, an innocent visitor may be infected with a trojan.
I'm going to send the request to an appropriate page.

4) What do you (everyone) think about the last condition?
RewriteCond %{REQUEST_URI} (mail.?form|form|form.?mail|mail|mailto)\.(cgi|exe|pl)$ [NC]

jdMorgan

5:24 am on Feb 6, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> I don't have a FormMail application

In that case, keep things simple. Simple is good: :)


RewriteRule (mail.*)form|(form.*)mail|mailto - [NC,F]

Jim

balam

12:13 am on Feb 8, 2004 (gmt 0)

10+ Year Member



> If you write cgi\-bin it doesn't mean cgi-bin but cgi + \ + - + bin which is not correct.

Since the \ (backslash) is an escape character (meaning that when you use a backslash you are saying "I mean the next character in the string literally, and not any special meaning it may have"), you would need to use two backslashes ("cgi\\-bin") if you wanted to match "cgi" followed by a "backslash" followed by a "dash" followed by "bin".

> You said I create a loop for all requests to /cgi-bin/Formmail if the RewriteRule target is /cgi-bin/Formmail_trap.pl. Why?

Your second RewriteCond...

RewriteCond %{REQUEST_URI} ^/(cgi\-bin/|cgi\-local/)\FormMail [NC,OR]

...is looking for any request that starts off with "/cgi-bin/FormMail". Your trap script starts off with this very string: /cgi-bin/Formmail_trap.pl
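
One way to rule out that loop is to exclude the trap's own URL before the other conditions are tested. The following is a sketch, not a tested drop-in: it assumes the trap stays at /cgi-bin/Formmail_trap.pl as in the first post, drops the stray backslash before "FormMail", and folds the ".cgi/.php/.pl" variants into the shorter prefix patterns that already cover them.

```apache
RewriteEngine On
# Assumption: the trap lives at /cgi-bin/Formmail_trap.pl.
# This first condition carries no [OR], so it is ANDed with the
# OR-group below: a request for the trap itself never fires the rule.
RewriteCond %{REQUEST_URI} !^/cgi-bin/Formmail_trap\.pl$ [NC]
RewriteCond %{REQUEST_URI} ^/(cgi-bin|cgi-local)/FormMail [NC,OR]
RewriteCond %{REQUEST_URI} ^/FormMail [NC,OR]
RewriteCond %{REQUEST_URI} (mail.?form|form|form.?mail|mail|mailto)\.(cgi|exe|pl)$ [NC]
RewriteRule .* /cgi-bin/Formmail_trap.pl [L]
```

Renaming the trap so that it no longer begins with "FormMail" would avoid the problem as well, without the extra condition.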

Jim didn't mention it this time, so I will: Using Regular Expressions [etext.lib.virginia.edu]

Maleville

9:41 am on Feb 8, 2004 (gmt 0)

10+ Year Member



The special characters seem to be:

Text:
  .            Any single character
  [chars]      Character class: One of chars
  [^chars]     Character class: None of chars
  text1|text2  Alternative: text1 or text2

Quantifiers:
  ?            0 or 1 of the preceding text
  *            0 or N of the preceding text (N > 0)
  +            1 or N of the preceding text (N > 1)

Grouping:
  (text)       Grouping of text
               (either to set the borders of an alternative or
               for making backreferences where the Nth group can
               be used on the RHS of a RewriteRule with $N)

Anchors:
  ^            Start of line anchor
  $            End of line anchor

Escaping:
  \char        escape that particular char
               (for instance to specify the chars ".[]()" etc.)

Then, "_" or "-" are not part of them. May I write:
RewriteCond %{HTTP_USER_AGENT} .*efp@gmx\.net* [OR]
rather than
RewriteCond %{HTTP_USER_AGENT} .*efp\@gmx\.net* [OR]
or
RewriteCond %{HTTP_USER_AGENT} .*Go!Zilla* [OR]
rather than
RewriteCond %{HTTP_USER_AGENT} .*Go\!Zilla* [OR]
or
RewriteCond %{REQUEST_URI} ^/cgi(-local|-bin)/FormMail [NC,OR]
rather than
RewriteCond %{REQUEST_URI} ^/cgi(\-local|\-bin)/FormMail [NC,OR]

balam

4:20 pm on Feb 8, 2004 (gmt 0)

10+ Year Member



> Then, "_" or "-" are not part of them.

You're right that "_" isn't, but "-" sometimes is, sometimes isn't - it depends on the context in which it is used. (Actually, that argument can be made - and is valid - for all the "special" characters.)

The "-" is used to denote a range of characters. Instead of typing "[0123456789]", you can type "[0-9]". "[a-zA-Z0-9]" matches (one of) 62 different characters: The alphabet in lower & uppercase and the numbers from 0 to 9.

As I said before, it's just a personal habit of mine to escape dashes when I mean dashes and not ranges. (It probably creates as much confusion for some as it reduces for others... :) In the future, I'll probably remove them when posting.)

> May I write:

The first version of each of your RewriteConds is fine, with a couple of small caveats...

> RewriteCond %{HTTP_USER_AGENT} .*efp@gmx\.net* [OR]
> RewriteCond %{HTTP_USER_AGENT} .*Go!Zilla* [OR]

Both of these end with an "*". This means, using the first one as an example, that you are looking for "zero or more of any character" followed by "efp@gmx.ne" followed by "zero or more of the letter 't'". This is probably not what you want, and in both cases you'd be better off removing the final asterisk.
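
As a sketch of what the trimmed conditions might look like (the 403 response in the RewriteRule is an assumption added for illustration; the thread does not say what action these conditions feed into):

```apache
RewriteEngine On
# No leading ".*" and no trailing "*": an unanchored pattern already
# matches anywhere inside the User-Agent string.
RewriteCond %{HTTP_USER_AGENT} efp@gmx\.net [OR]
# "!" only means "negate" as the very first character of a pattern,
# so it needs no backslash in the middle of "Go!Zilla".
RewriteCond %{HTTP_USER_AGENT} Go!Zilla
# Assumption for this sketch: matching requests get 403 Forbidden.
RewriteRule .* - [F]
```

Note that the last condition in a group must not carry [OR], since there is nothing after it to be ORed with.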

jdMorgan

1:01 am on Feb 9, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



...and leading or trailing ".*" sequences are redundant, whether or not the pattern is anchored.

RewriteCond %{REQUEST_URI} something 
(without start or end anchor) means exactly the same thing as
RewriteCond %{REQUEST_URI} .*something.* 

- or -
RewriteCond %{REQUEST_URI} ^.*something.*$ 

Jim

Maleville

3:15 pm on Feb 11, 2004 (gmt 0)

10+ Year Member



Balam.
Thanks for this explanation. I hadn't understood that the * at the end applies to the last letter.

Maleville

3:23 pm on Feb 11, 2004 (gmt 0)

10+ Year Member



Jim.

Consider this:

Mozilla/4.0blablabla./Indy Library/v2.01/blablabla

What we want to exclude is Indy Library, whatever the version. For a RewriteRule that returns a 403, we have to find the exact name inside a longer string.

In a RewriteCond, is there a difference between:

^Indy\ Library$

or
Indy\ Library

Is there an advantage to writing it one way or the other (is one faster to execute, more reliable...)?

jdMorgan

6:26 am on Feb 13, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



In a RewriteCond, is there a difference between:

^Indy\ Library$
or
Indy\ Library


Yes, a very big difference. A pattern of "^Indy\ Library$" matches only a request from a user-agent with the exact name "Indy Library" -- nothing more, and nothing less.

The pattern "Indy\ Library" matches any string containing "Indy Library", including the exact string "Indy Library".

^ and $ are "start" and "end anchors", respectively. See this regular expressions tutorial [etext.lib.virginia.edu].

Specifying exact, fully-anchored patterns makes regular-expressions processing much faster, but in cases like this, you don't have a choice. A good pattern might be:

 ^Mozilla/.*Indy\ Library/

Jim

Maleville

10:06 am on Feb 15, 2004 (gmt 0)

10+ Year Member



Sometimes I miss the good old days of Sinclair's Spectrum BASIC.

This part is very important for webmasters who find in their logs a USER_AGENT whose visits they don't want.

A specific U_A has many versions, and it is not practical to list them all in RewriteConds.

More often than not, the number and length of the strings containing "Indy Library" will slow down the server. Consequently, we need to pick out the specific U_A, whatever its version or claimed compatibility.

In our logs it is easy to pick this agent out of a long string and write, as you said, simply:
^Mozilla/.*Indy\ Library/

But even before you have trapped all the unwanted U_As, you would like to prevent visits from them. Many RewriteCond U_A lists can be found on WebmasterWorld. Unfortunately, not all of them are 100% foolproof, because the webmaster doesn't know exactly the right syntax. And when he writes:
^Indy Library$

He ought to write:
^Mozilla/.*Indy\ Library/

also with a backslash in "Indy\ Library". But imagine we found the name of this U_A on the net and we don't know exactly which characters come before and after it, or whether it stands alone without anything else. As we want to include it in our list before it visits our site, we must find the best way to recognize it.
If we put:
.*Indy\ Library perhaps we are right.
If we put:
^Indy\ Library$ we are wrong and Apache will not recognize it.
The question is: how can we write the pattern correctly without knowing the full name of the U_A?

I follow you, Jim, and if I'm not mistaken we can take this USER_AGENT as an example (it's not the real one) and try to find it in a RewriteCond. Suppose the exact string containing "Indy Library" is:
Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET

If we write :
^Indy\ Library$ =we don't find it. (rest of the string is ignored)
^Indy\ Library =we don't find it (Indy Library-v.5 compatible I.E 5.5-NET)
Indy\ Library =we find it (Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET)
^.*Indy\ Library$ =we don't find it (Mozilla/v.01/ Indy Library)
^.*Indy\ Library.* =we find it (Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET)
^.*Indy\ Library.*$ =we find it (Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET)
.*Indy\ Library$ =we don't find it (Mozilla/v.01/ Indy Library)
.*Indy\ Library.* =we find it (Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET)
^.*Indy\ Library.* =we find it (Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET)
^Mozilla/.* Indy\ Library- =we find it (Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET) but in another version the "-" may change to "/"
.*Indy\ Library* =we find it. But in this case there is no sense in looking for zero or more of the letter "y" (Mozilla/v.01/ Indy Librar) and (y)

You have to use your imagination and write the pattern as broadly as possible to catch "Indy Library".
It seems that ^.*Indy\ Library.*$ is a correct way to do it.

Any correction?

jdMorgan

8:14 pm on Feb 15, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



^Indy\ Library$ =we don't find it. (rest of the string is ignored)
^Indy\ Library =we don't find it (Indy Library-v.5 compatible I.E 5.5-NET)
Indy\ Library =we find it (Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET)
^.*Indy\ Library$ =we don't find it (Mozilla/v.01/ Indy Library)
^.*Indy\ Library.* =we find it (Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET)
^.*Indy\ Library.*$ =we find it (Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET)
.*Indy\ Library$ =we don't find it (Mozilla/v.01/ Indy Library)
.*Indy\ Library.* =we find it (Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET)
^.*Indy\ Library.* =we find it (Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET)
^Mozilla/.* Indy\ Library- =we find it (Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET) but in another version the "-" may change to "/"

You have to use your imagination and write the pattern as broadly as possible to catch "Indy Library".
It seems that ^.*Indy\ Library.*$ is a correct way to do it.

There is no difference between the regular-expressions

 Indy\ Library 

and
 ^.*Indy\ Library.*$ 

and I would suggest using the first version because it is shorter.
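
Put into a complete rule, the shorter form might look like the sketch below. The 403 response is an assumption for illustration, following the "RewriteRule to 403" mentioned earlier in the thread.

```apache
RewriteEngine On
# The backslash keeps the space from being read as an argument separator.
# No anchors: the pattern matches anywhere in the User-Agent string,
# whatever comes before or after "Indy Library".
RewriteCond %{HTTP_USER_AGENT} Indy\ Library
# Assumption for this sketch: answer with 403 Forbidden.
RewriteRule .* - [F]
```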

Corrections:

^Indy\ Library$ =we don't find it. (exact match required)
^Indy\ Library =we do find it ("Indy Library-v.5 compatible I.E 5.5-NET" starts with specified pattern)
Indy\ Library =we find it ("Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET" contains specified pattern)
^.*Indy\ Library$ =we do find it ("Mozilla/v.01/ Indy Library" ends with specified pattern)
^.*Indy\ Library.* =we find it ("Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET" contains specified pattern)
^.*Indy\ Library.*$ =we find it ("Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET" contains specified pattern)
.*Indy\ Library$ =we do find it ("Mozilla/v.01/ Indy Library" ends with specified pattern)
.*Indy\ Library.* =we find it ("Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET" contains specified pattern)
^.*Indy\ Library.* =we find it ("Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET" contains specified pattern)
^Mozilla/.* Indy\ Library- =we find it (Mozilla/v.01/ Indy Library-v.5 compatible I.E 5.5-NET) but in another version the "-" may change to "/" (so leave off the "-")

Ref: [etext.lib.virginia.edu...]

Jim