Forum Moderators: phranque


Apache 2.0.63 and SetEnvIf directives failing?

         

geekdogfl

1:25 pm on Oct 2, 2010 (gmt 0)

10+ Year Member



Anyone know why SetEnvIf directives would no longer work in a pre-existing, formerly-perfectly-functioning, .htaccess file after upgrade to Apache 2.0.63? Using PHP 5.2.11.

My host says: "SetEnvIf Referer statements are having to be commented out, as the error logs show the server is unable to expand the regexp for these items."

They also think "it might be a difference between apache1 and apache2 in the handling for this directive," but so far they have been unable to nail down the syntax that apache2 seems to expect.

Thanks in advance.

sublime1

3:47 pm on Oct 2, 2010 (gmt 0)

10+ Year Member



After a little inspection, I can see no major differences in the way SetEnvIf works between 1.3, 2.0 and 2.2, based on the docs. Not sure about 2.0.63.

Here are the links to 1.3 [httpd.apache.org...]

2.0 [httpd.apache.org...]

2.2/current [httpd.apache.org...]

Also, there's nothing that pops out of the document written for upgrading from 1.3 to 2.0, but here it is: [httpd.apache.org...]

Note that the regular expression library was updated, so perhaps there's a nuance there to check...

An example of the case would be very helpful.

One major change between 1.3 and 2.x is that the modules providing various functionality were significantly reorganized. We had a case (a long time back) where a module that we had enabled in the 1.3 configuration just needed to be uncommented again after we moved up to 2.x.

Also, you might want to ask your web host why they are upgrading to a version of Apache that is very old: 2.0.63 was a security release made back in 2008, 2.2 has been out for years, and 2.3 is in alpha now!

Hope this gets you started, if not, please provide an example.

Tom

jdMorgan

6:16 pm on Oct 2, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, consider posting a few of the problematic SetEnvIf directives here... It's hard to speculate on a coding problem with no code to review.

Jim

geekdogfl

10:45 pm on Oct 2, 2010 (gmt 0)

10+ Year Member



Hope this is what you mean. I don't know much about this stuff at all, obviously. Thx.

SetEnvIf Referer ^http://.*\alphaprawn\.com ban
SetEnvIf Referer ^http://.*\acaibuy\.net ban
SetEnvIf Referer ^http://.*\archiver\.co ban
SetEnvIf Referer ^http://.*\cruise-lines\.nl ban
SetEnvIf Referer ^http://.*\djshopedinburgh\.co.uk ban
SetEnvIf Referer ^http://.*\ezinearticlez\.com ban
SetEnvIf Referer ^http://.*\formyeyesonly.yuku\.com ban
SetEnvIf Referer ^http://.*\fusion-formulations\.com ban
SetEnvIf Referer ^http://.*\abcdef.blogspot\.com ban
SetEnvIf Referer ^http://.*\ghijkl\.mu.nu ban

sublime1

1:26 am on Oct 3, 2010 (gmt 0)

10+ Year Member



Hi geekdogfl --

In these examples, I think the regular expressions may be wrong. For example, the first regex

^http://.*\alphaprawn\.com


has a backslash just before the "a" in alphaprawn. The backslash is used to "escape" the next character so that it is not interpreted as a regular expression operator. It's used correctly, for example, before the period in \.com -- but I am not sure how it would be interpreted in this case.

Most likely, the correct regex is

^http://.*\.alphaprawn\.com
which would match [alphaprawn.com...] or http://example.alphaprawn.com and others.
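As a sanity check, a pattern like this can be tried in any PCRE-style engine. The sketch below uses Python's re module (whose syntax is close to, but not identical to, the PCRE library Apache 2.x uses) and some hypothetical referrer strings:

```python
import re

# sublime1's corrected pattern, tried against hypothetical referrers
pattern = re.compile(r"^http://.*\.alphaprawn\.com")

print(bool(pattern.search("http://www.alphaprawn.com/page")))  # True
print(bool(pattern.search("http://example.alphaprawn.com")))   # True
# Note: the bare domain has no dot before "alphaprawn", so this
# particular pattern does not match it:
print(bool(pattern.search("http://alphaprawn.com/")))          # False
```

One caveat visible in the last line: because the pattern requires a "." immediately before "alphaprawn", a referrer of plain http://alphaprawn.com would slip through this version of the pattern.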

It's possible that the change of regex libraries used between 1.3 and 2.0 that I mentioned earlier caused the matching to work differently.

But of course this is only part of the issue. Now that an environment variable is set, somewhere later it would need to be used to do something (my guess: to ban these sites linking to yours). Are these sites really a current threat to yours? Are there others? How do you maintain the list?

Tom

jdMorgan

2:11 am on Oct 3, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



sublime1's diagnosis is correct. Apache 2.x brought with it PCRE -- Perl-Compatible Regular Expressions. This replaces the POSIX regular expressions in Apache 1.x.

With PCRE, the "rules" change regarding regular expression operators and escaping requirements.

In POSIX, if a character does not need to be escaped because it has no meaning as a regex operator, then the escaping is ignored, and the "\" doesn't do anything.

In PCRE, if a character does not need to be escaped because it has no meaning as a regex operator, then the escaping activates another "level" of operators that are only available in PCRE. For example \d in PCRE means "match any digit."
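A quick illustration of that "extra level" of escapes, using Python's re module (whose escape handling is PCRE-like, though not identical):

```python
import re

# Escaping an ordinary letter can turn it into a special token in
# PCRE-style engines, instead of the escape simply being ignored:
print(re.findall(r"\d", "add 123"))  # \d = "any digit" -> ['1', '2', '3']
print(re.findall(r"d", "add 123"))   # plain "d" = the letter d -> ['d', 'd']
```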

However, to my knowledge, "\a" has no meaning to PCRE, so it is throwing an error.

Correcting your code to work with either POSIX or PCRE, and assuming your intent was to match these domains and all of their possible subdomains, you'd get:

SetEnvIf Referer ^http://([^./]+\.)*alphaprawn\.com ban
SetEnvIf Referer ^http://([^./]+\.)*acaibuy\.net ban
SetEnvIf Referer ^http://([^./]+\.)*archiver\.co ban
SetEnvIf Referer ^http://([^./]+\.)*cruise-lines\.nl ban
SetEnvIf Referer ^http://([^./]+\.)*djshopedinburgh\.co\.uk ban
SetEnvIf Referer ^http://([^./]+\.)*ezinearticlez\.com ban
SetEnvIf Referer ^http://([^./]+\.)*formyeyesonly\.yuku\.com ban
SetEnvIf Referer ^http://([^./]+\.)*fusion-formulations\.com ban
SetEnvIf Referer ^http://([^./]+\.)*abcdef\.blogspot\.com ban
SetEnvIf Referer ^http://([^./]+\.)*ghijkl\.mu\.nu ban

Note that the "/" is included in the negated character classes to help speed up parsing by limiting the number of matching passes to one plus the number of characters preceding the last period in the hostname. Without these, the number of matching attempts would depend on the length of the entire referrer string.
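For anyone who wants to verify the behavior, here is a rough check of the first pattern above using Python's re module (PCRE-like, though not identical to Apache's engine) and hypothetical referrer strings:

```python
import re

# Jim's pattern: optional dot-separated subdomain labels, then the domain
pattern = re.compile(r"^http://([^./]+\.)*alphaprawn\.com")

# Matches the bare domain and any depth of subdomains:
for referer in ("http://alphaprawn.com/page",
                "http://www.alphaprawn.com/",
                "http://a.b.alphaprawn.com"):
    assert pattern.search(referer)

# Does not match a different domain that merely ends in the same text:
assert not pattern.search("http://notalphaprawn.com/")
```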

Jim

geekdogfl

10:38 am on Oct 3, 2010 (gmt 0)

10+ Year Member



Thank you all very much for this. I will forward to my host and let you know what happens.

PS: My intent with the code is to prevent referrals from those sites from ever reaching my site(s). In other words, those sites have links to my site, and when visitors to their sites (I don't care who) click on the link to my site, I want them denied (403). I have my reasons. Hope this helps.
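For completeness: a SetEnvIf line only sets the variable; something else in the .htaccess file has to act on it. A minimal sketch of the usual pairing, assuming the classic Apache 1.3/2.0 mod_access directives and an example.com placeholder domain (your file's actual directives may differ):

```apache
# Set "ban" when the referrer is the unwanted domain or any subdomain
SetEnvIf Referer ^http://([^./]+\.)*example\.com ban

# Deny (403) any request that arrived with the "ban" variable set
Order Allow,Deny
Allow from all
Deny from env=ban
```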

jdMorgan

6:02 pm on Oct 3, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The emphasis in my "if" clause in the post above was "assuming your intent was to match these domains and all of their possible subdomains."

That is, the regex has been modified for this purpose, and if that is not your exact purpose, then it may not be correct.

Jim

geekdogfl

10:25 am on Oct 4, 2010 (gmt 0)

10+ Year Member



My host found the prob and said (quote) the actual issue was with the placement of the escaping character. Original:

SetEnvIf Referer ^http://.*\adams\.net ban

The slash after the asterisk is the problem, as the asterisk needs to be escaped. Under Apache 1 (similar to php4, as it happens), some types of syntactical errors are simply ignored and the server moves on. Under the newer versions, that isn't the case, so my host changed them to this:

SetEnvIf Referer ^http://.\*adams\.net ban
... which works just fine. (end quote)

Thank you again.

sublime1

1:20 pm on Oct 4, 2010 (gmt 0)

10+ Year Member



geekdogfl --

I agree that the escaping was likely the original problem and that older regex engines might have ignored bad ones.

However, I do not believe the host's changed pattern can be correct, as it would only match something like "http://x*adams.net" (where "x" is any single character, and the asterisk is a literal asterisk, not a wildcard). If I am correct, such a referrer is invalid, as I am pretty sure an asterisk is not a valid domain-name character.
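This is easy to confirm in a PCRE-style engine; the sketch below uses Python's re module and hypothetical referrer strings:

```python
import re

# The host's "fixed" pattern: one arbitrary character, then a literal "*"
pattern = re.compile(r"^http://.\*adams\.net")

# It matches only a referrer containing an actual asterisk:
assert pattern.search("http://x*adams.net")

# It silently fails to match the referrers it was meant to ban:
assert not pattern.search("http://www.adams.net/")
assert not pattern.search("http://adams.net/")
```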

Jim's examples are very likely what you actually want (unless his assumption about your intent, which he emphasized, is not correct).

For example

^http://([^./]+\.)*adams\.net


is a pattern that starts with "http://", then one or more characters that are not "." or "/" followed by a "." (which can happen zero or more times), then "adams.net".

This would match:


http://www.adams.net
http://user1.adams.net
http://adams.net
http://foo.bar.adams.net

and so on, but not

http://johnadams.net


Tom

geekdogfl

1:31 pm on Oct 4, 2010 (gmt 0)

10+ Year Member



Hi, and again I thank you for taking the time to help.

In your example immediately above this reply, I only want to block adams.net and www.adams.net. This is true for my entire list of "SetEnvIf" directives. I just don't want referrals to my site from those websites. I want to give them all 403s.

What my host has done is no longer throwing errors, but now I am confused as to whether it is blocking what I want. The host did not alter my list at all except for the code to make the errors stop. In other words, the host is assuming that I wanted to block as I described in my first paragraph (above, in this reply).

jdMorgan

1:52 pm on Oct 4, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The code I posted is correct for your requirements.

The code your host provided is incorrect.

Jim

geekdogfl

3:31 pm on Oct 4, 2010 (gmt 0)

10+ Year Member



"assuming your intent was to match these domains and all of their possible subdomains, you'd get:

SetEnvIf Referer ^http://([^./]+\.)*alphaprawn\.com ban
SetEnvIf Referer ^http://([^./]+\.)*acaibuy\.net ban
SetEnvIf Referer ^http://([^./]+\.)*archiver\.co ban
SetEnvIf Referer ^http://([^./]+\.)*cruise-lines\.nl ban
SetEnvIf Referer ^http://([^./]+\.)*djshopedinburgh\.co\.uk ban
SetEnvIf Referer ^http://([^./]+\.)*ezinearticlez\.com ban
SetEnvIf Referer ^http://([^./]+\.)*formyeyesonly\.yuku\.com ban
SetEnvIf Referer ^http://([^./]+\.)*fusion-formulations\.com ban
SetEnvIf Referer ^http://([^./]+\.)*abcdef\.blogspot\.com ban
SetEnvIf Referer ^http://([^./]+\.)*ghijkl\.mu\.nu ban


If my intent is NOT to match all possible subdomains? Sorry but I am totally confused now.

sublime1

7:49 pm on Oct 4, 2010 (gmt 0)

10+ Year Member



Hi geekdogfl --

I believe you state your requirement here:

I am only wanting to block adams.net and www.adams.net. This is true for my entire list of "SetEnvIf" directives. I just didn't want referrals to my site from those websites. I wanted to give them all 403s.


Jim's code will satisfy this requirement, with one slight detail I'll try to explain below. Your host's code will not satisfy your requirement because it has an error.

To be clear, we use the term "domain" to refer to something like "example.com". The term "subdomain" would apply to "www.example.com". For most websites, the domain and "www" subdomain are effectively interchangeable. However, it is perfectly possible and common to have other subdomains, for example, blog.example.com. There can be multiple subdomain levels, too, such as tom.blog.example.com, etc.

So saying that the pattern will "match these domains and all of their possible subdomains" is precise language. If you are trying to ban "example.com" and "www.example.com" I assume that you would want to also ban other subdomains of the "example.com" domain, and that's what Jim's code does.

If this is not clear, or we have missed some important point, I urge you to state, as accurately and precisely as possible what your objective is. If you are able to explain why you want to do this, it may also help us understand.

Tom

geekdogfl

8:37 pm on Oct 4, 2010 (gmt 0)

10+ Year Member



Tom thank you, that is MUCH clearer to me. I will forward to appropriate persons.