Forum Moderators: phranque

More fun with [E], forcing first character to uppercase

         

csdude55

7:57 pm on Feb 9, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have a rule that looks like this:

RewriteCond %{HTTP_HOST} ^(?:(?:www|ww2)\.)?(\w+)\. [NC]
RewriteRule ^ - [E=domain:%1]


I'm not 100% sure of this, but I THINK the hostname is always lowercase by the time it reaches this rule. I don't have anything in my .conf files to force this myself, but I know that if I go to www.MyDomain.com (any of the domains on my server) then I automatically end up at www.mydomain.com.

But now I'd like to set a second variable that's identical to domain, but with the first character uppercase.

I know that I could do it manually with 26 rules, but is there a better way?

RewriteCond %{HTTP_HOST} ^(?:(?:www|ww2)\.)?(\w+)\. [NC]
RewriteRule ^ - [E=domain:%1]

RewriteCond %{ENV:domain} ^a(\w+)$
RewriteRule ^ - [E=ucDomain:A%1]

RewriteCond %{ENV:domain} ^b(\w+)$
RewriteRule ^ - [E=ucDomain:B%1]

...

RewriteCond %{ENV:domain} ^z(\w+)$
RewriteRule ^ - [E=ucDomain:Z%1]

w3dk

8:28 pm on Feb 9, 2021 (gmt 0)

10+ Year Member Top Contributors Of The Month



The HTTP_HOST server variable holds the value of the "Host" HTTP request header. Whether this is all lowercase is entirely dependent on the user-agent making the request, and the server is expected to treat it case-insensitively. All(?) browsers do lowercase the hostname when making the HTTP request; however, some bots may not. It's possible you could receive requests where the "Host" header contains uppercase characters.

Since these directives are in the server config (I assume) then you could use the built-in "toupper" RewriteMap function, to convert the first character to uppercase (assuming the remainder is already lowercase).

Note that domain names can contain a hyphen, but "\w" (in your regex) excludes this. (?)

For example:


RewriteMap uc int:toupper
RewriteCond %{ENV:domain} ^(\w)([\w-]+)$
RewriteRule ^ - [E=ucDomain:${uc:%1}%2]


However, I would do this in your application logic/script, not Apache.

[edited by: w3dk at 8:48 pm (utc) on Feb 9, 2021]

w3dk

8:44 pm on Feb 9, 2021 (gmt 0)

10+ Year Member Top Contributors Of The Month



> It's possible you could receive requests where the "Host" header contains uppercase characters.


In which case, you could either block it (since it's probably just a careless bot) or "fix" it:


RewriteMap lc int:tolower
RewriteCond %{ENV:domain} [A-Z]
RewriteRule ^ - [E=domain:${lc:%{ENV:domain}}]
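
For the "block it" option, a minimal sketch might look like the following. It assumes the check runs against the raw Host header before the domain variable is set, and that a 403 is an acceptable response; neither detail is confirmed in the thread.

```apache
# Sketch only: refuse requests whose Host header contains an uppercase letter.
# RewriteCond patterns are case-sensitive unless [NC] is given,
# so [A-Z] matches genuine uppercase characters only.
RewriteCond %{HTTP_HOST} [A-Z]
RewriteRule ^ - [F]
```

Because [F] ends processing for the request, this would need to sit above the rule that sets the domain variable.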

csdude55

8:57 pm on Feb 9, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I'm really not too concerned with bots getting the right variable here, unless I can safely assume that it's a bad bot that's eating up resources unnecessarily. If that's the case then I can easily block them, so let me know if you think that's a good idea!

And thanks for pointing that out about \w! None of my current domains contain anything but letters so I should probably use [a-z], but I guess it would be more versatile in the long run to use [\w-].
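
A minimal sketch of that change, assuming the only difference from the original rule is that the capture group should also accept hyphens:

```apache
# Sketch only: same rule as before, but the captured label may contain hyphens.
# [\w-] also allows digits and underscores, which may be looser than needed.
RewriteCond %{HTTP_HOST} ^(?:(?:www|ww2)\.)?([\w-]+)\. [NC]
RewriteRule ^ - [E=domain:%1]
```

With this pattern a host such as www.my-domain.com (a hypothetical name) would capture "my-domain" rather than failing to match.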

Correct me if I'm wrong, but I HAVE to add RewriteMap to the httpd.conf file, right? Not to /etc/apache2/conf.d/userdata/foo.conf ? I've been hesitant to mess with that since cPanel tends to overwrite it when it updates.

> However, I would do this in your application logic/script, not Apache.

I currently do it in both PHP and Perl separately, but since I already define domain in Apache I thought it might be better to move this, too. It's just a one-liner so I don't think it will make any relevant difference either way, really, but I'm making a point to learn more about Apache just for the sake of my own education and keeping things interesting :-)

phranque

10:53 pm on Feb 9, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



> I know that I could do it manually with 26 rules, but is there a better way?

no.
(at least not within .htaccess)

> I'm making a point to learn more about Apache just for the sake of my own education and keeping things interesting

you should spend some time on the lowercasing algorithm in jdmorgan's 2nd post in this thread:
[webmasterworld.com...]

w3dk

3:13 am on Feb 10, 2021 (gmt 0)

10+ Year Member Top Contributors Of The Month



> I'm really not too concerned with bots getting the right variable here


If you are referring to the "case" of the requested Host header, then it really comes down to whether your site behaves as expected and doesn't break when a request arrives with a "malformed" Host header. Lookups in your server-side script are likely to be case-sensitive.

> Correct me if I'm wrong, but I HAVE to add RewriteMap to the httpd.conf file, right? Not to /etc/apache2/conf.d/userdata/foo.conf ? I've been hesitant to mess with that since cPanel tends to overwrite it when it updates.


The `RewriteMap` directive needs to be used in a server or virtualhost context. In other words, not a directory or .htaccess context. If the file ends in ".conf" then you are already in the right ballpark. I thought all your recent posts were about moving directives from .htaccess to the server config?

However, if you do need to do this in .htaccess then you can use an Apache Expression (Apache 2.4+). This avoids the "26 rules" and works in the server config as well.

For example, uppercase first letter and lowercase (guaranteed) for the remainder:


RewriteCond expr "toupper(%{ENV:domain}).'@'.tolower(%{ENV:domain}) =~ /(.)[^@]+@.(.*)/"
RewriteRule ^ - [E=ucDomain:%1%2]
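
To make the mechanics of that expression easier to follow, here is the same pair of directives with a worked trace in comments. The value "MyDomain" is a hypothetical input chosen for illustration, not something taken from the thread.

```apache
# Worked trace, assuming ENV:domain holds "MyDomain" (hypothetical value):
#   toupper(%{ENV:domain})   -> "MYDOMAIN"
#   tolower(%{ENV:domain})   -> "mydomain"
#   joined with '@'          -> "MYDOMAIN@mydomain"
#   /(.)[^@]+@.(.*)/         -> %1 = "M"        (first char of the uppercase copy)
#                               %2 = "ydomain"  (lowercase copy minus its first char)
RewriteCond expr "toupper(%{ENV:domain}).'@'.tolower(%{ENV:domain}) =~ /(.)[^@]+@.(.*)/"
RewriteRule ^ - [E=ucDomain:%1%2]
# ucDomain should then be "Mydomain"
```

The '@' works as a separator because it is assumed never to appear in the domain value itself; [^@]+ skips the rest of the uppercase copy, hyphens included.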

csdude55

5:45 am on Feb 10, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



> If the file ends in ".conf" then you are already in the right ballpark. I thought all your recent posts are about moving directives from .htaccess to the server config?

You're right, and I was confused since the docs specifically say it HAS to be in httpd.conf, but don't refer to the .conf files that I have placed in /etc/apache2/conf.d/userdata/. Since testing there requires a restart of Apache, I have to wait until 1am or so to test anything :-/

But I just now tested, and you're right, it works there just fine :-)

BTW, I had NEVER seen the expr command! That's awesome, thanks for that tip!

> you should spend some time on the lowercasing algorithm in jdmorgan's 2nd post in this thread:

@phranque, thanks for the link, it's definitely helpful!

lucy24

4:50 pm on Feb 10, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



> the docs specifically say it HAS to be in httpd.conf

The quirk is that a RewriteMap can only be declared in config (including vhost), but once it's been declared, it can be used in htaccess. This can be useful for testing purposes.
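
A rough sketch of that split, assuming the map is declared in the virtual host that serves the site and that the .htaccess file sits in that site's document root (both placements are assumptions, as is reusing the map name uc from earlier in the thread):

```apache
# --- server config or <VirtualHost> block (declaration only) ---
RewriteMap uc int:toupper

# --- .htaccess in the site's document root (usage) ---
RewriteEngine On
# Capture the first character and the remainder of the domain label separately.
RewriteCond %{HTTP_HOST} ^(?:(?:www|ww2)\.)?(\w)([\w-]+)\. [NC]
RewriteRule ^ - [E=ucDomain:${uc:%1}%2]
```

This variant reads the Host header directly instead of relying on the domain variable, and it leaves the remainder in whatever case the client sent.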