A think piece prompted by a recent offline conversation:
Consider the typical canonicalization redirect on an HTTPS site without subdomains:
RewriteCond %{HTTPS} !on [OR]
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule (.*) https://www.example.com/$1 [R=301,L]
or possibly
RewriteCond %{HTTPS} off [OR]
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [NC]
RewriteRule ^ https://www.example.com%{REQUEST_URI} [R=301,L]
Anyone remember Flowers for Algernon? (I think the movie adaptation was called Charly.) Early in the book, there's a scene where the main character's bakery co-workers are trying to help him out by showing him how to make rolls, so he can move up from being a janitor. He comes away hopelessly confused because no two men have identical procedures, and he doesn't have the mental equipment to figure out which parts are essential and which are a matter of personal preference.
Looking at this simplest of all redirects, I see a minimum of four independent binary choices. (That's 2^4 = 16 possible combinations, all perfectly valid for the job at hand.)
#1 You can say off, or you can say !on
#2 You can make the hostname optional
#3 You can use the [NC] flag with the hostname
#4 You can capture the request, or you can use %{REQUEST_URI}
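Any mixture of the four choices works just as well. Purely for illustration, here is one of the other fourteen combinations (same hypothetical hostname as above):

```apache
# off rather than !on; hostname not optional; with [NC]; using %{REQUEST_URI}
RewriteCond %{HTTPS} off [OR]
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule ^ https://www.example.com%{REQUEST_URI} [R=301,L]
```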
What's the poor user to do?
#1 Given a binary toggle, there shouldn't be any alternative to “off” or “on”. (Servers don't have soft bits, do they?) But can it hurt to cover all possibilities?
#2 This may be the easiest choice: in shared hosting, or any situation using a <VirtualHost> envelope where the site in question isn't the catchall, there will never be a request with no hostname.
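To sketch why, with hypothetical names: in Apache's name-based virtual hosting, a request with no Host header (or an unrecognized one) is served by the first <VirtualHost> listed for that address, so any vhost after the first never sees an empty hostname.

```apache
# First vhost listed = the default/catchall; hostless and
# unknown-host requests land here, never in the vhosts below.
<VirtualHost *:443>
    ServerName default.example.net
</VirtualHost>

# This vhost only receives requests whose Host header matched,
# so its redirect rules can safely skip the empty-hostname case.
<VirtualHost *:443>
    ServerName www.example.com
</VirtualHost>
```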
#3 Human browsers currently lowercase hostnames when sending in a request: you can type “ExAmple.com” or “examPLE.com”, but your browser will send in a request for “example.com”. Show me a hostname with capitals, and I’ll show you a malign robot. (But I won't be able to show you many: less than 1/10 of 1% of all requests use capitals in the hostname.)
#4 Has anyone ever benchmark-tested the choice between the two options, capture and REQUEST_URI? Remember that mod_rewrite evaluates the RewriteRule pattern before it looks at the RewriteConds, so the pattern runs on every request, not just the ones that get redirected. One way, the server has to capture something which will, 99% of the time, end up not being used. The other way, the server has to ask itself “What was it they were looking for?” (or does it even need to ask?) How enormous would a site need to be before the difference in speed or CPU becomes significant?
Food for thought.