Port 443 is the 'standard' secure port, while port 80 is the standard non-secure port.
* The only time I've personally seen it not on 443 is when someone runs their own box and changes it for some reason. If it's not on 443, your host should have the port listed in the docs or be able to tell you in about 3 seconds in a chat, and changing the number in one condition is easy enough that I don't worry about changing the code for it.
What the rulesets say starts at the left side of the rule, then goes through the conditions from top to bottom, then moves to the right side of the rule.
# Then Read Here
# If the rule (explained below) matches, this condition is checked
# It says 'If the server port is Not 443', check the next condition or
# continue to the 'instructions' on the right-side of the rule if there
# is no other condition below.
# - (!) at the beginning of a line indicates 'not'
RewriteCond %{SERVER_PORT} !^443$
# Finish With This
# If the rule (explained below) matches AND the condition above matches,
# this condition (the 2nd one down) is checked.
# It says 'If the original request made by the browser starts with cap A to cap Z*
# one or more times, Followed by a 'space', Followed by a /,
# Followed by any character that's not a / one or more times and then
# a / (that pair 0 or more times), Followed by index.htm and any character
# that's not a 'space' 0 or more times, Followed by a 'space', Followed by
# HTTP and anything else', check the next condition or continue to the
# 'instructions' on the right-side of the rule if there is no other condition below.
# * Caps A to Z are used so POST, GET, HEAD, PROPFIND, etc. all 'match'
# but improperly formatted requests, such as 'get' (lowercase) don't.
# \ is used to match a literal space in the request, because a space without
# the \ is a 'separating character', so if the \ is not used, the condition
# would 'break' after the A to Z match and that would generate a server error.
# The reason this one is important and matches Only original requests made
# by a bot or browser: if we matched All requests, the internal ones that
# make it so the information from index.htm is shown for directory requests
# (EG http://www.example.com/some-dir/) would also be redirected. That would
# just 'loop in a circle' and cause a server error, because the server wouldn't
# be able to 'grab' the index.htm file and serve the info when someone visited
# the 'directory version' of the page. So we make sure we're only sending
# 'original, external requests' to the alternate location ending in a / rather
# than all requests for index.htm.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index\.htm[^\ ]*\ HTTP
# Start Reading Here
# The rule (way below) is compared first and has two sides:
# The Left Side (before the space) is checked for a match.
# If it matches and there's one or more conditions, they are checked.
# If the left side of the rule and any/all conditions are met, the
# Right Side of the rule is applied.
# The Right Side of the rule contains the 'instructions' for what the server
# should do when the Left Side of the rule and any/all conditions are met.
# The Left Side of this Rule says:
# 'Match any character, except a /, one or more times,
# followed by a /, 0 or more times as a group, followed by index.htm at
# the end' AND group/store everything up to index.htm (all the directories,
# including the last /) as the first back-reference ($1)
# AND also group/store the last 'directory/' piece on its own as a second
# back-reference ($2).
# (Yes, it's just a 'confusing, long way' of saying 'match index.htm at the
# end of the requested location' and 'store all the other stuff up to it
# so we can use it later' if we want to.)
# The Right Side of this rule (applied when the left side matches
# and all conditions are met) says:
# Send the request to http://www.example.com/ followed by the first grouping
# of information we stored ($1), remove any query string (that's what the ? at the
# end of the URL does), send a 'permanent redirect message' to the browser or
# bot making the request [R=301], and stop processing the file [L]
# $ at the end of the Left Side or the end of a condition 'gives a definite end'.
# If it's omitted, like in the conditions, 'anything or nothing' after the last
# match is 'implicitly' counted as 'a match', so it's mostly good to use it
# unless you know why you don't want to ... In the conditions, I know enough
# has matched to 'verify what we need to', so there's no reason to match
# every single possible character to the end of the line; 'implicitly matching'
# 'everything else' is fine to do and a bit more efficient than 'matching a bit more'
# just because we could.
RewriteRule ^(([^/]+/)*)index\.htm$ http://www.example.com/$1? [R=301,L]
### # ###
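To make the back-references in the rule above a little more concrete, here is a minimal sketch in Python. It assumes Python's `re` module treats this particular pattern the same way Apache's regex engine does, and that the rule sees the path without a leading slash (as in a .htaccess file). The function name `redirect_target` and its `base` parameter are made up for illustration.

```python
import re

# The Left Side of the rule, copied verbatim from the ruleset.
RULE = re.compile(r"^(([^/]+/)*)index\.htm$")

def redirect_target(path, base="http://www.example.com/"):
    """Return the redirect URL for a path, or None if the rule does not match.

    `path` is the URL path without a leading slash and without a query
    string (an assumption about how the rule sees the request).
    """
    m = RULE.search(path)
    if m is None:
        return None
    # m.group(1) plays the role of $1: every directory up to index.htm.
    return base + m.group(1)

print(redirect_target("index.htm"))           # http://www.example.com/
print(redirect_target("some-dir/index.htm"))  # http://www.example.com/some-dir/
print(redirect_target("a/b/index.htm"))       # http://www.example.com/a/b/
print(redirect_target("a/b/index.html"))      # None (the $ anchor rejects the extra 'l')

# The inner group (the $2 equivalent) only keeps its last repetition.
print(RULE.search("a/b/index.htm").group(2))  # b/
```

The last line shows why the outer parentheses matter: the inner group alone would only ever hold the final directory.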
The second ruleset says exactly the same thing with a couple of exceptions:
The condition of the port Not being 443 is removed, because the first rule already redirected any request ending in index.htm that was not made on Port 443 and then the file stopped processing, so we know that if a request ending in index.htm made it this far, it must be on Port 443.
Since we know it's on Port 443, we know it's an HTTPS request, so the right side of the rule is set to redirect exactly the same way as the first rule, except using https rather than http.
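The 'first ruleset stops, second ruleset catches the rest' ordering can be sketched roughly like this in Python. It is only a model of the control flow, not of Apache itself: it assumes the request has already passed the THE_REQUEST condition, the function name `first_matching_redirect` is hypothetical, and the https target is inferred from the description above ('using https rather than http').

```python
import re

# Left Side shared by both rules, copied from the ruleset.
RULE = re.compile(r"^(([^/]+/)*)index\.htm$")

def first_matching_redirect(path, server_port):
    """Model the two rulesets in order; return the redirect URL or None."""
    m = RULE.search(path)
    if m is None:
        return None  # neither rule's Left Side matches
    # Ruleset 1: the condition is a negative match on ^443$.
    if re.search(r"^443$", str(server_port)) is None:
        # [L] stops processing here, so ruleset 2 is never reached.
        return "http://www.example.com/" + m.group(1)
    # Ruleset 2: only reachable when the port is exactly 443,
    # so no port condition is needed.
    return "https://www.example.com/" + m.group(1)

print(first_matching_redirect("some-dir/index.htm", 80))   # http://www.example.com/some-dir/
print(first_matching_redirect("some-dir/index.htm", 443))  # https://www.example.com/some-dir/
print(first_matching_redirect("some-dir/", 80))            # None
```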
### # ###
These are the two rulesets explained above, together:
# Redirect All Non-Port 443 Requests Containing index.htm
# Rule is first using a negative match for efficiency, assuming http is most used
RewriteCond %{SERVER_PORT} !^443$
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index\.htm[^\ ]*\ HTTP
RewriteRule ^(([^/]+/)*)index\.htm$ http://www.example.com/$1? [R=301,L]
# Redirect Requests Containing index.htm on Port 443 to https URLs
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index\.htm[^\ ]*\ HTTP
RewriteRule ^(([^/]+/)*)index\.htm$ [
example.com...] [R=301,L]
### # ###
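To see which request lines the THE_REQUEST condition in these rulesets lets through, here is a small Python sketch. The pattern is copied as-is; the sample request lines and the helper name `is_original_index_request` are invented for illustration, and it assumes Python's `re` handles this pattern like Apache's engine does.

```python
import re

# The THE_REQUEST condition pattern, copied verbatim.
THE_REQUEST = re.compile(r"^[A-Z]+\ /([^/]+/)*index\.htm[^\ ]*\ HTTP")

def is_original_index_request(request_line):
    """True if the raw request line asks for index.htm directly."""
    return THE_REQUEST.search(request_line) is not None

samples = [
    "GET /index.htm HTTP/1.1",               # True
    "GET /some-dir/index.htm?x=1 HTTP/1.1",  # True (query string allowed by [^\ ]*)
    "POST /a/b/index.htm HTTP/1.0",          # True (any upper-case method)
    "GET /some-dir/ HTTP/1.1",               # False (directory request)
    "get /index.htm HTTP/1.1",               # False (lower-case method)
]
for line in samples:
    print(is_original_index_request(line), line)
```

The fourth sample is the one that prevents the loop: a visitor asking for /some-dir/ never sent 'index.htm' in the request line, so the condition fails even when the server internally maps that directory to index.htm.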
One thing to keep in mind about mod_rewrite is: There are usually a hundred different ways to 'say the same thing', and some are more efficient, some less confusing, and some just preference...
The version below is more efficient than the one above because I did a 'single check' without storing the same thing multiple times, and if index.htm at the end of the location requested is not matched, I skip over the next two rules. Even though using skips can be highly efficient, if you use many, your attention to detail has to be much higher than usual, beyond 'just running through the rules': if you miscount or insert a rule in the middle of a 'skip', you can really mess things up. So I don't usually recommend it, even though I code most files that way personally.
So, briefly, what I did in this one is check whether the URL requested has index.htm at the end, and if not, skip the next two rules.
If it does, I used .? (which matches 'one thing or nothing') as the rule pattern, because it's very efficient and I want to get to the conditions as fast as I can, since I already know the URL ends in index.htm if the rules weren't skipped.
Then they're basically the same, except on the Right Side of the Rule I used %1 rather than $1, so the info needed to know what page to send someone to comes from the condition rather than from the rule, since I didn't bother to match and store it there.
RewriteRule !index\.htm$ - [S=2]
RewriteCond %{SERVER_PORT} !^443$
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(([^/]+/)*)index\.htm[^\ ]*\ H
RewriteRule .? http://www.example.com/%1? [R=301,L]
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(([^/]+/)*)index\.htm[^\ ]*\ H
RewriteRule .? [
example.com...] [R=301,L]
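One detail worth checking when %1 is taken from a condition: a repeated group such as ([^/]+/)* keeps only its last repetition, so the whole directory path is only available when the repetition sits inside an outer group, the way the rule pattern ^(([^/]+/)*)index\.htm$ does it. A minimal Python sketch of that behaviour follows. Python's `re` is used as a stand-in for Apache's regex engine, which is an assumption, and the helper names are hypothetical.

```python
import re

# A repeated group on its own: group 1 is overwritten on every repetition.
REPEATED_ONLY = re.compile(r"^[A-Z]+\ /([^/]+/)*index\.htm[^\ ]*\ H")
# The repetition wrapped in an outer group: group 1 spans all repetitions.
WITH_OUTER = re.compile(r"^[A-Z]+\ /(([^/]+/)*)index\.htm[^\ ]*\ H")

def first_group(pattern, request_line):
    """Return what %1 would hold, using '' when the group did not take part."""
    m = pattern.search(request_line)
    if m is None:
        return None
    return m.group(1) or ""

request = "GET /a/b/index.htm HTTP/1.1"
print(first_group(REPEATED_ONLY, request))  # b/
print(first_group(WITH_OUTER, request))     # a/b/

# With a single directory (or none) the two forms give the same result.
print(first_group(REPEATED_ONLY, "GET /some-dir/index.htm HTTP/1.1"))  # some-dir/
print(first_group(WITH_OUTER, "GET /some-dir/index.htm HTTP/1.1"))     # some-dir/
```

So the difference only shows up for index.htm files nested two or more directories deep, which is easy to miss when testing with a single directory.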