

Force https: and www for all pages, force / for index page

Use .htaccess to force https: and www for all pages, force / for index page

         

SueF

9:41 am on Jan 30, 2016 (gmt 0)

10+ Year Member



I'm a complete novice, so please bear with me if my terminology is incorrect. Yesterday, I purchased an SSL certificate and upgraded my site to https. I've been trying to change my .htaccess file to do 3 things:

1. For all pages, redirect everything to https://
2. For all pages, redirect non-www to www
3. For the index page, redirect /index.html to /

The first two steps seem to be working okay. But somehow, I've created an endless loop of / redirecting to itself and I haven't figured out how to get index.html to redirect to /.

I've looked through recent postings and have found answers on how to accomplish one or two of the requirements, but not all three. And, I haven't found mention of the endless loop problem that I've created for myself.

Words of advice greatly appreciated.



Options +FollowSymlinks

RewriteEngine On

RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]

RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]

ErrorDocument 404 https://www.example.com/custom404.html

# Permanent URL redirects
Redirect 301 /about.html https://www.example.com

not2easy

2:03 pm on Jan 30, 2016 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



The first two rules can be combined into one. I don't see any rule for index.html though. The ErrorDocument line should never include the full URL, just the local path to the page.
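As a minimal sketch of that last point, assuming the error page sits in the site root under the filename used in the original post, the directive would look something like this:

```apache
# Local path: Apache serves the page itself and still returns a 404 status.
ErrorDocument 404 /custom404.html
```

With a full URL, Apache redirects the browser to the error page instead, so the client never receives the original 404 status.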

You should not redirect about.html to your home page. If the page no longer exists it should return a 404 error, unless you renamed "about.html" to "index.html" or its content is now shown at "index.html". Note that this is not the same kind of redirect as your other rules. It uses mod_alias, and you need to be careful about using mod_alias in combination with mod_rewrite, which is what your other rules use. It can give you unexpected results if used in the wrong order. If you need to use a mod_alias redirect, put it before the https/www rewrite, which should be the last rule in your htaccess. Having the https/www rewrite higher in the order of rules can cause loops as the same URL request gets re-processed.

SueF

4:59 pm on Jan 30, 2016 (gmt 0)

10+ Year Member



Thank you for the easy-to-understand answer! A few questions before I make the changes.

Does the mod_alias redirect go before or after the "RewriteEngine On" line?
The content of the new index page is almost identical to the content of the old about.html page and I don't want to lose the power from external website links pointing to the old about.html page. Is there a better way to redirect for this purpose?
I found some code to redirect the index.html page to / and it seems to be working.

RewriteRule ^index\.html$ / [R=301,L]
RewriteRule ^(.*)/index\.html$ /$1/ [R=301,L]


I assume it can be the third rule in the series. Is there a recommended order for the three rules?
How do I combine them?

(Sorry, I wasn't exaggerating when I said I was a complete novice.)

Thank you again for your help.

not2easy

6:08 pm on Jan 30, 2016 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Combining the two rules:
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [OR]
RewriteCond %{HTTPS} !on
RewriteRule (.*) https://www.example.com/$1 [R=301,L]

To understand clearly what these rules are doing, it helps to know that '!' at the start means 'not', the '^' caret means 'start' and the '$' means 'end', so the first condition reads "if the hostname is not exactly www.example.com". At the top of this thread there is a button for Forum Options; that dropdown has links to this forum's Charter and Library, great resources to help the novice learn a lot more about how htaccess works and the order of things.
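For reference, here are the same three lines again with a comment on each. This is only an annotated sketch; example.com is the placeholder hostname used throughout the thread.

```apache
# Condition 1: the Host header is something other than "www.example.com".
# The "( )?" part also lets an empty Host header through without a redirect.
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [OR]
# Condition 2: the request did not arrive over HTTPS.
RewriteCond %{HTTPS} !on
# If either condition is true, send one 301 to the https + www address,
# keeping the requested path in $1.
RewriteRule (.*) https://www.example.com/$1 [R=301,L]
```

The [OR] flag matters here: without it, both conditions would have to be true before the rule fired.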

The two different types of redirect (mod_alias and mod_rewrite) can conflict, which is the main reason to avoid using them in combination if possible. Each "mod" or module in Apache works in a predefined order. Some may differ from one host to another, but you can be pretty sure that mod_rewrite will process before mod_alias. So you should not be using a mod_alias redirect that needs to pass through another rule in mod_rewrite. If you have to use a mod_alias redirect, be sure it comes after the https/www mod_rewrite rule.

If you wait a bit, an actual authority type person may come through with better details, I'm using notes to remind me of things. Apache details and regex aren't my forte.

lucy24

10:28 pm on Jan 30, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



As noted above, do not combine mod_alias and mod_rewrite for redirects; they don't play nice together. Your server won't break, but things won't happen in the right order. Once you've got anything using mod_rewrite, you'll need to change any existing mod_alias rules (Redirect or RedirectMatch) to use mod_rewrite instead, and then make sure everything is in specific-to-general order.
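As a rough sketch of that conversion, the Redirect line from the original post could be rewritten for mod_rewrite and placed ahead of the general https/www rule. The pattern assumes the .htaccess file sits in the site root, and the order shown is just one way of following the specific-to-general advice.

```apache
RewriteEngine On

# Specific rule first: the old about page now lives at the home page.
# Stands in for "Redirect 301 /about.html https://www.example.com".
RewriteRule ^about\.html$ https://www.example.com/ [R=301,L]

# General rule last: anything not already on https + www.
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [OR]
RewriteCond %{HTTPS} !on
RewriteRule (.*) https://www.example.com/$1 [R=301,L]
```

Because the specific rule already points at the final https/www address, a request for /about.html is redirected once rather than twice.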

Your index.html redirect needs an extra flag to avoid an infinite loop:
RewriteRule ^(([^/]+/)*)index\.htm https://www.example.com/$1 [R=301,L,NS]
See the [NS] flag? That's for "no subrequest", meaning "do not invoke this rule-- and don't evaluate Conditions-- when there has been an internal request for the file 'index.html'" (invoked by the DirectoryIndex directive, which is the whole reason the / form works at all).

An alternative format for the index redirect is
RewriteCond %{REQUEST_URI} ^/(([^/]+/)*)index\.htm
RewriteRule index\.htm https://www.example.com/%1 [R=301,L,NS]
This two-steps-forward, one-step-back approach means that the server doesn't need to do any capturing except in the rare case where the request actually involves "index.html" and then only when it wasn't an internal subrequest. It may be a teensy weensy bit more efficient, since it's a relatively complicated capture with a bit of back-and-forthing.

Note that I've said "index\.htm" without closing anchor. This achieves two extra things: It covers requests for "index.htm" (rare, but can't hurt) and also requests with extra garbage in the path after ".html" (rare etcetera).

If your URLs never contain literal periods-- extensions and sitename don't count-- then replace [^/] with [^./] to get the server out of there a wee bit faster.
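For illustration, the narrowed character class would slot into the first form of the rule like this. It is a sketch that assumes no directory name on the site contains a period, and it points at the https address because that is where this site is meant to end up.

```apache
# [^./]+ stops at either a slash or a period, so a request such as
# "page.html" fails the directory part of the pattern sooner.
RewriteRule ^(([^./]+/)*)index\.htm https://www.example.com/$1 [R=301,L,NS]
```

If any directory on the site does have a period in its name, stay with the [^/] version, since this one would no longer match index.html requests inside that directory.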