
Forum Moderators: phranque


http to https - Final Checklist

     
10:33 pm on Jan 23, 2015 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 17, 2002
posts:1187
votes: 6


Ready to convert a site from http to https. Would be grateful for a final pre-flight checklist as I want the change to be smooth!

The site is a 15-year-old information site with 350K pages, subscriber based, running Apache on CentOS. The company names listed below are not the real ones:

1. website is www.example.co.uk
2. Dedicated Server colocated at "ACME colo"
3. DNS is maintained by "XYZ domain register"

a) Is it easier to purchase the certificate from XYZ or a different supplier?

b) XYZ offer three types of certificates. £10, £50, £70, £250 per year. The £50 or £70 is the one to go for?

c) In httpd.conf I just need to:


RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI}


Note that the conf file already contains other redirects such as:


RewriteCond %{REQUEST_URI} ^(/[^/]+/)/+(.*)$ [OR]
RewriteCond %{REQUEST_URI} ^(/)/+(.*)$
RewriteRule ^. http://www.example.co.uk%1%2 [R=301,L]

RewriteCond %{HTTP_HOST} ^mail\.example\.co.uk$ [NC]
RewriteRule ^(.*)$ http://www.example.co.uk/$1 [R=301,L]


So do I leave that as-is, then add the http to https redirect after it? Or do I need to duplicate this section with another section for https?

d) What happens when subscribers first visit the site after the change over? Will there be an error or no change (apart from a redirect if they have bookmarked http://www.example.co.uk/ )

e) Can the change over be performed remotely? Is there a chance that the server could hang after a reboot?

f) Do I need to inform ACME colo that the site is changing to https? Would they need to enable something there to allow it?

g) Do I need to do anything in WMT? Google Analytics? Adsense?

h) Are there any issues with older browsers (XP / IE 6/7), iPhone 4?

i) Is it possible to setup https as a test only, leaving all users on http but a select few on https?

j) Connected with h) is it possible to back out of https if it goes wrong? Is there a point of no return?
10:56 pm on Jan 23, 2015 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15318
votes: 709


Addressing only the Apache aspects:

If the entire site is going to https, then change ALL existing redirects to give https in the target. The new http-to-https redirect goes after all other external redirects for this domain. If you like, you can combine it with your existing domain-name-canonicalization redirect, giving two possible [OR]-delimited conditions: "wrong form of hostname" OR "not https".

RewriteCond %{REQUEST_URI} ^(/[^/]+/)/+(.*)$ [OR]
RewriteCond %{REQUEST_URI} ^(/)/+(.*)$

Never put something in a Condition that can go in the body of the rule. This applies particularly to positive URL matches. But the details should probably go in the Apache subforum.

RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI}
Typo: missing [R=301,L] flags? Target should be explicit
[example.co.uk...]
or
[example.co.uk...]
to intercept anyone who gave the wrong form of the hostname. And there's no point in capturing the request if you're just going to say %{REQUEST_URI} at the end. If you do capture, make sure to leave off unwanted directory names. That's assuming the rule is lying loose in config, not in a <Directory> section corresponding to the site root.
3:14 am on Jan 24, 2015 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11468
votes: 174


Target should be explicit
[example.co.uk...]
actually since that is in the httpd.conf file the slash is extraneous so the target should be:
[example.co.uk$1...]

RewriteCond %{HTTPS} off

i prefer checking %{SERVER_PORT} instead of %{HTTPS}.

Note that the conf file already contains other redirects such as:

RewriteCond %{REQUEST_URI} ^(/[^/]+/)/+(.*)$ [OR]
RewriteCond %{REQUEST_URI} ^(/)/+(.*)$
RewriteRule ^. http://www.example.co.uk%1%2 [R=301,L]

RewriteCond %{HTTP_HOST} ^mail\.example\.co.uk$ [NC]
RewriteRule ^(.*)$ http://www.example.co.uk/$1 [R=301,L]



So do I leave that as-is, then add the http to https redirect after it? Or do I need to duplicate this section with another section for https?


i assume the purpose of that first ruleset is to strip extraneous consecutive slashes at the end of the directory path?
shouldn't this also redirect to https: protocol?
are any canonical urls non-https:?

you don't need the second ruleset if you replace it with the following, which should typically be the final ruleset among the external redirects:
# redirect secure pages to HTTPS:
# if requested with HTTP (i.e. server port is NOT 443)
# OR
# if requested with non-canonical hostname
# then 301 redirect to the canonical protocol and hostname

RewriteCond %{SERVER_PORT} !^443$ [OR]
RewriteCond %{HTTP_HOST} !^(www\.example\.co\.uk)?$ [NC]
RewriteRule ^(.*)$ https://www.example.co.uk$1 [R=301,L]
3:42 am on Jan 24, 2015 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15318
votes: 709


since that is in the httpd.conf file the slash is extraneous

I'd pictured this rule lying loose in the config file, where a full capture would end up something like
/sites/username/example.co.uk/directory/subdir/filename

... which is why you need to start your capture at just the right place.

I wondered about the seemingly gratuitous capture in
^(/)

until I saw that it's a sneaky way of collapsing two rules into one. (What is it with double slashes anyway? Just lately I've seen a whole flurry of them-- not just from malign Ukrainians but inexplicably from the Googlebot itself.)
9:03 am on Feb 10, 2015 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 17, 2002
posts:1187
votes: 6


OK, I got this in motion.

The redirect I used in the end was a simpler one just in the vhosts section:


NameVirtualHost *:80
<VirtualHost *:80>
ServerName www.example.co.uk
Redirect permanent / https://www.example.co.uk/
</VirtualHost>


Seems to be working well at the moment. I think the search engines may have picked up a lot of duplicates (http and https of the same page) while I ran live tests before forcing to https. Hope that does not take too long to sort out.
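For anyone following along: the port-80 redirect above needs an https virtual host to land on, which the post does not show. The following is only a minimal sketch of what that counterpart might look like. It assumes mod_ssl is loaded, and the certificate, key and chain paths are hypothetical placeholders rather than the poster's actual files.

```apache
# Assumes mod_ssl is loaded. On CentOS, "Listen 443" is usually already
# declared in conf.d/ssl.conf, so it is left commented out here.
# Listen 443

<VirtualHost *:443>
    ServerName www.example.co.uk

    SSLEngine on
    # Hypothetical file locations; substitute wherever your files actually live.
    SSLCertificateFile    /etc/pki/tls/certs/www.example.co.uk.crt
    SSLCertificateKeyFile /etc/pki/tls/private/www.example.co.uk.key
    # Intermediate chain from the CA, if one was supplied (hypothetical path).
    # On Apache 2.4.8 and later the chain can be appended to SSLCertificateFile instead.
    SSLCertificateChainFile /etc/pki/tls/certs/ca-chain.crt
</VirtualHost>
```

With only the port-443 block in place and no redirect on port 80, both protocols answer side by side, which is presumably how the live tests mentioned above were run.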
8:38 pm on Feb 10, 2015 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11468
votes: 174


did you remove all the mod_rewrite directives?
you should not combine mod_rewrite and mod_alias directives within the same configuration.
11:07 am on Feb 11, 2015 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 17, 2002
posts:1187
votes: 6


Not using any alias directives.

I left everything else as they were (apart from changing references for http : // www . example to https: )

---

The httpd.conf may not be ideal. It is based on a collection of solutions gleaned from WebmasterWorld and other resources.
12:21 pm on Feb 11, 2015 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11468
votes: 174


Redirect permanent / https://www.example.co.uk/

this is a mod_alias directive.
12:56 pm on Feb 11, 2015 (gmt 0)

New User

joined:Sept 22, 2014
posts:15
votes: 0


You will need to do a site address change in GWT. Switching to https is considered a completely new domain from the search engines' perspective.

As others said, everything needs to be 301 redirected just like you were moving to a new domain name.

Make sure all your images and js are configured in your SSL.conf. Also make sure you are not using any 3rd party js includes that don't support https. Otherwise users will get warning messages on every page load.

Good luck. :-)
2:11 pm on Feb 11, 2015 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 17, 2002
posts:1187
votes: 6


Phranque

Redirect permanent / https://www.example.co.uk/


That's the only one in there. It's in its own <virtualhost> section. The rest of the rewrites are all in the global section of httpd.conf

If you are not supposed to mix them what are the consequences? No error is being displayed.

Snippet

In GWT I just added [example.co.uk...] to the Change of Address->Add new site section. I could not find any other setting to change http to https; for example, you would think there would be such a setting in Site Settings->Preferred Domain, but there is only the option to choose the www or non-www domain.

As for 3rd parties the only issue I had was with google search api. I had to change that to link to their https: url as there was an issue with mixed content.

It has all been working fine for 48 hours now. There are no error messages in the logs and no user complaints (although if they can't see the site they can't complain!)

One point about the ssl configuration. I ran the Qualys SSL Labs test to ensure the server was configured correctly. I actioned the suggestions to improve the grade (by compensating for Poodle, and other security flaws with SSL). Doing so I achieved an A+. But looking at other sites in my sector - including those with £million budgets they are B, C and as low as F. So have I made the site so secure that it's going to block a lot of browsers?
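The post does not list the exact settings that were changed. Purely as an illustration, the kind of mod_ssl changes commonly made in response to POODLE look roughly like the sketch below. The cipher string and the HSTS max-age are assumptions for the example, not the poster's values, and mod_headers is assumed to be available for the last part.

```apache
# Goes inside the https virtual host (or ssl.conf). Assumes mod_ssl is loaded.

# POODLE mitigation: refuse SSLv2 and SSLv3, leaving only TLS enabled.
SSLProtocol all -SSLv2 -SSLv3

# Use the server's cipher preference and drop anonymous and MD5-based suites.
# This cipher string is an illustrative assumption.
SSLHonorCipherOrder on
SSLCipherSuite HIGH:!aNULL:!MD5

# HTTP Strict Transport Security, about six months (value assumed).
<IfModule mod_headers.c>
    Header always set Strict-Transport-Security "max-age=15768000"
</IfModule>
```

Dropping SSLv3 is the step most likely to shut out very old clients, since a browser that cannot negotiate TLS 1.0 or later has nothing left to connect with.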
3:37 pm on Feb 11, 2015 (gmt 0)

New User

joined:Jan 18, 2015
posts: 25
votes: 4


> I achieved an A+. ... have I made the site so secure that it's going to block a lot of browsers?

I had an A, which I believe I backed off to a lower grade, primarily by supporting SSL 23 (both 2 and 3). The problem solved, IIRC, was either IE under XP or XP period.
6:02 pm on Feb 11, 2015 (gmt 0)

Senior Member

WebmasterWorld Senior Member wheel is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Feb 11, 2003
posts: 5072
votes: 12


In terms of SSL certs, my opinion is that there are really only two choices: 1) the cheapest, if you just want SSL certification/encryption, or 2) EV SSL if you're going for the Cadillac version. I don't see a lot of differences between the various SSL cert flavors.

One other change to consider: move any backlinks you have control of over to https as well (email bloggers, backlink owners, etc.). Not sure it makes any difference, just good practice.
9:57 pm on Feb 11, 2015 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Apr 29, 2005
posts:2093
votes: 110


OMG!

Who was it that said converting to https was a couple of lines of code and hey presto it's done?

If I have to go through all the stuff on this post I'm going to pay someone! But I'll wait a few months to see if it is necessary, it certainly seems a waste of time for my informational sites.

Very useful post though, it's in my bookmarks.
10:30 pm on Feb 11, 2015 (gmt 0)

New User

joined:Jan 18, 2015
posts: 25
votes: 4


nomis5, there is a whole level of pain involved in acquiring a certificate that is not even discussed here because it appears the OP is past that. (1) I have seen at least one name-brand cert cover www.example.com and NOT cover example.com. The cheapest cert I could find covers both. Maybe the name brand expects you to buy a (much) more expensive wildcard. (2) You need to have a dedicated IP unless you can implement SNI, and then say goodbye to all XP visitors. (3) Some have reported onerous documentation requirements for more expensive certs. Again, the very cheapest is domain validated, and documentation basically consists of responding to an email tied to your domain. (4) The process of buying a certificate involves strange formats and file structures.
10:37 pm on Feb 11, 2015 (gmt 0)

New User

joined:Jan 18, 2015
posts: 25
votes: 4


followup: I have never seen (1) discussed on any sales page. If you go ahead and buy without asking, you only find out if both "" and "www." are covered after you make purchase.
11:45 pm on Feb 11, 2015 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15318
votes: 709


If you are not supposed to mix them what are the consequences? No error is being displayed.

No server error, just possible Unintended Consequences.

The problem is that you generally can't control the execution order of mod_rewrite and mod_alias. By default, mod_rewrite comes first; you could reverse them, but then you'd get a fresh set of problems. So an internal rewrite created by mod_rewrite risks encountering mod_alias and being sent back into the world as an external redirect, because mod_alias doesn't support conditions like %{THE_REQUEST}.

For safety's sake, don't use both mod_alias and mod_rewrite on the same path. That is, of course, for redirects and rewrites; mod_alias also does other jobs which won't conflict with mod_rewrite.

you only find out if both "" and "www." are covered after you make purchase

In practice, how does this affect site behavior? Nobody will ever use the "wrong" form of your name except robots and the odd type-in, and you'd be redirecting those anyway.
11:57 pm on Feb 11, 2015 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11468
votes: 174


If you are not supposed to mix them what are the consequences?


http://httpd.apache.org/docs/current/rewrite/avoid.html:
when there are Redirect and RewriteRule directives in the same scope, the RewriteRule directives will run first, regardless of the order of appearance in the configuration file
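To make the quoted behaviour concrete, here is a small contrived sketch of two directives in the same scope competing for the same URL. The /old-section/, /new-section/ and /archive/ paths are hypothetical and do not come from the site discussed in this thread.

```apache
RewriteEngine On

# Listed first, but handled by mod_alias, which runs after mod_rewrite.
Redirect permanent /old-section/ https://www.example.co.uk/new-section/

# Listed second, but mod_rewrite sees the request first, so this one wins:
# /old-section/page.html is sent to /archive/page.html, never to /new-section/.
RewriteRule ^/old-section/(.*)$ https://www.example.co.uk/archive/$1 [R=301,L]
```

No error is logged in a case like this; the Redirect line is simply never reached for URLs that a RewriteRule has already redirected.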
12:16 am on Feb 12, 2015 (gmt 0)

New User

joined:Jan 18, 2015
posts: 25
votes: 4


lucy24, this is my conjecture because I don't see a definitive discussion elsewhere. I think that .htaccess is applied *after* SSL negotiation. The "" and "www." domain name needs to authenticate before the http request payload is decrypted. That means unless the cert handles both, you lose the "odd type-in" traffic as you say. You lose me because I rarely specify "www." when typing a URL.
12:26 am on Feb 12, 2015 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11468
votes: 174


Nobody will ever use the "wrong" form of your name except robots and the odd type-in, and you'd be redirecting those anyway.

the secure connection must be negotiated before the request is sent to the server.
if the connection is rejected for the requested (non-canonical) hostname due to secure certificate issues, there is no way for the server to respond with a canonical hostname redirect.
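In configuration terms, this implies that the bare hostname needs its own working https endpoint before any redirect can be sent. A minimal sketch follows, assuming a certificate that is valid for both example.co.uk and www.example.co.uk; the file paths are hypothetical.

```apache
<VirtualHost *:443>
    ServerName example.co.uk

    SSLEngine on
    # Hypothetical paths. The certificate is assumed to list BOTH
    # example.co.uk and www.example.co.uk as valid names.
    SSLCertificateFile    /etc/pki/tls/certs/example.co.uk.crt
    SSLCertificateKeyFile /etc/pki/tls/private/example.co.uk.key

    # Only reached once the TLS handshake for example.co.uk has succeeded.
    Redirect permanent / https://www.example.co.uk/
</VirtualHost>
```

If the certificate covers only the www name, the browser shows its certificate warning during the handshake and the Redirect line never gets a chance to run.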
4:42 am on Feb 12, 2015 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15318
votes: 709


the secure connection must be negotiated before the request is sent to the server

phranque, you will probably need to keep telling me this every month or so until it sinks in, because I know you have explained it before :(
10:32 am on Feb 12, 2015 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 17, 2002
posts:1187
votes: 6


re: mixing mod_alias and mod_rewrite. I just read that Apache doc and it seems to contradict itself.

"A common use for RewriteRule is to redirect an entire class of URLs. For example, all URLs in the /one directory must be redirected to http://one.example.com/, or perhaps all http requests must be redirected to https.

These situations are better handled by the Redirect directive. Remember that Redirect preserves path information. That is to say, a redirect for a URL /one will also redirect all URLs under that, such as /one/two.html and /one/three/four.html."


So their advice is that the redirect is much better than the rewrite.

But later it does say:

"The use of RewriteRule to perform this task may be appropriate if there are other RewriteRule directives in the same scope. This is because, when there are Redirect and RewriteRule directives in the same scope, the RewriteRule directives will run first, regardless of the order of appearance in the configuration file."

I do not see this as a no-no. It is just information: in any conf that has both rewrites and redirects, the rewrites will run first. Not a danger, not a warning, just info.

I mean who does not have any rewrite in their conf? Every site probably has one. By doing so you are then 'not allowed' to use any kind of redirect?

As I said the site is working fine and the various rewrites I have in here are working fine.

Redirect 301 /articles/old_article_name.html [example.co.uk...]

If I open the page http://www.example.co.uk/articles/old_article_name.html it redirects perfectly to [example.co.uk...]

If I open the page [example.co.uk...] it redirects perfectly to [example.co.uk...]

So why is this wrong?
10:47 am on Feb 12, 2015 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 17, 2002
posts:1187
votes: 6


IPv6 - netstat
----

One curious issue is that netstat is now showing :ffff: foreign addresses. These are "IPv6 prefix for an IPv4 address mapped into IPv6 space".

I do not know why these are appearing. In the network config I have always had IPv6 disabled. But since the change to SSL most connections in netstat are now listed as :ffff:aaa.bbb.ccc.ddd

This is not a problem as such, but some of my scripts that monitor connection counts no longer work because they grep the netstat output for pure IPv4 addresses.
11:09 am on Feb 12, 2015 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 17, 2002
posts:1187
votes: 6


Going back to the opening points I can now answer some of the questions I asked.

a) Is it easier to purchase the certificate from XYZ or a different supplier?
I looked at various. Most were ridiculously expensive security outfits (Symantec), some were resellers of well known names (Comodo). In the end I chose a reseller who I have already worked with for 15 years.

b) XYZ offer three types of certificates. £10, £50, £70, £250 per year. The £50 or £70 is the one to go for?
After reading various bits of advice I went for the £50 option.

c) In httpd.conf I just need to ...
I went for the simplest solution - the redirect. So far there are no issues but see the points raised by phranque and lucy24 above.

d) What happens when subscribers first visit the site after the change over? Will there be an error or no change (apart from a redirect if they have bookmarked http://www.example.co.uk/ )
This was totally seamless. Some had not even noticed they were now connecting via https. I had expected the worst (certificate error "GET ME OUT OF HERE" yellow warnings) if I got it wrong but this just worked straight out of the box.

e) Can the change over be performed remotely? Is there a chance that the server could hang after a reboot?
Yes I did this remotely. SSHd into the server, created the certificate keys, certs, edited httpd.conf, restarted apache and it worked immediately.

f) Do I need to inform ACME colo that the site is changing to https? Would they need to enable something there to allow it?
I asked them and it was fine. They were not blocking 443.

g) Do I need to do anything in GWT? Google Analytics? Adsense?
Reading various resources, there is not much to do here. You just 'add a new site' in GWT for [example.co.uk...]

h) Are there any issues with older browsers (XP / IE 6/7), iPhone 4?
I tuned the settings to get an A+. I believe this will lock out older browsers (certainly XP IE 6/7, possibly 8). But the site menu system and various pages (bootstrap, jquery) do not work with those older browsers anyway, so those users are already restricted in some way. Besides, most of the IE6 traffic is rogue bots and scrapers, so they can whistle.

i) Is it possible to setup https as a test only, leaving all users on http but a select few on https?
Yes. This works perfectly fine. If you do not issue the redirect from http to https you can run both protocols at the same time. One is on port 80, the other port 443.

j) Connected with h) is it possible to back out of https if it goes wrong? Is there a point of no return?
Luckily I did not have to do this. I guess it would be easy enough. Creating the certificates does nothing to the system, so by removing the SSL config in Apache you can revert back to http.
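On point i): running both protocols in parallel only requires that no blanket redirect is issued on port 80. If a few testers should additionally be pushed onto https automatically, one possible approach (an assumption for illustration, not something described in this thread) is a temporary redirect keyed on client IP. The address 203.0.113.10 is a placeholder.

```apache
<VirtualHost *:80>
    ServerName www.example.co.uk

    RewriteEngine On
    # Placeholder tester address; every other visitor stays on http.
    RewriteCond %{REMOTE_ADDR} ^203\.0\.113\.10$
    # 302 rather than 301 so browsers do not cache the redirect during testing.
    RewriteRule ^(.*)$ https://www.example.co.uk$1 [R=302,L]
</VirtualHost>
```

Backing out, as in point j), is then just a matter of removing the condition and rule; nothing permanent has been cached by the testers' browsers.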
9:34 pm on Feb 12, 2015 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11468
votes: 174


with your current configuration if you request http://example.co.uk/articles/old_article_name.html you will get a chained redirect.
10:43 pm on Feb 12, 2015 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 17, 2002
posts:1187
votes: 6


phranque, this is what happened when I requested http://example.co.uk/articles/old_article_name.html


nnn.nnn.nnn.nnn - - [12/Feb/2015:22:48:59 +0000] "GET /articles/old_article_name.html HTTP/1.1" 301 263 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" 0 example.co.uk "-" "-"
nnn.nnn.nnn.nnn - - [12/Feb/2015:22:48:59 +0000] "GET /articles/old_article_name.html HTTP/1.1" 301 259 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" 0 www.example.co.uk "-" "-"
nnn.nnn.nnn.nnn - - [12/Feb/2015:22:48:59 +0000] "GET /blog/old_article_name.html HTTP/1.1" 301 259 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" 0 www.example.co.uk "-" "-"
nnn.nnn.nnn.nnn - - [12/Feb/2015:22:48:59 +0000] "GET /blog/nicer-article-name.html HTTP/1.1" 200 7711 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" 0 www.example.co.uk "-" "-"

5:57 am on Feb 13, 2015 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15318
votes: 709


Uh-oh, that looks like:
#1 without-www redirected to with-www, and/or http redirected to https (neither of these would be apparent in logs, but I have to assume that's what the first 301 means)
#2 /articles/ redirected to /blog/
#3 old filename redirected to new filename

If that's correct, your rules are in precisely backward order. Remember the lecture about ordering rules from most specific to most general? OK, maybe you don't, it's a staple of the Apache subforum.

The overall sequence of rules should be like this (simplified for illustration):

#1 redirect any specific names that will be changing
RewriteRule (blog|articles)/old_article_name.html https://www.example.co.uk/blog/nicer-article-name.html [R=301,L]

#2 redirect anything else that used to be in articles and will now be in blog
# with preceding RewriteCond if only selected names are moving
RewriteRule articles/(capture-stuff-here) https://www.example.co.uk/blog/$1 [R=301,L]

#3 mop-up redirect only for requests that have not been handled earlier
RewriteCond %{SERVER_PORT} !^443$ [OR]
RewriteCond %{HTTP_HOST} !^(www\.example\.co\.uk)?$ [NC]
RewriteRule /first-part-of-path/(.*)$ https://www.example.co.uk/$1 [R=301,L]


Remember, again, that mod_rewrite executes before mod_alias. Since anything with a RewriteCond can only be done in mod_rewrite, this in turn means that the earlier, more specific redirects must use mod_rewrite even if their content would otherwise allow mod_alias (Redirect by that name).


Question: Do there exist browsers, or IPs, or whatever the variable is, that can handle https pages but do not send a hostname with their request? If no, then the parentheses-and-question-mark element of the domain-name-canonicalization is no longer needed because the "or nothing" option is gone.
6:35 am on Feb 16, 2015 (gmt 0)

Junior Member

5+ Year Member

joined:Apr 17, 2009
posts:107
votes: 0



g) Do I need to do anything in WMT? Google Analytics? Adsense?


You need to add your HTTPS://www.yourdomain.com/ URL into WMT as a new profile that can start taking over the tracking.

Otherwise, what you will notice is a slump in search queries on the HTTP version whilst your onsite analytics isn't showing any declines.
10:38 am on Feb 16, 2015 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 17, 2002
posts:1187
votes: 6


Lucy24

I am loath to touch it at the moment. As it is all going well (in that users are not being locked out, the google god has not been upset) I would like to give this a few more days before changing anything.

There are other things in the conf file which may or may not be ideal. It's based on an original file I have had for years. It has been adapted for optimisation and security. Rewrites were added (some of which may no longer be needed). I would post it up in its entirety but it's probably too big!

-----------------------

EvilSaint

OK, I created a new profile. So I now have two profiles?

http: example
https: example

The http site has current information (5,800 searches, 90 crawl errors, 390,000 pages submitted ...)

The https site data is totally blank.

I guess this needs time to change over? The http site will gradually revert to zero whilst the https site builds up?

--

As a side note: Before I did this change today the crawl rate changed from 10,000 pages a day to 60,000 pages a day.