Forum Moderators: phranque

Problem redirecting http to main https site

Some redirects work, some don't

         

dstiles

3:41 pm on Jul 9, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Looking at one site but the same on several. Some URLs work but others don't. I thought originally it might be a port 80 problem but looking closer some of that port's accesses were working.

working
https://www.example-site.co.uk (actual site)
https://example-site.co.uk (without www - redirects to actual site)
http://www.example-site.co.uk (www with http - redirects to actual site)

working - backup domain - same words but no hyphen
https://www.examplesite.co.uk (redirects but warns of bad cert - this is ok)
https://examplesite.co.uk (redirects but warns of bad cert - this is ok)

not working
http://example-site.co.uk (non-www http main site)

not working - backup domain
http://www.examplesite.co.uk
http://examplesite.co.uk

The site config for example[-]site is...

# hyphen-less http
<VirtualHost nnn.nn.nn.nnn:80>
ServerName www.examplesite.co.uk
DocumentRoot /srv/site
ServerAlias examplesite.co.uk
Redirect permanent / https://www.example-site.co.uk/
# the next bit added by letsencrypt
RewriteEngine on
RewriteCond %{SERVER_NAME} =www.examplesite.co.uk [OR]
RewriteCond %{SERVER_NAME} =examplesite.co.uk
RewriteRule ^ https://www.example-site.co.uk%{REQUEST_URI} [END,NE,R=permanent]
</VirtualHost>

# hyphenless https
<VirtualHost nnn.nn.nn.nnn:443>
ServerName www.examplesite.co.uk
DocumentRoot /srv/site
ServerAlias examplesite.co.uk
Redirect permanent / https://www.example-site.co.uk/
RewriteEngine on
RewriteCond %{SERVER_NAME} =www.examplesite.co.uk [OR]
RewriteCond %{SERVER_NAME} =examplesite.co.uk
RewriteRule ^ https://www.example-site.co.uk%{REQUEST_URI} [END,NE,R=permanent]
</VirtualHost>

# hyphen (real domain) http
<VirtualHost nnn.nn.nn.nnn:80>
ServerName www.example-site.co.uk
DocumentRoot /srv/site
ServerAlias example-site.co.uk
Redirect permanent / https://www.example-site.co.uk/
RewriteEngine on
RewriteCond %{SERVER_NAME} =www.example-site.co.uk [OR]
RewriteCond %{SERVER_NAME} =example-site.co.uk
RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [END,NE,R=permanent]
</VirtualHost>

# https - the web site
<VirtualHost nnn.nn.nn.nnn:443>
ServerAdmin alert@example.co.uk
DocumentRoot /srv/site
ServerName www.example-site.co.uk
ServerAlias example-site.co.uk
<Directory "/">
AllowOverride None
Require all denied
</Directory>
<Directory "/srv/site">
DirectoryIndex index.php
AllowOverride All
Include /etc/apache2/setenv.conf
Include /etc/apache2/rewrite.conf
</Directory>
CustomLog ${APACHE_LOG_DIR}/site/access.log combined env=!dontlog

SSLCertificateFile /etc/letsencrypt/live/www.example-site.co.uk/fullchain.pem
SSLCertificateKeyFile /etc/letsencrypt/live/www.example-site.co.uk/privkey.pem
Include /etc/letsencrypt/options-ssl-apache.conf
</VirtualHost>

I have commented out and otherwise altered various of the above but only made it worse. I can find nothing in the include files that would cause a problem (they are concerned with bots of various hue).

Can anyone tell me where I've gone wrong, please?



[edited by: not2easy at 7:14 pm (utc) on Jul 9, 2019]
[edit reason] (hopefully) readability [/edit]

dstiles

3:49 pm on Jul 9, 2019 (gmt 0)


Sorry, forgot URLs would be trashed...

[https] www[.]example-site[.]co[.]uk (actual site)
[https] example-site[.]co[.]uk (without www - redirects to actual site)
[http] www[.]example-site[.]co[.]uk (www with http - redirects to actual site)

working - backup domain - same words but no hyphen
[https] www[.]examplesite[.]co[.]uk (redirects but warns of bad cert - this is ok)
[https] examplesite[.]co[.]uk (redirects but warns of bad cert - this is ok)

not working
[http] example-site[.]co[.]uk (non-www http main site)

not working - backup domain
[http] www[.]examplesite[.]co[.]uk
[http] examplesite[.]co[.]uk

lucy24

5:58 pm on Jul 9, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



For future reference, it can be example.anything, including

http://example.co.uk/
https://example.co.uk/
(i.e. chained tld)

I like to say example.old and example.new when two names are needed.

And now to pore over the real question ... (The first thing that comes to mind is that "-" is a non-word character, so \b comes into play, but I can't imagine how or why that would be relevant here.)

not2easy

7:23 pm on Jul 9, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



It is best to avoid example-anything and use the .tld space as lucy24 suggested to differentiate if at all possible.

IF it must be done, a space between https: and // does the same thing. Sorry for any confusion my edits may have caused.

topr8

7:30 pm on Jul 9, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



... as does wrapping everything inside [code] tags

lucy24

8:14 pm on Jul 9, 2019 (gmt 0)


... as does hiding a bit of markup like [ b][/ b] in a key location so you get https://some-random-word.com as opposed to [some-random-word.com...]

But we digress.

Now then, OP: is it necessary to use {SERVER_NAME} in the RewriteCond? Most of the time, {HTTP_HOST} is what's called for. After all, these redirects are concerned with users asking for things in the wrong form; it doesn't matter what your server is called, only what they think the site is called.

Incidentally, what's the NE flag for? Most of the time you only need that if you're redirecting to a fragment. That can't apply here, since it wouldn't be included in the request. Do your URLs include weird characters? (URLpath only; query strings aren't affected.)

Are you absolutely certain it's necessary and appropriate to combine mod_alias (Redirect by that name) and mod_rewrite (RewriteRule) in the same scope?

fwiw, the most common format for a canonicalization redirect when you're dealing with both hostname and protocol is
RewriteCond %{HTTPS} !on [OR]
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule (.*) https://www.example.com/$1 [R=301,L]
I like to put in an exception for robots.txt, but that's optional.
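
A minimal sketch of the robots.txt exception mentioned above, assuming the rules live in .htaccess or a <Directory> block (in <VirtualHost> context the matched path keeps its leading slash, so the pattern would need adjusting). The extra first condition is only one way to write the exception, not necessarily the form used by the poster; example.com is the same placeholder as above.

```apache
RewriteEngine On
# Let robots.txt through untouched on any hostname or protocol
RewriteCond %{REQUEST_URI} !^/robots\.txt$
# Redirect when the request is not https OR the hostname is not canonical
RewriteCond %{HTTPS} !on [OR]
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule (.*) https://www.example.com/$1 [R=301,L]
```

Conditions without [OR] are ANDed with what follows, so this reads as "not robots.txt AND (not https OR wrong host)".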

The hostname part is often expressed as (www\.example\.com)? with optional hostname. But if everything is inside VirtualHost envelopes, there's no point in making it optional (whee! a savings of three bytes!) because requests will never get that far when no hostname is specified. Save it for the default VHost, if you've even got one.

:: quick detour to docs to double-check something ::

Yes, R=permanent is permitted. But why bother, when R=301 saves you six bytes?

phranque

10:37 pm on Jul 9, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Redirect permanent / https://www.example-site.co.uk/
RewriteEngine on
RewriteCond %{SERVER_NAME} =www.examplesite.co.uk [OR]
RewriteCond %{SERVER_NAME} =examplesite.co.uk
RewriteRule ^ https://www.example-site.co.uk%{REQUEST_URI} [END,NE,R=permanent]

is the mod_alias directive doing anything that wouldn't be handled by the mod_rewrite directives?

i would get rid of the redundant Redirect directives as lucy24 suggested.

i would also get rid of the general hostname canonicalization rulesets from each of the <VirtualHost> containers and replace them with a single ruleset within the <Directory "/srv/site"> container, using something similar to what lucy24 suggested.
for example:
RewriteCond %{HTTPS} !on [OR]
RewriteCond %{HTTP_HOST} !^(www\.example-site\.co\.uk)?$ [NC]
RewriteRule (.*) https://www.example-site.co.uk/$1 [R=301,L]

phranque

10:46 pm on Jul 9, 2019 (gmt 0)


> not working
> http://example-site.co.uk (non-www http main site)
>
> <VirtualHost nnn.nn.nn.nnn:80>
> ServerName www.example-site.co.uk
> DocumentRoot /srv/site
> ServerAlias example-site.co.uk
> Redirect permanent / https://www.example-site.co.uk/
> RewriteEngine on
> RewriteCond %{SERVER_NAME} =www.example-site.co.uk [OR]
> RewriteCond %{SERVER_NAME} =example-site.co.uk
> RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [END,NE,R=permanent]
> </VirtualHost>
>
> Can anyone tell me where I've gone wrong, please?

example-site.co.uk is a valid ServerAlias, so if that hostname is requested it will also be the hostname of the redirect target as specified in the RewriteRule.

note that the mod_rewrite directives will fire first:
> ... when there are Redirect and RewriteRule directives in the same scope, the RewriteRule directives will run first, regardless of the order of appearance in the configuration file.

source: https://httpd.apache.org/docs/2.4/rewrite/avoid.html#redirect

dstiles

12:26 pm on Jul 11, 2019 (gmt 0)


Thank you all for the replies! Sorry not to have returned earlier.

Lucy:
> {SERVER_NAME} in the RewriteCond? Most of the time, {HTTP_HOST} is what's called for

The rewrite code was, as noted, written to the file by letsencrypt after my "Redirect permanent".

Recommendations vary and stackoverflow, for example, has long discussions on the subject; it's difficult to determine which should be used. It also seems that HTTP_HOST can include port identifiers if not the default (443 in this case), and I wonder if that could cause a problem. I have just tried HTTP_HOST on the "real" port 80/non-www url and it made no difference.

It's difficult to know what is being received because I cannot discover how to log the rejects; they do not appear in the site log nor the error log.

> what's the NE flag for?

No idea - as I say, letsencrypt's code. I assume they know what they are doing?

> if you're redirecting to a fragment.

What do you mean by "fragment"?

> Do your URLs include weird characters?

Not unless the browser is messing them up.

> combine mod_alias (Redirect by that name) and mod_rewrite (RewriteRule) in the same scope?

Not sure at all. As I said elsewhere, I'm a relative novice at apache and php. But I've tried removing each section and there is no change.

> dealing with both hostname and protocol is
> RewriteCond %{HTTPS} !on [OR]
> RewriteCond %{HTTP_HOST} !^www\.example\.com$
> RewriteRule (.*) https://www.example.com/$1 [R=301,L]

Doesn't make a difference. Sorry. :(

The http www version still works after making the above changes.

phranque:

Understood but still not working.

lucy24

5:08 pm on Jul 11, 2019 (gmt 0)


> What do you mean by "fragment"?
An in-page element expressed by # in the URL, referring to an id attribute or <a name> anchor in the html. If a redirect target includes a fragment (as when you combine multiple pages into one) you need the [NE] flag to keep the # character from getting percent-encoded. This is by far the most common reason for the flag; in fact I think it's the only example given in the docs.
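
To illustrate the fragment case, here is a small sketch assuming .htaccess context and two hypothetical file names (old-page.html merged into a section of new-page.html). Neither name comes from the thread.

```apache
RewriteEngine On
# Without NE the "#" in the target would be escaped to %23,
# and the browser would ask for a literal "new-page.html%23section-2"
RewriteRule ^old-page\.html$ /new-page.html#section-2 [NE,R=301,L]
```

A plain hostname or protocol redirect like the ones discussed here has no "#" in its target, so the flag does no harm there but is unlikely to be doing any work either.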

> HTTP_HOST can include port identifiers if not the default
Yes, that's exactly why a canonicalization redirect includes both opening and closing anchors. Requests can of course be handled by whatever port suits your fancy--but it shouldn't be part of the request, like
http://example.com:8080/rest-of-path

> It's difficult to know what is being received because I cannot discover how to log the rejects; they do not appear in the site log nor the error log.
Yes, this one's tricky. The only way I have ever figured out to log all details of redirected requests is to replace the R=301 with a rewrite to a quickie php script that first logs headers (including host and https) and then issues the redirect ... but this is obviously taking you into shooting-flies-with-an-elephant-rifle territory. But when a request receives a 301 response, it should definitely be showing up in the ordinary access logs, even if the said logs don't contain the information that would tell you why it was redirected.

Do your sites have separate access logs for http and https requests? This is common but not obligatory. Then, if a request shows up on the http side with a 301, you know that one reason for the redirect is that they used http instead of https--but you still don't know whether they concurrently gave the wrong www.
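
A lighter-weight option than the php script, sketched here as an assumption rather than anything from the posts above, is to add the requested Host header and the local port to an access log format. The nickname vhost_debug and the log file name are made up for the example; ${APACHE_LOG_DIR} is the variable already used in the original config.

```apache
# The usual "combined" fields plus the Host header as sent by the client
# and the port the request actually arrived on
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" host=\"%{Host}i\" port=%{local}p" vhost_debug
CustomLog ${APACHE_LOG_DIR}/site/redirect-debug.log vhost_debug
```

Placed in each <VirtualHost>, this at least shows which hostname and port a redirected request used. It cannot show requests that never reach Apache at all.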

dstiles

11:10 am on Jul 12, 2019 (gmt 0)


Lucy - thanks for the reply.

> An in-page element expressed by # in the URL

Ah. Yes.

I can see normal failed accesses in the logs (eg bad bots) but the failing ones here do not show at all - either they are being clobbered before they get as far as the apache site configs or I have something in the configs the accesses do not like. Can't see redirects as such, just the successful results - probably need to adjust the log to show rewrites - but not sure any redirect is even being considered, and my elephant gun is in for repair. :)

I've just modified logging to record rewrites. For https and no www I get 18 lines defining the redirect. For http without www I get nothing at all, so it's being stopped too early to be logged. I'll have another look at the other configs; no luck last time but it's always worth another look.
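
For anyone wanting to reproduce this kind of rewrite logging, the following is a sketch of how it is commonly enabled on Apache 2.4. It is not necessarily the exact setting used here, and the trace level is an arbitrary choice.

```apache
# Inside the <VirtualHost> being debugged; output goes to that host's error log
# Keep normal messages at "warn" but trace mod_rewrite's decisions
LogLevel warn rewrite:trace3
```

Trace output is verbose, so it is best removed once the problem is found.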

By the way, current virtual_host config for port 80 is...

<VirtualHost nnn.nn.nn.nnn:80>
ServerName www.example-site.co.uk
DocumentRoot /srv/site
ServerAlias example-site.co.uk
RewriteEngine on
RewriteCond %{HTTPS} !on [OR]
RewriteCond %{HTTP_HOST} !^example-site\.co\.uk$
RewriteRule (.*) https://www.example-site.co.uk/$1 [R=301,L]
</VirtualHost>

[edited by: phranque at 10:51 pm (utc) on Jul 12, 2019]
[edit reason] readability [/edit]

dstiles

4:34 pm on Jul 12, 2019 (gmt 0)


Found it!

There was a missing "listen 80" - I know I added it at some point but must have been in my confused period and it got removed again. :(
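
As a sketch of what the missing piece looks like: Listen belongs in the main server config, outside any <VirtualHost>. The layout below assumes a Debian-style install, where these lines usually live in ports.conf; that file name is an assumption about this server.

```apache
# Main server config, not inside a <VirtualHost>
Listen 80

<IfModule ssl_module>
    # Only open 443 when mod_ssl is loaded
    Listen 443
</IfModule>
```

Running apachectl -S afterwards lists the parsed virtual hosts per address and port, which is a quick way to confirm that the port 80 hosts are registered.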

The basic puzzle which let me forget this was that the real url with http worked! No idea why that was.

One last thing on this, for anyone else, is the SSL certificate. I ordered and applied one for www.domain only. With the basic redirect structure mentioned above, all URLs worked fine except for the "real" URL without the www, which asked for confirmation of an unexpected certificate. Obviously. Correct solution was to apply for a new cert containing both www and non-www but as an experiment I applied the following:

<VirtualHost nnn.nn.nn.nnn:443>
ServerName example-site.co.uk
DocumentRoot /srv/site
Redirect permanent / https://www.example-site.co.uk/
</VirtualHost>


and removed ServerAlias from the main site.

All is now fine. Thanks again to all who posted here!

[edited by: phranque at 10:48 pm (utc) on Jul 12, 2019]
[edit reason] unlinked urls [/edit]

phranque

10:56 pm on Jul 12, 2019 (gmt 0)


> By the way, current virtual_host config for port 80 is...

you probably want this instead:
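# Listen is only valid in the main server config, and only once per port
Listen 80
<VirtualHost nnn.nn.nn.nnn:80>
ServerName www.example-site.co.uk
DocumentRoot /srv/site
ServerAlias example-site.co.uk
RewriteEngine on
# in VirtualHost context the path starts with "/", so strip it to avoid "//" in the target
RewriteRule ^/?(.*) https://www.example-site.co.uk/$1 [R=301,L]
</VirtualHost>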

phranque

11:01 pm on Jul 12, 2019 (gmt 0)


the first RewriteCond is redundant because you already know it's port 80 (non-https) and the second RewriteCond is unnecessary because you are always redirecting any/all hostnames to the canonical protocol and hostname.

lucy24

1:19 am on Jul 13, 2019 (gmt 0)


> redundant because ... unnecessary because
That's an entertaining thought, not limited to RewriteConditions. Given an [OR]-delimited pair of conditions where the first condition will by definition always be met, meaning that the second condition isn't even evaluated, you end up with ... Whee!

phranque

2:24 am on Jul 13, 2019 (gmt 0)


> Given an [OR]-delimited pair of conditions where the first condition will by definition always be met,

true|foo = true

dstiles

12:36 pm on Jul 13, 2019 (gmt 0)


phranque:

> redundant/unnecessary

Thank you for stating the obvious, which I had missed entirely! :)

I have now modified the complete structure, which no longer uses the rewrite engine at all.


# hyphenless http
<VirtualHost nnn.nn.nn.nnn:80>
ServerName www[.]examplesite[.]co[.]uk
DocumentRoot /srv/site
ServerAlias examplesite[.]co[.]uk
Redirect permanent / https[://]www[.]example-site[.]co[.]uk/
</VirtualHost>

# hyphenless https
<VirtualHost nnn.nn.nn.nnn:443>
ServerName www[.]examplesite[.]co[.]uk
DocumentRoot /srv/site
ServerAlias examplesite[.]co[.]uk
Redirect permanent / https[://]www[.]example-site[.]co[.]uk/
</VirtualHost>

# hyphen (real domain) http
<VirtualHost nnn.nn.nn.nnn:80>
ServerName www[.]example-site[.]co[.]uk
DocumentRoot /srv/site
ServerAlias example-site[.]co[.]uk
Redirect permanent / https[://]www[.]example-site[.]co[.]uk/
</VirtualHost>

<VirtualHost nnn.nn.nn.nnn:443>
ServerName example-site[.]co[.]uk
DocumentRoot /srv/site
Redirect permanent / https[://]www[.]example-site[.]co[.]uk/
</VirtualHost>

# https - the web site
<VirtualHost nnn.nn.nn.nnn:443>
(etc)


Is there a rational/simpler way that combines port 80 and port 443 for the alternative domain, short of allowing all ports? If I allow all ports (:*) is that constrained by the Listen commands to 80 and 443?

phranque:
Apologies for not using the code feature before but I couldn't find it. Just found it in Preview.

dstiles

3:08 pm on Jul 13, 2019 (gmt 0)


Splitting off the non-www for the real site as above works for the test domain. It does not work for others in that the browser again complains of an inappropriate certificate.

One reason, I suspect, is that two domains share a certificate and browsers seem unable to differentiate. I have now generated separate certificates and added both www and non-www to each one. However, this only applies to two domains out of four, leaving one domain for which the error still happens. More certificate generation on the way. :(

I have re-combined the two 443 "real site" virtual_hosts and reinstated the ServerAlias.

lucy24

4:37 pm on Jul 13, 2019 (gmt 0)


Psst, dstiles, if you could just bring yourself to say “example” instead of “example-site” you wouldn't have to do the extra jiggery-pokery:
ServerName www.example.co.uk
DocumentRoot /srv/site
ServerAlias example.co.uk
Redirect permanent / https://www.example.co.uk/

dstiles

6:08 pm on Jul 13, 2019 (gmt 0)


And, after following online discussions when one of the sites didn't redirect from non-www to www and failed drastically when I tried splitting the 443 block into two again, the following seems to be the answer.

Forget my last posting. Revert to the previous coding. Except: the 443 non-www block (shown as last before the main site definition) should be moved to AFTER the 443 www block.