

Will installing an SSL certificate cause a duplicate content penalty with http & https?

1:44 am on Mar 5, 2017 (gmt 0)

Junior Member

10+ Year Member

joined:May 20, 2006
posts: 56
votes: 0


I'm running a VPS server and for many years I've been using http:// for my main website. To avoid www vs. non-www issues, duplicate content issues, etc. in Google, I currently have the code below in my .htaccess:

-----------
RewriteEngine On
Options +FollowSymlinks
RewriteBase /
RewriteCond %{HTTP_HOST} !^www.example.com$ [NC]
RewriteCond %{REQUEST_URI} !^/[0-9]+\..+\.cpaneldcv$
RewriteCond %{REQUEST_URI} !^/[A-F0-9]{32}\.txt(?:\ Comodo\ DCV)?$
RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]
-----------

Aside from the above, I use absolute paths showing http:// for internal links to deter content scrapers.

Now, recently, due to an update in WHM/cPanel, and because Comodo is now issuing free SSL certificates and AutoSSL is enabled by default, I noticed that I can now access the https:// version of my site. That is, if I type https:// manually, the browser states the connection is secure. That's a good thing, since I'm planning to migrate to https:// eventually and I believe my site is ready for it. But I don't plan to do the migration now, since that would be a very time-consuming process.

Given my conditions above, if I just leave AutoSSL enabled by default and the free cPanel certificate installed for my domain, will this cause duplicate content issues with search engines, especially Google? I'm not sure whether major search engines are smart enough to recognize that, although the https:// version of my site is accessible, my links are still all http:// . What is your suggestion?

[edited by: phranque at 8:55 am (utc) on Mar 6, 2017]
[edit reason] exemplified "mydomain" [/edit]

6:50 pm on Mar 5, 2017 (gmt 0)

Full Member

Top Contributors Of The Month

joined:Nov 13, 2016
posts: 348
votes: 49


I think Google has some latitude of tolerance with duplicate content when it's only the protocol that differs. However, it will likely start replacing your HTTP pages with the HTTPS versions in the SERPs. In any case, you should make things as clean as possible by setting up 301 redirects from the HTTP to the HTTPS pages, setting canonical links, etc.
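Purely as an illustration of the canonical-link option: besides a <link rel="canonical"> tag in each page, the canonical URL can also be sent as an HTTP header from .htaccess. A minimal sketch, assuming mod_headers is available; the file name and URL are placeholders:

# hypothetical example for one static page, pointing both the HTTP and
# HTTPS copies at the HTTPS URL (requires mod_headers)
<Files "index.htm">
Header add Link "<https://www.example.com/index.htm>; rel=\"canonical\""
</Files>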
11:57 pm on Mar 5, 2017 (gmt 0)

Junior Member

10+ Year Member

joined:May 20, 2006
posts: 56
votes: 0


Thanks. The site is a static information site with over 500 pages (all in .htm format). Does this mean we would need to create over 500 individual 301 redirect entries in .htaccess to retain search engine rankings? Or is there a simple bulk method?
12:08 am on Mar 6, 2017 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:7743
votes: 262


Does this mean we would need to create over 500 individual 301 redirect entries in .htaccess to retain search engine rankings? Or is there a simple bulk method?

Yes, a simple "bulk" method:

RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

This will redirect all requests to HTTPS.
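One caveat, hedged: if the same .htaccess is read for both the HTTP and HTTPS hosts (the usual shared-docroot cPanel setup), that single line on its own would also match requests that are already HTTPS and redirect them to themselves. The guard condition keyplyr adds later in this thread avoids that, e.g.:

# only redirect requests that did not arrive over HTTPS
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]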

And if you have any 3rd party (remote) content, you can also use this to redirect its protocol to HTTPS:

RewriteCond %{SERVER_PORT} ^80$
RewriteRule ^.*$ https://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]
12:20 am on Mar 6, 2017 (gmt 0)

Junior Member

10+ Year Member

joined:May 20, 2006
posts: 56
votes: 0


Thanks for the info keyplyr. Yes, there are still several pieces of remote content linking to http. So in my .htaccess file, I would just need to append the two lines of code so it becomes like below?

RewriteEngine On 
Options +FollowSymlinks
RewriteBase /
RewriteCond %{HTTP_HOST} !^www.example.com$ [NC]
RewriteCond %{REQUEST_URI} !^/[0-9]+\..+\.cpaneldcv$
RewriteCond %{REQUEST_URI} !^/[A-F0-9]{32}\.txt(?:\ Comodo\ DCV)?$
RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]

RewriteCond %{SERVER_PORT} ^80$
RewriteRule ^.*$ https://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]


With the above code, all traffic coming from http:// www and non-www, and all traffic coming from https:// non-www, will redirect to the https:// www version, while at the same time preserving Google rankings. Is this correct?
12:43 am on Mar 6, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13531
votes: 403


Oh, wait, you've already got a domain-name-canonicalization redirect (the one with HTTP_HOST). If so, the very easiest thing is to make the protocol redirect part of the same rule:
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [OR]
RewriteCond %{SERVER_PORT} ^80$
And then the rest of your existing first rule. The [OR] connector only applies to these two lines; nothing else will change.

Incidentally, the [NC] flag here is wrong. Since you have (correctly) expressed the host as a negative--"anything other than..."--that includes case sensitivity. Don't let people run around requesting "ExAmple.com" when you're really "example.com". And if anyone asks for EXAMPLE.COM in all caps, you can pretty well bank on their being a robot.

Literal periods should be escaped. Here it's a non-lethal error, but you should make a habit of it all the same.
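To spell out the [NC] point with a hypothetical request (only the hostname casing is varied here):

# e.g. a request with Host: WWW.ExAmple.com arriving over HTTPS (so the
# port-80 condition is not in play):
#   with [NC]    the pattern matches case-insensitively, the negation fails,
#                and the mixed-case hostname keeps being served
#   without [NC] the pattern fails, the negation succeeds, and the request
#                is 301'd to the canonical www.example.com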
12:47 am on Mar 6, 2017 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:7743
votes: 262


Yes, but there's a more succinct way of combining that. However I'm driving across the desert now. Maybe some other code jockey can help?
1:15 am on Mar 6, 2017 (gmt 0)

Junior Member

10+ Year Member

joined:May 20, 2006
posts: 56
votes: 0


Thanks. I cleaned up the code a bit. So that means it will be like below?

RewriteEngine On 
Options +FollowSymlinks
RewriteBase /
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [OR]
RewriteCond %{SERVER_PORT} ^80$
RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]
RewriteRule ^.*$ https://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]


Pardon my ignorance, as I don't completely understand complex code in .htaccess.

Also, I'm wondering about the rewrite rule that shows
https://%{SERVER_NAME}%{REQUEST_URI}
since it does not explicitly indicate that it should go to the www version. So will search engines see both www and non-www versions within https:// ? Or maybe I am reading it wrong?
1:25 am on Mar 6, 2017 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:7743
votes: 262


No, don't combine them like that.

This line:
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
needs to be complete

Also, I'm wondering about the rewrite rule that shows
https://%{SERVER_NAME}%{REQUEST_URI}

since it does not explicitly indicate that it should go to the www version

As I said above, you left off the [L,R=301]. The R=301 part redirects all requests. Neither you nor SEs will be able to access HTTP pages after you *correctly* install the code I gave earlier.
1:48 am on Mar 6, 2017 (gmt 0)

Junior Member

10+ Year Member

joined:May 20, 2006
posts: 56
votes: 0


I see. Upon reading more carefully, I now get what you meant about leaving my first rule in place. Therefore the end result in my .htaccess would be:

RewriteEngine On 
Options +FollowSymlinks
RewriteBase /
RewriteCond %{HTTP_HOST} !^www.example.com$
RewriteCond %{REQUEST_URI} !^/[0-9]+\..+\.cpaneldcv$
RewriteCond %{REQUEST_URI} !^/[A-F0-9]{32}\.txt(?:\ Comodo\ DCV)?$
RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]

RewriteCond %{HTTP_HOST} !^www\.example\.com$ [OR]
RewriteCond %{SERVER_PORT} ^80$
RewriteRule ^.*$ https://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]


As you suggested, I removed the [NC] since you mentioned it is redundant.
2:21 am on Mar 6, 2017 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:7743
votes: 262


I would do it this way (adding: RewriteCond %{HTTPS} !=on)
RewriteEngine On
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
RewriteCond %{SERVER_PORT} ^80$
RewriteRule ^.*$ https://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]
Then your other stuff

But since you are the server admin, can't you default to www in the server config, instead of using .htaccess?

[edited by: keyplyr at 2:39 am (utc) on Mar 6, 2017]
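For anyone curious what "in the config" could look like, a hypothetical sketch at the VirtualHost level; it assumes you can edit the vhost includes (on a cPanel box these are usually generated for you), and example.com is a placeholder:

<VirtualHost *:80>
ServerName example.com
ServerAlias www.example.com
# mod_alias: send everything arriving on port 80 to the canonical HTTPS
# host, preserving the rest of the requested path
Redirect permanent / https://www.example.com/
</VirtualHost>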

2:38 am on Mar 6, 2017 (gmt 0)

Junior Member

10+ Year Member

joined:May 20, 2006
posts: 56
votes: 0


Great. So it will end up being like this:

RewriteEngine On 
Options +FollowSymlinks
RewriteBase /
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
RewriteCond %{SERVER_PORT} ^80$
RewriteRule ^.*$ https://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]

RewriteCond %{HTTP_HOST} !^www\.example\.com$ [OR]
RewriteCond %{SERVER_PORT} ^80$
RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]


With regards to setting www in the config, I actually do not know how to do that. Many years ago, most tutorials on the internet used the .htaccess method, so that is how I have implemented it ever since.
2:46 am on Mar 6, 2017 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:7743
votes: 262


BTW - RewriteBase / is no longer necessary on most servers nowadays. And unless your redirects require module calls to apr_stat(), Options +FollowSymlinks is superfluous as well.
2:53 am on Mar 6, 2017 (gmt 0)

Junior Member

10+ Year Member

joined:May 20, 2006
posts: 56
votes: 0


Thanks. So in the end, it would simply be as below:

RewriteEngine On 
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
RewriteCond %{SERVER_PORT} ^80$
RewriteRule ^.*$ https://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]

RewriteCond %{HTTP_HOST} !^www\.example\.com$ [OR]
RewriteCond %{SERVER_PORT} ^80$
RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]
9:46 am on Mar 6, 2017 (gmt 0)

Preferred Member from GB 

10+ Year Member

joined:July 17, 2003
posts:591
votes: 4


Canonical tags?
9:46 am on Mar 6, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10747
votes: 43


Given my conditions above, if I just leave AutoSSL enabled by default and the free Cpanel certificate installed in my domain name, will this cause duplicate content issues for many search engines especially Google?

as indicated by Dimitri, google usually gets it right, eventually, maybe...
there are plenty of recent threads covering this in this forum. (Google SEO)

my preference is not to leave it up to the algo to sort out your problem.
refer only to canonical urls and serve only canonical urls.

most of the rest of this discussion (the technical solution) belongs in the Apache Web Server [webmasterworld.com] forum:
Does this mean we would need to create over 500 individual 301 redirect entries in .htaccess to retain search engine rankings? Or is there a simple bulk method?

there is a bulk method, but it would be interesting to know the reason for these conditions:
RewriteCond %{REQUEST_URI} !^/[0-9]+\..+\.cpaneldcv$
RewriteCond %{REQUEST_URI} !^/[A-F0-9]{32}\.txt(?:\ Comodo\ DCV)?$


these two rulesets are redundant:
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

RewriteCond %{SERVER_PORT} ^80$
RewriteRule ^.*$ https://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]

(you should put a blank line between rulesets)
the second ruleset will never fire.
check if HTTPS is off or check if the server port is 80 but no need to do both.

also i would be as specific as possible in the substitution string for your RewriteRules.
in other words, i would use www.example.com instead of %{HTTP_HOST} or %{SERVER_NAME} unless you can describe why you should be using the variables.

this is your latest version of the rulesets:
RewriteEngine On
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
RewriteCond %{SERVER_PORT} ^80$
RewriteRule ^.*$ https://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]

RewriteCond %{HTTP_HOST} !^www\.example\.com$ [OR]
RewriteCond %{SERVER_PORT} ^80$
RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]


with these rulesets, this is what will happen when a request is sent for http://example.com:
- the first request will fire the first ruleset
- the first response will be a 301 status code with a Location: header referring to https://example.com (noting that %{HTTP_HOST} will give you the Host Request header value)
- the second request (for https://example.com) will fire the third ruleset
- the second response will be a 301 status code with a Location: header referring to https://www.example.com
- the user agent won't see your content until after it has made a 3rd request.

here's how i would do this:
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [OR]
RewriteCond %{SERVER_PORT} =80
RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]

(the allowance for blank is intended to prevent an infinite redirection loop if the server gets an HTTP/1.0 request. HTTP/1.0 requests will not include a Host header, so the value will be blank.)
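Spelled out as a comment, assuming a client that never sends a Host header:

# GET / HTTP/1.0 with no Host header -> %{HTTP_HOST} is empty.
# Without the trailing "?" the negated host condition matches on every
# request, so the client is 301'd to https://www.example.com/, sends another
# Host-less request, matches again, and loops.
# With the "?" the empty value is accepted as canonical, so only the port-80
# condition can trigger the redirect and there is at most one hop.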

your final solution will depend on whether you still need those REQUEST_URI exclusions you started with in the OP.

[edited by: phranque at 9:51 am (utc) on Mar 6, 2017]

9:50 am on Mar 6, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10747
votes: 43


Canonical tags?

google usually gets it right, eventually, maybe...
there are plenty of recent threads covering this in this forum. (Google SEO)

i've read of plenty of cases where the canonical link was ignored (or misunderstood)...
10:36 am on Mar 6, 2017 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:7743
votes: 262


these two rulesets are redundant:

RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

RewriteCond %{SERVER_PORT} ^80$
RewriteRule ^.*$ https://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]

Thanks phranque. They are redundant in a sense, but the second gets rid of the non-secure error if using 3rd party remote apps or content that isn't HTTPS. I no longer have need for it so I don't use it, but I thought I'd present it in case someone did use HTTP remote content. The OP said they did.

[edited by: phranque at 1:00 pm (utc) on Mar 6, 2017]
[edit reason] unlinked patterns for clarity [/edit]

11:16 am on Mar 6, 2017 (gmt 0)

Full Member

Top Contributors Of The Month

joined:Nov 13, 2016
posts: 348
votes: 49


This is so much easier with Nginx...

* sorry for the trolling :o)
12:02 pm on Mar 6, 2017 (gmt 0)

Junior Member

10+ Year Member

joined:May 20, 2006
posts: 56
votes: 0


Thanks for the info, phranque. With regards to your question about the REQUEST_URI exclusions, those are actually added automatically by Comodo/cPanel before every RewriteRule, so unfortunately I won't be able to control that.

Nevertheless, I have updated my code as below and made sure there is a blank line between rulesets, as you've suggested.

RewriteEngine On 
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [OR]
RewriteCond %{SERVER_PORT} =80
RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]


@keyplyr, I actually misunderstood your statement previously as I thought you were referring to external links linking to my site via http:// . So I removed the redundancy since my internal linking structure can be https:// .
12:39 pm on Mar 6, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10747
votes: 43


the second gets rid of the non-secure error if using 3rd party remote apps or content that isn't HTTPS

i'm still not sure exactly how the second ruleset could fire or what problem it would solve.

in standard external web server configurations HTTP is on port 80 and HTTPS is on port 443.
it doesn't matter the source of the request - unless a port is specified in the request, it's either HTTP or HTTPS which also means it either came in on port 80 or port 443.

if the request specifies the port this will appear in the %{HTTP_HOST} string.
https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.23
The Host request-header field specifies the Internet host and port number of the resource being requested, as obtained from the original URI given by the user or referring resource


this means any requested url specifying a standard port is noncanonical and should be 301 redirected to the canonical hostname.
i.e. https://www.example.com:443/whatev should be redirected to https://www.example.com/whatev
this case would fire the 3rd ruleset in killua's "So in the end" version and would also be caught by my suggested ruleset.

if your server is listening to a nonstandard port for nonsecure requests from external apps, or you are serving nonsecure content on port 80 (as i suspect in the REQUEST_URI exclusions in the OP), then yes, there should be an exclusion added to that ruleset.
i'm not convinced there's a requirement for a 2nd ruleset until the excluded content is understood...
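If such an exclusion does turn out to be needed, here is a sketch of how the DCV patterns from the OP could be folded into the suggested ruleset (the conditions are ANDed, with the [OR] pair grouped together); whether cPanel actually requires this is the open question above:

# keep the cPanel/Comodo DCV check files out of the canonicalization redirect
RewriteCond %{REQUEST_URI} !^/[0-9]+\..+\.cpaneldcv$
RewriteCond %{REQUEST_URI} !^/[A-F0-9]{32}\.txt(?:\ Comodo\ DCV)?$
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [OR]
RewriteCond %{SERVER_PORT} =80
RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]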
12:57 pm on Mar 6, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10747
votes: 43


With regards to your question about the REQUEST_URI exclusions, those are actually added automatically by Comodo/cPanel before every RewriteRule, so unfortunately I won't be able to control that.

as long as you don't care if those resources are served HTTP or HTTPS as requested and as long as they aren't included content in a secure document i don't think those requests will be an issue.
i doubt google indexing of these resources matters, and if it doesn't, you might consider using robots.txt to exclude googlebot from crawling these urls if it is discovering them.

RewriteEngine On
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [OR]
RewriteCond %{SERVER_PORT} =80
RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]

with these rulesets, this is what will happen when a request is sent for http://example.com...
(please reread my answer above)

my internal linking structure can be https://

when googlebot crawls another site that refers to yours using a non-https and/or non-www url, that is the url they will request from your server.
it's your job to serve google the canonical resource in the fewest number of requests.
that first ruleset isn't helping you.
i would suggest using my version rather than yours:
RewriteEngine On

RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [OR]
RewriteCond %{SERVER_PORT} =80
RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]
1:10 pm on Mar 6, 2017 (gmt 0)

Junior Member

10+ Year Member

joined:May 20, 2006
posts: 56
votes: 0


So what you're saying is to simplify and have the lowest number of requests:

RewriteEngine On

RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [OR]
RewriteCond %{SERVER_PORT} =80
RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]


Will the above simple code also help me pass Google PageRank from external links, meaning other sites that link to mine using non-https and both www and non-www URLs?
1:32 pm on Mar 6, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10747
votes: 43


Will the above simple code also help me pass Google PageRank from external links, meaning other sites that link to mine using non-https and both www and non-www URLs?

if i understand your problem completely, this will redirect all requests for a non-canonical scheme (protocol) and/or hostname in one hop, which is google's suggested method.

this might be informative.
Change page URLs with 301 redirects - Search Console Help:
https://support.google.com/webmasters/answer/93633 [support.google.com]
9:48 pm on Mar 6, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13531
votes: 403


<tangent>
HTTP/1.0 requests will not include a Host header, so the value will be blank.

I was taught this too, and still have the (blahblah)? business in htaccess. But is it really true? Pulling up the most recent day's logs, looking for HTTP/1.0 and then cross-checking against header logs, everything does have a "Host:" header. (And it isn't added by my host*. I asked about it once.) In fact, any time you have more than one hostname on the same server, wouldn't you pretty well have to include a Host: header?

Not that those extra three bytes will do any harm in any case. Just asking.

* Yes, I do realize it is confusing to have two different meanings of the word "Host". Oh well.
</tangent>
10:36 pm on Mar 6, 2017 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:7743
votes: 262


HTTP/1.0 requests include a Host header at my server.
11:56 pm on Mar 6, 2017 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator robert_charlton is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2000
posts:11764
votes: 270


Aside from the above, I use absolute paths showing http:// for internal links to deter content scrapers.

Don't forget to change those absolute paths in your navigation from http to https when you decide to make the server changes.

1:02 am on Mar 7, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10747
votes: 43


I was taught this too, and still have the (blahblah)? business in htaccess. But is it really true? Pulling up the most recent day's logs, looking for HTTP/1.0 and then cross-checking against header logs, everything does have a "Host:" header. (And it isn't added by my host*. I asked about it once.)

HTTP/1.0 requests include a Host header at my server.


check out the protocol definition - Hypertext Transfer Protocol -- HTTP/1.0:
http://www.ietf.org/rfc/rfc1945.txt [ietf.org]
no Host header definition for HTTP/1.0...

there is nothing between the user agent and the server that would typically add a Host header to the request.
if it's not the server/host doing it, perhaps it's a non-HTTP/1.0-compliant user agent claiming to be HTTP/1.0-compliant.
unless you are blocking all HTTP/1.0 requests you should be prepared to avoid the recursion that will happen when there is a blank Host header value.

In fact, any time you have more than one hostname on the same server, wouldn't you pretty well have to include a Host: header?

remember that HTTP_HOST is a "HTTP headers" variable and SERVER_NAME is a "server internals" variable.
the UseCanonicalName Directive configures how the server determines its own name and port:
https://httpd.apache.org/docs/current/mod/core.html#usecanonicalname
this canonical server name value shows up in the SERVER_NAME variable but may not equal the HTTP_HOST variable value if the requested hostname is for another VirtualHost in the configuration.
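A small illustrative sketch of that distinction; the directive and variables are documented Apache behaviour, the hostname is a placeholder:

# With UseCanonicalName On, self-referential values such as %{SERVER_NAME}
# are taken from ServerName below, while %{HTTP_HOST} still echoes whatever
# hostname (and optional port) the client sent in its Host: header.
<VirtualHost *:443>
ServerName www.example.com
UseCanonicalName On
</VirtualHost>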
1:10 am on Mar 7, 2017 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:7743
votes: 262


check out the protocol definition - Hypertext Transfer Protocol -- HTTP/1.0:
http://www.ietf.org/rfc/rfc1945.txt [ietf.org]
no Host header definition for HTTP/1.0...
Dated May 1996. Again, HTTP/1.0 requests include a Host header at my server. Guess things have changed since that document was written.
1:37 am on Mar 7, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13531
votes: 403


remember that HTTP_HOST is a "HTTP headers" variable and SERVER_NAME is a "server internals" variable.

I meant that if you (the browser or robot) are sending a request to a server that happens to have multiple domains living on it, how else do you tell the server what domain/hostname you're aiming for, if not with a Host: header?

HTTP/1.0 requests include a Host header at my server.
Which, by amazing coincidence, happens to be the same as ... ;)

Now, it wouldn't surprise me if requests with no Host: header simply never got past the front door of the server--but clearly there exists some mechanism for HTTP/1.0 robots to include the header, even if they're not required to do so.