Forum Moderators: phranque


Issues Encountered before and after Switch to HTTPS

         

guggi2000

10:02 am on Mar 20, 2017 (gmt 0)

10+ Year Member Top Contributors Of The Month




System: The following 44 messages were cut out of thread at: https://www.webmasterworld.com/webmaster/4836604.htm [webmasterworld.com] by phranque - 10:34 pm on Mar 21, 2017 (utc -7)


Do you think this gives a yellow warning in some browsers after switching to HTTPS?

<META property="og:image" content="http://www.example.com/images/logo.png" />

keyplyr

10:13 am on Mar 20, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Og tags in the HEAD should not cause an insecure-content warning in the browser if you have implemented a 301. All my og tags are HTTP as well.
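
For anyone who wants to check a page rather than take this on trust, here is a minimal Python sketch. It rests on the assumption that browsers raise mixed-content warnings only for resources they actually fetch while rendering (images, scripts, frames, stylesheets), and that og: meta values are read by crawlers and not loaded by the browser. The class name MixedContentFinder and its tag list are illustrative choices, not a complete audit.

```python
from html.parser import HTMLParser


class MixedContentFinder(HTMLParser):
    """Separate http:// URLs the browser loads from http:// URLs in meta tags."""

    # Tags whose URL attribute is fetched while the page renders (assumed list).
    LOADED = {"img": "src", "script": "src", "iframe": "src"}

    def __init__(self):
        super().__init__()
        self.loaded = []    # (tag, url) pairs that could trigger a warning
        self.metadata = []  # http:// URLs that only appear in <meta content=...>

    def handle_starttag(self, tag, attrs):
        values = dict(attrs)
        if tag == "meta":
            url = values.get("content") or ""
            if url.startswith("http://"):
                self.metadata.append(url)
            return
        attr = self.LOADED.get(tag)
        # Stylesheets are loaded too; other <link> types (canonical etc.) are not.
        if tag == "link" and "stylesheet" in (values.get("rel") or "").lower():
            attr = "href"
        if attr is None:
            return
        url = values.get(attr) or ""
        if url.startswith("http://"):
            self.loaded.append((tag, url))


finder = MixedContentFinder()
finder.feed(
    '<META property="og:image" content="http://www.example.com/images/logo.png" />'
    '<img src="http://www.example.com/images/logo.png">'
)
print("loaded over http:", finder.loaded)
print("metadata only:", finder.metadata)
```

Only the entries in `loaded` would need changing to https:// (or to relative URLs) before the switch.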

guggi2000

10:29 am on Mar 20, 2017 (gmt 0)

10+ Year Member Top Contributors Of The Month



Thanks...

One more: Do we need to update the Google Analytics? Do we need to add another property there?

IanCP

10:30 am on Mar 20, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



And BTW - the link you posted does not point to your account. It defaults to everyone's account when accessed.

I most certainly hope so. I also invite everyone to check their account. If you don't have one, then make one.

ChanandlerBong

10:31 am on Mar 20, 2017 (gmt 0)

10+ Year Member Top Contributors Of The Month



my biggest worry about switching to https is that I already have 2 or 3 301 redirects happening in my htaccess

1. .html >> .php
2. non-www to www
3. A few file specific ones: /old.php >> /new.php

and now I have to throw on top of that mess http >> https!

guggi2000

10:43 am on Mar 20, 2017 (gmt 0)

10+ Year Member Top Contributors Of The Month



@ChanandlerBong that is a very good point. However, keep in mind that after a while G and external links will point to https by default...

The question is where do you put the Http to Https Redirection? Do you put it in the beginning or end of the file?

keyplyr

10:53 am on Mar 20, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Discussions about .htaccess file code should be posted in the Apache Forum [webmasterworld.com]

guggi2000

11:22 am on Mar 20, 2017 (gmt 0)

10+ Year Member Top Contributors Of The Month



@keyplyr I think everyone reading this thread will have to do an https redirect. This is a very specific issue, not a general one. Determining where to place the https rewrite rule as a best practice can improve server performance for everyone.

phranque

11:23 am on Mar 20, 2017 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



1. .html >> .php
2. non-www to www
3. A few file specific ones: /old.php >> /new.php

and now I have to throw on top of that mess http >> https

The question is where do you put the Http to Https Redirection? Do you put it in the beginning or end of the file?

typically the order is external redirects before internal rewrites and then most specific ruleset to most general ruleset.
the specific redirects should specify the canonical protocol and hostname in the RewriteRule substitution string.
the general hostname canonicalization redirect should be the last redirect before the internal rewrites.
this ruleset can handle both www vs non-www and http vs https, an example of which i posted 2 days ago in this thread.
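
The ordering described above can be modelled in a few lines of Python. This is only a sketch of the decision logic, not Apache configuration: the names redirect_target and SPECIFIC_MOVES are hypothetical, and the three rules are the ones listed in the quoted post. Every rule returns the full canonical https://www URL, so the protocol and hostname never need a redirect hop of their own.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_SCHEME = "https"
CANONICAL_HOST = "www.example.com"

# Hypothetical file-specific moves (the most specific rules).
SPECIFIC_MOVES = {"/old.php": "/new.php"}


def _canonical(path, query):
    """Build a URL that always carries the canonical protocol and hostname."""
    return urlunsplit((CANONICAL_SCHEME, CANONICAL_HOST, path, query, ""))


def redirect_target(url):
    """Return the URL to 301 to, or None if no redirect is needed.

    The first matching rule wins, most specific first.
    """
    parts = urlsplit(url)
    path = parts.path or "/"

    # 1. Most specific: individual file redirects.
    if path in SPECIFIC_MOVES:
        return _canonical(SPECIFIC_MOVES[path], parts.query)

    # 2. More general: .html >> .php
    if path.endswith(".html"):
        return _canonical(path[: -len(".html")] + ".php", parts.query)

    # 3. Most general, last: http >> https and non-www >> www in one rule.
    if parts.scheme != CANONICAL_SCHEME or parts.hostname != CANONICAL_HOST:
        return _canonical(path, parts.query)

    return None


for example in (
    "http://example.com/old.php",
    "http://www.example.com/page.html",
    "https://www.example.com/about.php",
):
    print(example, "->", redirect_target(example))
```

Internal rewrites, if any, would come only after the point where this function returns None.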

keyplyr

12:00 pm on Mar 20, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Do we need to update the Google Analytics? Do we need to add another property there?
I haven’t used Google Analytics in years, so I don’t know for sure. I would imagine you would just connect your Analytics account to the new site using that tool in GSC.

You may get a better response in the Website Analytics Forum [webmasterworld.com]

guggi2000

12:10 pm on Mar 20, 2017 (gmt 0)

10+ Year Member Top Contributors Of The Month



Just started the switch, after a month of preparation... but there are always surprises.

Does anyone know how to redirect a folder that is served through Tomcat (forwarded through AJP) ?

Example:
http://www.example.com/webapp/...

(it can be in Tomcat, but we prefer to do it on the server level)

guggi2000

12:35 pm on Mar 20, 2017 (gmt 0)

10+ Year Member Top Contributors Of The Month



Ok. I got the answer through our hosting company.

(I know it is not a redirection forum, but you may want to add it to the checklist... because we forgot).

If you use Tomcat or another Framework that uses port forwarding under certain conditions, turn off these conditions in your HTTP Virtual Host file. Define these conditions only for HTTPS...

For example: if your rule forwards all incoming requests matching "http://www.example.com/webapp/*" to the WebApp, make sure that rule is not enabled when the request is HTTP. The request will then go through the normal port, and the htaccess will catch it and redirect as desired.

ADD TO CHECKLIST: Non-Apache based redirections

guggi2000

2:56 pm on Mar 20, 2017 (gmt 0)

10+ Year Member Top Contributors Of The Month



Another question regarding Google Search Console: What will happen if we never used sitemaps with the HTTP and will now NOT use them either?

robzilla

4:02 pm on Mar 20, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Another question regarding Google Search Console: What will happen if we never used sitemaps with the HTTP and will now NOT use them either?

Err... nothing? :-) Sitemaps are not mandatory.

guggi2000

4:34 pm on Mar 20, 2017 (gmt 0)

10+ Year Member Top Contributors Of The Month



Thanks Robzilla, yes they are not mandatory. But will the switch take longer? We have around 200,000 pages.

We have spent 4 days preparing a sitemap. But I am afraid it will do more harm than good... Google has always indexed us very well and every page has the correct inner links.

phranque

4:39 pm on Mar 20, 2017 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



ADD TO CHECKLIST: Non-Apache based redirections

these aren't redirections - this is a reverse proxy, which is more like an internal rewrite from the client point of view.

the redirection response provides a 301 status code and a Location header to the client.
with a reverse proxy, the client is mostly clueless about what happened behind the scenes.

i would describe this more accurately as:
move mod_proxy directives from the non-secure VirtualHost container to the secure VirtualHost container as required.
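
The difference is visible from the client side. A minimal Python sketch using only the standard library: http.client does not follow redirects, so it shows the raw status code and Location header. The function names probe and classify are made up for illustration, and the host and path in the usage comment are placeholders.

```python
import http.client

REDIRECT_CODES = {301, 302, 303, 307, 308}


def classify(status, location):
    """Label a response as an external redirect or as content served directly."""
    if status in REDIRECT_CODES and location:
        return "external redirect to " + location
    # Anything proxied or rewritten internally looks like this to the client.
    return "served without a redirect (status %d)" % status


def probe(host, path, tls=False):
    """Send one HEAD request and return (status, Location header or None)."""
    conn_type = http.client.HTTPSConnection if tls else http.client.HTTPConnection
    conn = conn_type(host, timeout=10)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


# Example (needs network access; host and path are placeholders):
# print(classify(*probe("www.example.com", "/webapp/")))
```

If the plain-HTTP request for the proxied folder reports no redirect, the request is still being handed to the backend before the htaccess redirect can act on it.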

robzilla

5:26 pm on Mar 20, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Does it matter much how long the switch takes? Until a page is recrawled, the HTTP version will continue to rank, and users will end up on HTTPS via your redirects. As far as I'm aware (or concerned), a sitemap is not necessarily going to speed up the crawling process. Every page in the index will be recrawled eventually. Once you've done your part, and done it well, I believe you can let Google figure it out; I wouldn't try to force anything.

guggi2000

9:47 pm on Mar 20, 2017 (gmt 0)

10+ Year Member Top Contributors Of The Month



Can anyone think of why Google Search Console says "robots.txt fetch error"?

We have only one robots.txt, with relative links. The location is the same for https and http, and the file can be accessed via

https://www.example.com/robots.txt

keyplyr

11:37 pm on Mar 20, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



What server response code is associated with the message you got?

Could be the syntax within the file. If so, use the tool to tell you exactly what the issue is.

Could be a retrieval issue. If so, make sure your file actually is where you think it is & resubmit.

guggi2000

11:50 pm on Mar 20, 2017 (gmt 0)

10+ Year Member Top Contributors Of The Month



The file is where it was and is unchanged. I assume it is a retrieval issue, unless HTTPS does not allow relative links for some reason...

Resubmitted, nothing yet.

GSC does not give the response. The file can be accessed perfectly through browser. No IPs are blocked on our server...

Stopped the 301 for now...

guggi2000

11:55 pm on Mar 20, 2017 (gmt 0)

10+ Year Member Top Contributors Of The Month



And by the way GSC is mixing up data between the properties...

It says for "robots.txt": Last seen 03/20 10:00 am . That is before the property even existed! And then the Dashboard shows red error on Robots.txt Fetch

lucy24

12:57 am on Mar 21, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



And then the Dashboard shows red error on Robots.txt Fetch

I think the question was: What sort of error? Don't rely on GSC to tell you. Make a note of the exact time you asked Google to stop by, and then check your logs for that minute. What numerical response code is recorded for their robots.txt request?
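
A rough Python sketch of that log check, assuming the access log is in Apache's common or combined format. The function name robots_fetches and the sample lines are made up for illustration (the IPs come from documentation ranges), and matching on the string "Googlebot" is a simple filter, not proof that the request really came from Google.

```python
import re

# Matches the start of a common/combined log line up to the status code.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)


def robots_fetches(lines, agent_hint="Googlebot"):
    """Return (timestamp, status) for each robots.txt request mentioning agent_hint."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue
        if match.group("path").split("?")[0] != "/robots.txt":
            continue
        if agent_hint not in line:
            continue
        hits.append((match.group("time"), int(match.group("status"))))
    return hits


# Made-up sample lines; in practice, pass the lines of your own access log.
SAMPLE = [
    '203.0.113.5 - - [20/Mar/2017:21:47:03 +0000] "GET /robots.txt HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
    '203.0.113.5 - - [20/Mar/2017:21:47:09 +0000] "GET /index.php HTTP/1.1" 200 9000 "-" "Googlebot/2.1"',
    '198.51.100.7 - - [20/Mar/2017:21:48:00 +0000] "GET /robots.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]

for when, status in robots_fetches(SAMPLE):
    print(when, status)
```

If the HTTPS site writes to a separate log, check both files. No matching line for the minute in question suggests the request never reached Apache at all.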

phranque

3:04 am on Mar 21, 2017 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



are you just submitting your new robots.txt or have you also tried the GSC testing tool?

Test your robots.txt with the robots.txt Tester - Search Console Help:
https://support.google.com/webmasters/answer/6062598 [support.google.com]

[edited by: phranque at 2:26 pm (utc) on Mar 21, 2017]

guggi2000

6:31 am on Mar 21, 2017 (gmt 0)

10+ Year Member Top Contributors Of The Month



The robots.txt works fine through the Tester. We have no IPs blocked either.

It is most probably a GSC bug, I will explain:

We added the HTTPS property of the site to GSC yesterday, March 20th. However, the robots.txt crawl error mentioned in the Dashboard shows a date 2 days ago (March 19th). We did not have that property on the 19th. Furthermore, we did not have SSL enabled 2 days ago...

So maybe Google stopped by to check a robots.txt on its own by just checking our site through https, kept the unsuccessful crawl somewhere, and then attached the error to the new HTTPS property. So, it's either very user unfriendly or a bug.

We stopped the 301 for now until we see that the property can be indexed. I hope there was no harm so far.

What do you think?

keyplyr

6:40 am on Mar 21, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I think if your other pages are accessible via HTTPS, a GSC robots.txt error probably has nothing to do with the 301 and it wouldn't be a reason to delay moving forward.

guggi2000

6:57 am on Mar 21, 2017 (gmt 0)

10+ Year Member Top Contributors Of The Month



We cannot ignore a GSC robots fetch error, because if it's a real error, Google will not index the https site. The error was not a 404 (does not exist). It was a robots.txt unreachable error. Maybe because we blocked all 443 traffic before going live with the SSL.

What if Google decided to check again in 1 month? We cannot redirect to a site that cannot be indexed...

guggi2000

7:12 am on Mar 21, 2017 (gmt 0)

10+ Year Member Top Contributors Of The Month



One more bizarre thing:

- GSC https "Crawl Stats" shows the stats of the http version. Isn't the new property supposed to be a completely new site?

- GSC https "Blocked Resources" says "Google has not yet processed your property."

Conflicting data

keyplyr

7:25 am on Mar 21, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If you're opening the correct report, there won't be anything there for a couple days.

Make sure you open the HTTPS version from that drop-down list at the top. As your pages get crawled, the info will start to populate the reports. Likewise, your old HTTP site info will die off.

guggi2000

7:32 am on Mar 21, 2017 (gmt 0)

10+ Year Member Top Contributors Of The Month



Make sure you open the HTTPS version from that drop-down list at the top.

I did, but thanks for pointing out that a little S could have gone unnoticed

If you're opening the correct report, there won't be anything there for a couple days.

Wrong, it shows the data from the non-https version... and a fetch error with a date prior to the definition of the property

Has anyone seen such a mix of data?

keyplyr

7:41 am on Mar 21, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



That could happen if you are turning on & off your 301. Leave the 301 active and give things a few days. It will all work out.