Forum Moderators: Robert Charlton & goodroi

After HTTPS migration... what to do with robots.txt sitemap?

         

killua

7:42 am on Jun 21, 2017 (gmt 0)

10+ Year Member



I migrated to HTTPS today (site-wide 301 redirect successfully implemented). I'm confused, however, about the Sitemap link in my robots.txt file, since some people say to change it immediately while others say to leave it for a while:

Sitemap: http://www.example.com/sitemap.xml

The question is when I should change the line above in robots.txt to point to my new sitemap file, which contains https links:

Sitemap: https://www.example.com/sitemap2.xml

Should I wait about a month, until search engines are showing https in the index?

[edited by: Robert_Charlton at 8:35 am (utc) on Jun 21, 2017]
[edit reason] Changed domain.com to example.com... It can never be owned [/edit]

robzilla

8:45 am on Jun 21, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I would put up the HTTPS version straight away, so as not to give mixed signals.

How long it takes for the index to show all your pages as HTTPS depends on the size and popularity of your website. It could be a few days for a small site, or multiple months for larger ones.

Cross-referencing some anecdotal evidence from keyplyr in Am I going to be punished for NOT having a sitemap? [webmasterworld.com]:
so far I've switched approx 30 sites to HTTPS. A dozen or so had an existing sitemap. After I updated the file paths to HTTPS & submitted the new sitemaps to Google, Bing & Yandex, the new pages updated to the SERP a couple days faster & more inclusive than the sites that did not have sitemaps.
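To double-check which Sitemap line robots.txt is actually advertising after the switch, something like the following could be used. It is only a minimal sketch: the function names are made up for illustration, and it assumes the robots.txt content has already been read into a string.

```python
def sitemap_directives(robots_txt: str) -> list[str]:
    """Return the URLs named in every 'Sitemap:' line of a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


def non_https(urls: list[str]) -> list[str]:
    """Return the sitemap URLs that do not use the https scheme."""
    return [u for u in urls if not u.lower().startswith("https://")]


if __name__ == "__main__":
    robots = (
        "User-agent: *\n"
        "Disallow: /private/\n"
        "Sitemap: http://www.example.com/sitemap.xml\n"
    )
    # Any URL printed here is a Sitemap line still pointing at http.
    for url in non_https(sitemap_directives(robots)):
        print("still http:", url)
```

After the migration the second function should come back empty for the live robots.txt.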

killua

9:39 am on Jun 21, 2017 (gmt 0)

10+ Year Member



I see. Is it OK to use a different filename for the new sitemap in robots.txt? I mean, instead of the usual sitemap.xml, I'll use sitemap2.xml for the updated HTTPS file paths. The reason is that the regular sitemap.xml is currently being used in my Google Search Console for the http property, which many people recommend leaving in place for a while.

keyplyr

9:58 am on Jun 21, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



robzilla gave you good advice:
I would put up the HTTPS version straight away, so as not to give mixed signals
Change the paths in your sitemap.xml to HTTPS and resubmit to the big 3 SEs.

You may read lots of opinions saying to leave old sitemaps in place, but that is unwise. Only one sitemap should exist for any given set of files.
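For anyone who maintains sitemap.xml by hand, changing the paths can be scripted. The following is a rough sketch, not anything the posters above described using: the function name is hypothetical, and it assumes a standard sitemap in the sitemaps.org 0.9 namespace that has already been read into a string.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files (it is an identifier, not a page URL,
# so it intentionally stays http://).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def upgrade_sitemap_to_https(xml_text: str) -> str:
    """Return the sitemap XML with every http:// <loc> rewritten to https://."""
    ET.register_namespace("", SITEMAP_NS)  # keep the default namespace on output
    root = ET.fromstring(xml_text)
    for loc in root.iter(f"{{{SITEMAP_NS}}}loc"):
        url = (loc.text or "").strip()
        if url.startswith("http://"):
            loc.text = "https://" + url[len("http://"):]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    old = (
        f'<urlset xmlns="{SITEMAP_NS}">'
        "<url><loc>http://www.example.com/</loc></url>"
        "</urlset>"
    )
    print(upgrade_sitemap_to_https(old))
```

The output has no XML declaration line, so add one back before saving if your tooling expects it, and only then resubmit the file.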

not2easy

1:52 pm on Jun 21, 2017 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Once you have made sure that those old URLs are all being served as https you should update the robots.txt and serve the "same" sitemap (not serve two different sitemaps) so there is no confusion. I did not delete my old sitemap from the "old" GSC acct until the new https pages had all been indexed, but I did make sure that if I tried to visit the old http sitemap that I was served the new version.

If you can still access "http://www.example.com/sitemap.xml" then it needs some work on your 301s. Be sure to add the "new" https site in GSC and do not use the "Change of Address" form. That new account is where you can submit the new sitemap that may speed indexing.
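One way to test that is to request the old http sitemap URL without following redirects and look at the status and Location header. This is a sketch under assumptions: both function names are invented for illustration, a HEAD request is assumed to be answered the same way as a GET, and any query string on the URL is ignored.

```python
import http.client
from urllib.parse import urlsplit


def fetch_without_following(url: str, timeout: float = 10.0):
    """Send one HEAD request and return (status, Location header), following nothing."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def is_clean_https_redirect(status, location, original_url: str) -> bool:
    """True if the response is a 301 to the same host and path over https."""
    if status != 301 or not location:
        return False
    src, dst = urlsplit(original_url), urlsplit(location)
    return dst.scheme == "https" and dst.netloc == src.netloc and dst.path == src.path


if __name__ == "__main__":
    url = "http://www.example.com/sitemap.xml"
    status, location = fetch_without_following(url)
    print(status, location, is_clean_https_redirect(status, location, url))
```

A 200 on the http URL, or a 302 instead of a 301, would both show up as False here.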

Leave both sites in GSC so you can check the old version to make sure no pages are still being indexed as http. And within a week you should be able to see the indexed pages for the "new" https site climb as the old http: panel shows a similar decline.

lucy24

5:54 pm on Jun 21, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Sitemap: http://www.example.com/sitemap.xml

Unless you have coded an exemption for /sitemap.xml, nobody will ever see this sitemap anyway. All they'll see is the redirect. Why would you show search engines an outdated link?

It's especially troublesome if the sequence becomes
http://www.example.com/sitemap.xml
>> redirect to
https://www.example.com/sitemap.xml
(i.e. the identical file)
>> search engine continues to request everything they find in this file, with its old http:// addresses

File under: Mixed Signals. Or, worse yet, file under: Poor Technical Quality.
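A quick audit for that situation is to list any http:// addresses still sitting in the sitemap that the https URL serves. The sketch below makes some assumptions: the function name is hypothetical, the file uses the standard sitemaps.org 0.9 namespace, and the XML has already been fetched into a string.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap and sitemap index files.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def find_http_locs(xml_text: str) -> list[str]:
    """Return every <loc> value in the sitemap that still starts with http://."""
    root = ET.fromstring(xml_text)
    leftovers = []
    for loc in root.iter(f"{{{SITEMAP_NS}}}loc"):
        url = (loc.text or "").strip()
        if url.lower().startswith("http://"):
            leftovers.append(url)
    return leftovers


if __name__ == "__main__":
    served = (
        f'<urlset xmlns="{SITEMAP_NS}">'
        "<url><loc>http://www.example.com/old-page</loc></url>"
        "<url><loc>https://www.example.com/new-page</loc></url>"
        "</urlset>"
    )
    # Each URL printed is one the search engine would request and get redirected from.
    for url in find_http_locs(served):
        print("outdated:", url)
```

An empty result means the file no longer sends crawlers through the redirect for every page.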

killua

7:18 am on Jun 28, 2017 (gmt 0)

10+ Year Member



It's just a week since I initiated the site-wide 301 redirect, and so far my migration to HTTPS is going smoothly: the vast majority of my 500+ page site is now showing https in Google search.

There's just one thing I'm concerned about. I have two properties set up in Google Search Console: one for HTTP and the other for HTTPS. The HTTPS property of course has the new HTTPS sitemap. The HTTP property still has a record of the old HTTP sitemap, which it processed more than a week ago, before I did the migration.

Now, is it OK to use the delete button to remove this old sitemap in Google Search Console? I'm afraid to do it, since I'm not sure whether deleting a sitemap in the webmaster tools would tell Google to drop my entire site from the index.

keyplyr

7:42 am on Jun 28, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Just leave the old sitemap in the old http property. It doesn't hurt anything. Don't forget you have a 301... so everything, including requests for the sitemap, gets redirected.

not2easy

3:07 pm on Jun 28, 2017 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I disagree about leaving the old sitemap once the majority of pages are shown as indexed under the https account. If you are seeing "one group" of pages not being indexed, it can point you to things that need attention, such as a forgotten folder of images or css assets. When you see no patterns and the majority of pages are indexed as https, then the old sitemap can send confusing signals. I deleted the old one after the uptick in https continued and there were no anomalies shown. The larger the site, the longer that can take, but there's no reason to leave the old sitemap in GSC once it has served its purpose.

keyplyr

5:58 pm on Jun 28, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@not2easy - that's 2 different things.

The question was whether to remove the old sitemap listing in the old GSC property for HTTP, and it doesn't matter. That's the *old* property... all that info is obsolete. You can remove it all or let it stay.

There should only be one sitemap online. The one that includes the actual paths to your pages. In this case that sitemap is the one with HTTPS pages.

System

12:29 am on Jul 3, 2017 (gmt 0)

redhat



To keep this discussion more productive we are going to focus on https and sitemaps.

If you want to discuss https & traffic drops, please head over to this other thread google/4856186.htm [webmasterworld.com]

[edited by: goodroi at 1:01 am (utc) on Jul 3, 2017]