Forum Moderators: Robert Charlton & goodroi

Google indexing pages twice (http & https), only 2 should be https

     
7:36 pm on Nov 11, 2014 (gmt 0)

New User

5+ Year Member

joined:Apr 23, 2014
posts:3
votes: 0


I just launched this site 2 weeks ago. There are two registration-like pages that I want to be secure. I'd prefer to keep the rest of the site non-secure (http).

A day or two after the site went live, it was indexed in Google. The problem is that every page was indexed twice: once with http and once with https.

I've looked around for threads on the proper way to use https, but everything I've read has been all or nothing. In my case, I want to use both http and https on the same site.

5 days ago I created two robots.txt files, one served for non-secure and the other served for secure. The secure robots.txt disallows everything (I don't mind if the 2 secure pages are not indexed).
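For reference, the two-file setup described above can be sketched in Python as follows. This is only an illustration, assuming the server can tell which scheme the request arrived on; the function name `robots_txt_for` is made up for this example and does not come from any library.

```python
def robots_txt_for(scheme: str) -> str:
    """Return a robots.txt body for the given request scheme.

    Hypothetical helper: "https" gets a file that disallows everything,
    anything else gets a file that blocks nothing.
    """
    if scheme == "https":
        # "Disallow: /" asks compliant crawlers not to fetch any URL on this host.
        return "User-agent: *\nDisallow: /\n"
    # An empty Disallow value means nothing is blocked.
    return "User-agent: *\nDisallow:\n"


if __name__ == "__main__":
    print(robots_txt_for("https"))
    print(robots_txt_for("http"))
```

Note that robots.txt only controls crawling; on its own it does not guarantee that URLs already indexed will be dropped.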

As of today, every page is still indexed twice. I'm not sure if my http/https robots.txt is a solution, or if there's something else I should do. Has anyone run into something like this before?
12:10 pm on Nov 12, 2014 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member Top Contributors Of The Month

joined:Nov 2, 2014
posts:741
votes: 425


You may want to add the https version of the site to Webmaster Tools and remove the unwanted https pages yourself with the URL removal tool, which would be the fastest solution to your problem. You can also use rel=canonical to point to the appropriate page, which is good practice on all sites imo.
12:47 pm on Nov 12, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:June 19, 2008
posts:1337
votes: 121


huhwaitwhat,

first of all: is a registration page something that should be in the index? Is it really useful for the queries users make?
second: put a 301 redirect in the .htaccess for the http pages.
It takes time for Google to find and process the 301, but it works.
Don't put a rel=canonical link on the http page, as this is only a hint for Google and both versions would still be in the index.
Don't work with robots.txt, as this will not prevent Google from finding the http version.
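
As a rough sketch of the redirect decision, here is one reading of the 301 advice applied to the original poster's setup: the two registration pages are forced to https, and everything else is forced to http. The paths in `SECURE_PATHS` and the function name are assumptions made up for illustration. In practice this logic would live in .htaccess rewrite rules; Python is used here only to show the decision.

```python
from typing import Optional

# Hypothetical paths standing in for the two pages that should be secure.
SECURE_PATHS = {"/register", "/signup"}


def redirect_target(scheme: str, host: str, path: str) -> Optional[str]:
    """Return the URL to 301-redirect to, or None if no redirect is needed."""
    wanted = "https" if path in SECURE_PATHS else "http"
    if scheme == wanted:
        # Already on the preferred scheme, so serve the page as-is.
        return None
    return f"{wanted}://{host}{path}"


if __name__ == "__main__":
    print(redirect_target("https", "example.com", "/about"))
    print(redirect_target("http", "example.com", "/register"))
```

With a rule like this, each URL is reachable on only one scheme, so the duplicate version should eventually drop out once Google has processed the redirects.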
3:48 pm on Nov 12, 2014 (gmt 0)

New User

5+ Year Member

joined:Apr 23, 2014
posts:3
votes: 0


Thanks for your suggestions. I will keep this thread updated.

As of yesterday, everything was completely removed from the index (both http and https pages). I've removed the secure robots.txt.
9:39 pm on Nov 12, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month

joined:July 19, 2013
posts:1097
votes: 0


This is a situation where I'd use a rel=canonical rather than a 301. Why? For the visitors -- I know, crazy, but if I have a security certificate and someone wants to surf the site securely it costs me nothing to let them.

As far as SEs go, on the pages you want indexed [or at least shown in the results] with http, point the canonical to the http version, and on the registration page(s), point the canonical to the https version.

Note: It does not really matter if both versions show in a site: search or not if you're using rel=canonical, because Google groups duplicates together and then assigns the signals from all URLs to the one determined to be the canonical for ranking purposes, which you are indicating.
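
A minimal sketch of that canonical setup, assuming the page template can call a small helper when rendering the head of each page. `SECURE_PATHS`, the example paths, and `canonical_link` are all hypothetical names for illustration, not part of any framework.

```python
from html import escape

# Hypothetical paths standing in for the registration page(s).
SECURE_PATHS = {"/register", "/signup"}


def canonical_link(host: str, path: str) -> str:
    """Build the rel=canonical tag for a page.

    The tag is the same no matter which scheme the visitor used, so both
    the http and https copies of a page point at one preferred URL.
    """
    scheme = "https" if path in SECURE_PATHS else "http"
    href = f"{scheme}://{host}{path}"
    # Escape the URL so it is safe inside an HTML attribute.
    return f'<link rel="canonical" href="{escape(href, quote=True)}">'


if __name__ == "__main__":
    print(canonical_link("example.com", "/about"))
    print(canonical_link("example.com", "/register"))
```

Visitors can still browse either version; only the URL suggested to search engines changes.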
6:01 pm on Nov 19, 2014 (gmt 0)

New User

5+ Year Member

joined:Apr 23, 2014
posts:3
votes: 0


Thanks for the info, JD. I feel a bit better now.