I do a site: search and right there in the #1 position is my home page with an https link.
And no, we do not link to the home page with https.
There are no inbounds that we can find linked to the https version. And I would have thought that by default, if the damn page and/or directory is not set up as https that there should be no reason for you to index it as https.
I'm on a Windows Server running IIS 6. I understand there is a way to prevent this on Apache, how do I do it on IIS?
I don't think there is anything new here pg1. The https indexing was first spotted in 2002:
Looks like an unfriendly linked to you with https in the URL. The same goes for the non-www and IP address versions of your homepage. It is a classic competitor trick to mess with your PR values.
I say: tweak it at the server level and that should fix it up.
We have just been through the same problem.
We fixed it by removing the SSL cert from the main site and serving the https pages on the .net version of our domain.
When the https page(s) show true 404 errors via the headers you can ask Google Help to remove the pages and they will do it quite quickly.
The Google remove tool will not work with https and would take ages anyway.
The next stage was to make ALL internal links absolute, because if you use relative links it only takes one site linking to your homepage with https for EVERY page in your site to get indexed as https. It happened to us.
Two weeks after it happened, our http homepage is starting to get indexed again and all the https pages have been removed.
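For anyone wanting to script the relative-to-absolute conversion, here is a minimal sketch in Python. The host www.example.com, the CANONICAL_BASE constant and the absolutize function are all made up for illustration; this is not the tool we used, just one way the rewrite could work.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit

# Hypothetical canonical address of the main (non-secure) site.
CANONICAL_BASE = "http://www.example.com/"


def absolutize(href, page_path="/"):
    """Return href as an absolute URL.

    Relative links are resolved against the page they appear on, and any
    link that lands on the canonical host is forced to the canonical
    scheme (http). Links to other hosts are left alone.
    """
    page_url = urljoin(CANONICAL_BASE, page_path)
    absolute = urljoin(page_url, href)
    parts = urlsplit(absolute)
    canonical = urlsplit(CANONICAL_BASE)
    if parts.netloc == canonical.netloc:
        parts = parts._replace(scheme=canonical.scheme)
    return urlunsplit(parts)
```

Calling absolutize("products/widget.html", "/shop/index.html") gives a full http URL under /shop/, so the link no longer inherits whatever protocol the visitor (or spider) arrived with.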
Google Help can sometimes be quite efficient; they simply pick and choose the subjects they are willing to assist with. Https pages are one of them!
Be careful if you use a Windows server via IIS.
Windows servers can only have one SSL cert per IP address.
If the SSL cert is removed from the main domain and a new/amended cert is added to a new subdomain with the same IP address a call for:
Therefore incorrectly indexed https main domain pages will not show a 404 and may not be removed.
You can get around the above by ensuring that the secure pages on the subdomain have different page URLs from the non-secure pages.
The index page will be a problem, as it will always be named index, so you may wish to delete the index page on the subdomain until Google has removed the incorrect [maindomain.com(...] listing from the index.
These are the answers I've gotten so far, which are pretty much in line with the above...
From the reading I've done so far, there is not an IIS server config solution to do this. You'll need to handle it one of 2 (or more) ways.
1. Global include above header as include file in all HTTP only pages.
2. Separate out the https and http application, example, make new subdomain "secure.example.com"
Then you'll need a cert for that domain, and all https traffic is written to [secure.example.com"...] and we disable SSL on the www domain, so you don't have to deal with the redirect code on top of each page or the competitor sabotage thing.
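To make option 1 a bit more concrete, the decision the per-page include has to make can be sketched like this. It is written in Python only to show the logic; on IIS 6 the real thing would be an ASP include checking the HTTPS server variable, and the host names and the redirect_target function below are hypothetical.

```python
# Hypothetical host names, for illustration only.
MAIN_HOST = "www.example.com"
SECURE_HOST = "secure.example.com"


def redirect_target(scheme, host, path, query=""):
    """Return the http URL to 301-redirect to, or None if no redirect is needed.

    Any https request that is not for the dedicated secure host is sent
    back to the same path on the main host over plain http.
    """
    if scheme == "https" and host.lower() != SECURE_HOST:
        url = "http://" + MAIN_HOST + path
        if query:
            url += "?" + query
        return url
    return None
```

For example, redirect_target("https", "www.example.com", "/index.asp") returns the plain http address, while a request to the secure host returns None and is served as normal. With option 2 and SSL disabled on www, this check should not be needed at all.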
And then I have to come back to my original question, why would Google index an https link like that? This technical stuff makes my brain hurt! ;)
1. somewhere there is a relative link from an https: page (so the protocol does not show in the base url)
2. someone is intentionally pointing https: links at a url.
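Cause 1 is easy to see with a quick sketch of how a relative link gets resolved. This uses Python's standard urljoin as a stand-in for what a browser or spider does; the example.com addresses and the resolve helper are made up for illustration.

```python
from urllib.parse import urljoin


def resolve(page_url, href):
    """Resolve href the way a crawler would, relative to the page it was found on."""
    return urljoin(page_url, href)


# A relative link found on an https page inherits the https scheme.
print(resolve("https://www.example.com/", "about.html"))
# An absolute link keeps its own scheme regardless of the page it is on.
print(resolve("https://www.example.com/", "http://www.example.com/about.html"))
```

The first call prints https://www.example.com/about.html and the second prints http://www.example.com/about.html, which is why a single https entry point plus relative links can pull a whole site into the index as https.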
Using a subdomain for the secure protocol is the solution that all my long term clients have in place -- their http: urls will not resolve with the https: protocol, so those urls don't get indexed, even if they appear somewhere on the web. Then on secure pages, any links to http: pages need to use full, absolute urls.
This is one of those areas where setting it up well from the start is a lot easier than fixing things after trouble develops. I'm still scrambling to uncover best practices for a repair job.
[edited by: tedster at 5:26 am (utc) on April 1, 2006]