Forum Moderators: Robert Charlton & goodroi


Our https urls are now listed - the http versions are dropped!


Ellio

8:38 pm on Feb 26, 2006 (gmt 0)

10+ Year Member



We have just noticed that Google has indexed the https:// versions of our homepage and most important internal page.

This has resulted in the http:// versions being de-listed!

There are absolutely no links to the https:// addresses; they only work because the SSL cert is applied to the domain for use with our secure form pages.

How and why has Google incorrectly indexed the https:// versions of these pages? There must be thousands of sites with SSL certs applied where most pages do not use the secure versions.

We use a Windows server, and I am sure you cannot exclude https in robots.txt without excluding the http version in the process.
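For what it's worth, one workaround sometimes suggested (my assumption, not something confirmed in this thread) is to serve a different robots.txt depending on the request scheme, so that https requests get a disallow-all file while http requests get the normal one. A minimal Python sketch of the idea (the function name is illustrative):

```python
# Sketch: choose robots.txt content based on the request scheme.
# The idea is that a crawler fetching https://domain/robots.txt sees
# a disallow-all file, while http://domain/robots.txt stays permissive.

BLOCK_ALL = "User-agent: *\nDisallow: /\n"   # served over https
ALLOW_ALL = "User-agent: *\nDisallow:\n"     # served over http

def robots_txt(scheme: str) -> str:
    """Return disallow-all robots.txt for https, allow-all for http."""
    return BLOCK_ALL if scheme == "https" else ALLOW_ALL
```

On IIS this scheme check would have to be done in a script or rewrite rule rather than a static file, since both schemes share the same document root.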

Any ideas out there?

tedster

2:31 am on Feb 27, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Here's a thread from December that went into the https: indexing issue at some depth:
[webmasterworld.com...]

Ellio

5:42 pm on Feb 27, 2006 (gmt 0)

10+ Year Member



Thanks tedster,

The problem is that the thread assumed anyone with https pages indexed had links pointing to those pages.

In my case there are no links at all.

It was my understanding that when an SSL certificate is applied to a domain on a Windows server, any page requested with an https:// prefix will work in secure mode.

Are people saying that the SSL cert should be removed and applied to a subdomain or other domain instead?

tantalus

11:05 am on Feb 28, 2006 (gmt 0)

10+ Year Member



Just a thought, but do you have any AdWords ads pointing to those https URLs?

Web_speed

12:48 pm on Feb 28, 2006 (gmt 0)



Something similar has happened to one of my sites: not only "https" versions but URLs that have full shopping cart parameters in them. These URLs are not linked from anywhere, and there is only one way to trigger them; they appear only when a shopper actually goes through the checkout process.

I was thinking maybe Google was following the shopping cart's form-submit URL, but no: the form uses method=post, and when the URL has no other parameters in it (like quantity, date stamp, etc.) the script returns an error and exits with a blank page. Yet the URLs with all the required parameters appeared in Google under some of my pages.

Which leads me to believe that Google must be using the toolbar to collect URL data, which it later feeds back to the crawlers to check... this would perfectly explain why https and other dynamic URLs that are not linked from anywhere suddenly appear in the index.

What can be done about it? Not much really, except maybe writing to Google and explaining the situation.
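One other option (an assumption on my part, nothing anyone here has confirmed) would be to 301-redirect https requests for pages that are not meant to be secure back to their http versions, so any crawler that finds a stray https URL gets pointed at the canonical one. A rough Python sketch of the decision logic, with hypothetical path names:

```python
from typing import Optional

# Hypothetical set of pages that genuinely need to stay on https.
SECURE_PATHS = {"/checkout", "/secure-form"}

def canonical_location(scheme: str, host: str, path: str) -> Optional[str]:
    """Return a 301 redirect target if an https request hits a
    non-secure page; return None when no redirect is needed."""
    if scheme == "https" and path not in SECURE_PATHS:
        return f"http://{host}{path}"
    return None
```

The same check could be expressed as an IIS rewrite rule; the sketch just shows the rule a server-side script would apply.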

P.S.
BTW, I noticed the same thing happening to chitika.com. Up until about three weeks ago, their domain came up with the https version every time you did a search for "chitika"... the problem has gone away since.

Ellio

11:14 pm on Feb 28, 2006 (gmt 0)

10+ Year Member



I wrote to Google and their reply was predictable:

Something like: if you do not want your pages indexed, change your robots.txt file. And they helpfully included a robots.txt example showing how to ban all search engines from our site!

The helpdesk really hasn't got a clue. How you get to speak with an engineer is anybody's guess.

Thanks to Cyanide, I have now realised that we did have a single relative link on some of the secure pages. This means the link would have loaded our terms page over https (since the form page was loaded as https), and a relative link from that page back to the homepage would then also have been https!
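That scheme-inheritance behaviour can be demonstrated with Python's urllib.parse (example.com stands in for the real domain, and the paths are illustrative):

```python
from urllib.parse import urljoin

# A relative link on a page loaded over https resolves to https too.
form_page = "https://www.example.com/secure/form"      # loaded securely
terms = urljoin(form_page, "/terms.html")              # relative link on the form
home = urljoin(terms, "/")                             # relative link back home

print(terms)  # https://www.example.com/terms.html
print(home)   # https://www.example.com/
```

So a single relative link reachable from a secure page is enough to give a crawler an https:// path into the whole site.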

Changes have been made, so I hope it will get fixed soon.