Forum Moderators: Robert Charlton & goodroi
All links inside the shopping cart that link back to the site are fully qualified HTTP paths to avoid visitors getting stuck using SSL outside of the cart.
Additionally, all of my pages are canonicalized to the http: version, and I even use the canonical link element as well as redirects, so I figured it was safe, right?
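For anyone setting up the same thing, the canonical side is just a `<link rel="canonical" href="http://www.example.com/page.html">` in each page's head, and the redirect side can be done server-side. Here's a rough Apache sketch, assuming mod_rewrite is available; example.com is a placeholder, and you'd want to exclude the cart paths so the cart itself stays on SSL:

```apache
# Hypothetical sketch: 301 any SSL request for a non-cart page back to http:
# Assumes Apache with mod_rewrite; adjust the /cart/ exclusion to your setup.
RewriteEngine On
RewriteCond %{SERVER_PORT} ^443$
RewriteCond %{REQUEST_URI} !^/cart/
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]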
Recently Googlebot and MSNbot both snuck in the back door while I wasn't watching.
The solution to the problem was of course posted in WebmasterWorld:
[webmasterworld.com...]
I quickly deployed a robots_ssl.txt file per the instructions and MSNbot almost immediately quit requesting pages as it frequently checks robots.txt.
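For anyone who can't pull up that thread, the technique is roughly: keep a separate robots_ssl.txt that disallows everything, and serve it in place of robots.txt whenever the request comes in over SSL. A hedged Apache sketch, assuming mod_rewrite (the robots_ssl.txt filename follows the thread's convention):

```apache
# Serve the crawler-blocking robots file on the SSL side only.
# Assumes Apache with mod_rewrite enabled.
RewriteEngine On
RewriteCond %{SERVER_PORT} ^443$
RewriteRule ^robots\.txt$ /robots_ssl.txt [L]
```

where robots_ssl.txt itself contains nothing but:

```
User-agent: *
Disallow: /
```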
Googlebot, on the other hand, only checked once today, sometime right after midnight. It hasn't bothered to check again ALL DAY, is eating nothing but redirects to the HTTP server, and doesn't seem to take the hint.
Perhaps Google will self-correct sometime tomorrow.
I noticed Tedster commented:
The best practice is to install the secure certificate on a dedicated subdomain, such as secure.example.com
That's a good idea in theory but not very practical, as many ecommerce solutions use a single directory to house both SSL and non-SSL content, and most server admin control panels only let you choose whether SSL shares that directory or not; either way, it's the same domain name.
Basically, although I could implement secure.example.com, I think it's out of reach for the average webmaster running ecommerce.
However, this topic brought up an interesting question in my mind that I don't think I've seen debated before:
Does allowing Google to crawl your SSL server make it rate your server as slow(er), or does Googlebot account for the extra protocol overhead in its timings?
IMO, just the concept that you *MIGHT* get tagged as "slow" in the event of an SSL crawl is enough for everyone to deploy a robots_ssl.txt file ASAP!
Anyone have any evidence one way or the other on whether Googlebot crawling SSL has any impact in the SERPs?
Usually when someone brings me a site where https is being indexed, their rankings are already in the tank, so crawler speed is the furthest thing from their minds. If I get another example, I'll see if I can learn anything in this direction.