

Protocol types, https pages & Google indexing problems

Whole sites being dropped

         

Marcia

8:40 am on Apr 25, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Some people have been having a problem with Google indexing their https pages, apparently causing duplicate-content problems, with entire sites ending up dropped from the index.

I see in my FTP program that there are different Protocols, two of which are:

http access - Port 80
https access - Port 443

Is there a way, either through .htaccess or by having the host configure it at root level (as would have to be the case with shared virtual hosting), to prevent https pages from being fetched and indexed by limiting port 443 access, even though there may be some links to secure pages?

jdMorgan

3:00 pm on Apr 25, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You can detect search engines by user-agent and IP address, and redirect any of their requests for HTTPS port 443 to HTTP port 80. This is OK as long as the HTTP/HTTPS content is identical. If not, then it's deceptive cloaking.
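
As a rough sketch of the redirect described above, assuming mod_rewrite is enabled, that .htaccess overrides are allowed, and that the HTTPS and HTTP sites share the same document root. The hostname www.example.com is a placeholder, and only the user-agent is checked here; verifying the crawler's IP address is left out.

```apache
RewriteEngine On

# Act only on requests that arrived on the HTTPS port
RewriteCond %{SERVER_PORT} ^443$
# User-agent match only (assumption: Googlebot is the crawler of interest)
RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
# Send the crawler to the same path on the plain-HTTP site
# (www.example.com is a placeholder hostname)
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

Ordinary visitors still get the secure pages, since the rule fires only when both conditions match. On some hosts SSL is terminated by a proxy in front of the server, in which case the port seen by Apache may not be 443 and the first condition would need adjusting.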

Jim

Marcia

4:54 am on Apr 30, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Figured something like that, or maybe just putting all the https pages into a subdirectory or subdomain and excluding them altogether with meta robots and robots.txt. Could that work?
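
A variation on the robots.txt idea that would not need a separate directory, since robots.txt is fetched separately for each protocol and port: hand out a different robots file when the request comes in on port 443. This is only a sketch, assuming mod_rewrite is available in .htaccess and that a file named robots_ssl.txt (a made-up name) exists in the document root containing "User-agent: *" and "Disallow: /".

```apache
RewriteEngine On

# Requests arriving on the HTTPS port only
RewriteCond %{SERVER_PORT} ^443$
# Serve the restrictive file in place of the normal robots.txt
# (robots_ssl.txt is a hypothetical filename)
RewriteRule ^robots\.txt$ robots_ssl.txt [L]
```

Requests for robots.txt on port 80 are untouched, so the regular pages stay crawlable.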

Either way, I guess it's still not a good idea when people link to their https pages from their regular pages.

Thanks!

simonmc

1:53 pm on May 15, 2006 (gmt 0)

10+ Year Member



How can I configure my .htaccess to point Googlebot away from my https pages?

"You can detect search engines by user-agent and IP address, and redirect any of their requests for HTTPS port 443 to HTTP port 80. This is OK as long as the HTTP/HTTPS content is identical. If not, then it's deceptive cloaking."
Jim

I have the same content for https as I do for http, and this is causing my site problems. A site: search now shows the https version of my home page.

I can't put the https pages in a different folder, so I would just like to point Googlebot in the correct direction when it comes in on port 443.

Thanks