Forum Moderators: goodroi
Both of the URLs above serve the same content.
So how can I stop Google from crawling my pages over https://?
And does this kind of issue do any harm to my site or my rankings?
From an old Google web page, assuming this is still accurate:
Each port must have its own robots.txt file. In particular, if you serve content via both http and https, you'll need a separate robots.txt file for each of these protocols. For example, to allow Googlebot to index all http pages but no https pages, you'd use the robots.txt files below.
For your http protocol (http://yourserver.com/robots.txt):
User-agent: *
Allow: /
For the https protocol (https://yourserver.com/robots.txt):
User-agent: *
Disallow: /
However, this is a problem if your HTTP and HTTPS sites share the same root directory; in that case you would need a small Perl or PHP script to serve up the proper robots.txt file depending on whether or not the request came in over the secure server.
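A minimal sketch of that script's logic, here in Python rather than Perl or PHP (the logic is identical in any of them). It assumes a CGI-style environment where the server sets an `HTTPS` variable to `on` for secure requests; the exact variable name depends on your server setup.

```python
import os

# The two robots.txt bodies from the Google example above:
# allow everything over HTTP, block everything over HTTPS.
ALLOW_ALL = "User-agent: *\nAllow: /\n"
DISALLOW_ALL = "User-agent: *\nDisallow: /\n"

def robots_txt(environ):
    """Return the robots.txt body for the request's protocol.

    `environ` is a dict of server variables; HTTPS=on is assumed to
    mark secure requests (a common but not universal convention).
    """
    secure = environ.get("HTTPS", "off").lower() == "on"
    return DISALLOW_ALL if secure else ALLOW_ALL

if __name__ == "__main__":
    # CGI-style output: header, blank line, then the body.
    print("Content-Type: text/plain\r\n")
    print(robots_txt(os.environ))
```

You would then configure the server to route requests for /robots.txt to this script so that each protocol sees its own version.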
And while you're at it, add some rules so that HTTPS pages are redirected if requested via HTTP, and HTTP pages are redirected if requested via HTTPS. That's just one of many "canonicalizations" you should do so that each page on your site is directly accessible by one and only one URL...
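On Apache, rules along these lines would do it with mod_rewrite. This is only a sketch: the /secure/ path and yourserver.com are hypothetical examples, and it deliberately leaves robots.txt out of the HTTPS-to-HTTP redirect so each protocol can still serve its own robots.txt as described above.

```apache
RewriteEngine On

# Pages that should be secure (here assumed to live under /secure/):
# if requested over plain HTTP, redirect to HTTPS.
RewriteCond %{HTTPS} !=on
RewriteRule ^secure/(.*)$ https://yourserver.com/secure/$1 [R=301,L]

# Everything else: if requested over HTTPS, redirect back to HTTP,
# except robots.txt, which must stay reachable on both protocols.
RewriteCond %{HTTPS} =on
RewriteCond %{REQUEST_URI} !^/secure/
RewriteCond %{REQUEST_URI} !^/robots\.txt$
RewriteRule ^(.*)$ http://yourserver.com/$1 [R=301,L]
```

Use 301 (permanent) redirects so search engines transfer any ranking signals to the canonical URL.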
Jim