In January I created a dedicated sitemap file for the non-HTTPS property, containing 300,000+ URLs ('http://example.com/whatever', not 'https://example.com/whatever'), so that Google would crawl the HTTP URLs, see that they are 301-redirected, and drop them from its crawl list. But Googlebot is still crawling the HTTP URLs and even considers them "duplicate".
Search for how to redirect to https: or for one of the rules like RewriteCond %{HTTPS} !on and you can learn both how and why.

"Creating a 301 redirect for each http link towards the https version?"

Yes, but that doesn't mean you need to write out a separate rule for every page on the site! Make a single rule that captures the request and redirects it to the HTTPS version of the same request, as sketched below.
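For example, here is a minimal sketch of such a rule for an Apache .htaccess file, assuming mod_rewrite is enabled. Nothing in it is site-specific, since the host and path are carried over from the incoming request:

RewriteEngine On
# The request did not arrive over HTTPS...
RewriteCond %{HTTPS} !on
# ...so 301-redirect to the same host and path over HTTPS.
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

That one rule covers all 300,000+ URLs at once: whatever path is requested over HTTP gets redirected to the identical path over HTTPS.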
"I was also curious if someone would propose some canonical method."

If you can clarify your use of 'canonical' - whether it means the domain canonicalization that is typically handled in rewrite rules, or the canonical meta data that points otherwise duplicate pages at various URLs to the preferred version, as is common in WordPress - then we can offer suggestions.
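To illustrate the second sense: a page declares its preferred URL either with a <link rel="canonical" href="..."> element in its <head> (which WordPress emits per page) or, equivalently, with a Link HTTP header. Here is a hedged sketch of the header variant in the main Apache server config, assuming mod_headers is loaded and using the hypothetical /whatever path from the example URLs; since the href must name each specific page, it is scoped to a single path here rather than set site-wide:

# Declare the canonical URL for one page via an HTTP header
# (equivalent to <link rel="canonical" href="..."> in the HTML).
<Location "/whatever">
    Header set Link "<https://example.com/whatever>; rel=\"canonical\""
</Location>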