Hi, I want to describe my experience with canonicalizing duplicate pages on the e-commerce website I manage. It has around 200,000 pages.
I discovered that the site has two versions of each page: one with the .html extension and one without. Both were indexed in Google's SERPs, each with a self-referencing canonical tag. So, following basic SEO practice for avoiding duplicate content, my idea was to canonicalize one version to the other, in order to rank higher and avoid keyword cannibalization between the two pages.
So, finally we did it.
Both versions are still online (my boss doesn't want to remove either version, even though keeping both affects our crawl rate), but we canonicalized the pages without .html to the .html versions (even though the non-.html pages ranked better in the SERPs).
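To make the setup concrete, the rule we applied can be sketched roughly like this. It is only an illustration in Python, not our real template code: the function names and the example.com URLs are made up, and the handling of trailing slashes and query strings is my assumption.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Return the .html version of a page URL (the version we chose as canonical)."""
    parts = urlsplit(url)
    path = parts.path
    # Assumption: the home page and directory-style paths are left untouched.
    if path and not path.endswith("/") and not path.endswith(".html"):
        path += ".html"
    # Assumption: query string and fragment are dropped from the canonical URL.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def canonical_tag(url: str) -> str:
    """Build the <link> element that both versions of a page now carry."""
    return f'<link rel="canonical" href="{canonical_url(url)}">'


# Hypothetical URLs: both versions end up pointing at the .html page.
print(canonical_tag("https://www.example.com/shoes/red-sneakers"))
print(canonical_tag("https://www.example.com/shoes/red-sneakers.html"))
```

So the .html page is still self-canonical, while the page without the extension now points to its .html twin instead of to itself.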
After 6 months, the pages without the .html extension are no longer indexed, but the .html pages have not gained positions or traffic. So basically we just lost the traffic that was coming from the non-.html pages. I find this behavior very strange.
So, I have 2 questions:
1. Did I act the way an SEO expert would? In short, was my decision in line with SEO principles? (I think so, but Google didn't appreciate it.)
2. Can we go back and make both versions self-canonical again to try to recover the lost traffic? Would reversing the change hurt the indexation of the version we previously chose even further?
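For question 2, "self-canonical" means that each page's canonical tag points to its own URL again. Here is a minimal sketch of how one could check that on a page's HTML, using only Python's standard library. The class and function names are hypothetical, the sample HTML is invented, and the comparison is a plain string match (no normalization of trailing slashes or protocol), which is an assumption.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of the first <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens; values may be None.
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel:
            self.canonical = attr.get("href")


def is_self_canonical(page_url: str, html: str) -> bool:
    """True if the page declares its own URL as canonical (exact string match)."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical == page_url


# Invented sample: the HTML currently served by both versions of a page.
html = ('<html><head><link rel="canonical" '
        'href="https://www.example.com/shoes/red-sneakers.html"></head></html>')
print(is_self_canonical("https://www.example.com/shoes/red-sneakers", html))
print(is_self_canonical("https://www.example.com/shoes/red-sneakers.html", html))
```

With the current setup only the .html URL passes this check; after the reversal both URLs would pass it against their own HTML.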
Thank you in advance for all your answers.
Cheers