Content duplication in Google. How to fix?



9:54 am on May 30, 2013 (gmt 0)

I run an online store that has a duplicate content problem in Google.
The category pages have many filter, sorting and display options, and each option modifies the URL. The plugin that sets rel=canonical for these pages was configured improperly: instead of pointing to the base category page with the products (as it should), the canonical pointed to a nonexistent page. Google ignored it and indexed many duplicate pages (we have about 6,000 real pages and Google thinks there are about 42,000). Because of this, only about 200 of our pages are now in Google's main index and all the rest are in the supplemental index, so Google pretty much treats the whole site as low quality.
We have already fixed the plugin, so rel=canonical is now set properly. I also configured Google's crawler in Webmaster Tools to ignore all URL parameters.
I assume it will take Google a long time to drop the duplicates, so I have two questions:
1. Should I block the duplicate pages in robots.txt? I have heard it's not a good idea, because hiding almost the whole site looks suspicious to the algorithm.
2. Is there anything else I can do to get my site into the main index?
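
For readers following along, here is a minimal sketch of what a correct canonical for a filtered category URL might look like. It assumes that every query parameter is a filter, sort or display option, so the base category page is simply the URL with the query string removed; the domain and parameter names are hypothetical and this is not the store's actual plugin.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Drop the query string and fragment so every filtered view maps to the base page.

    Assumption: no query parameter changes the main content of the page.
    """
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link> element that would go in the <head> of the filtered page."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


# Hypothetical filtered category URL.
filtered = "https://shop.example.com/shoes?sort=price&color=red&view=grid"
print(canonical_link_tag(filtered))
```

If some parameters did select different content, they would have to be kept rather than stripped, so the rule above only fits a site where all parameters are presentational.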


6:52 pm on May 30, 2013 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

Welcome to the forums, websheff.

It sounds like you've taken the best steps. No, I would not block those URLs in robots.txt. If you do, Google will never see your new canonical links.
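
To illustrate the point, here is a rough sketch using Python's standard robots.txt parser. The robots.txt rule and the URLs are hypothetical: the standard parser only matches by path prefix, so the example assumes the duplicate URLs live under a `/filter/` path rather than being distinguished by parameters. A crawler that obeys the rule never requests the page, so it never reads the canonical link in that page's HTML.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets the agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical rule blocking the duplicate (filtered) pages.
ROBOTS_TXT = "User-agent: *\nDisallow: /filter/\n"

duplicate = "https://shop.example.com/filter/shoes?sort=price"
base_page = "https://shop.example.com/shoes"

for url in (duplicate, base_page):
    status = "crawlable" if is_crawlable(ROBOTS_TXT, url) else "blocked, canonical never seen"
    print(url, "->", status)
```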


7:00 pm on May 30, 2013 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

Sounds like you have it fixed.

It will take many months for the indexing to be fixed.

Keep an eye on Webmaster Tools and analytics, as well as the site logs, for anything going wrong.

Test a wide variety of valid and invalid URLs to make sure you get the right responses.
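
One way to run that kind of test is sketched below. It assumes "right responses" means HTTP status codes; the URLs and expected codes are hypothetical, and the helper names are made up for this example. Note that `urllib.request.urlopen` follows redirects, so `http_status` reports the status of the final page, not of an intermediate 301.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, List, Tuple


def http_status(url: str, timeout: float = 10.0) -> int:
    """Fetch the URL with a HEAD request and return the final HTTP status code."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as exceptions; the code is what we want.
        return error.code


def check_responses(
    expected: Dict[str, int],
    fetch_status: Callable[[str], int] = http_status,
) -> List[Tuple[str, int, int]]:
    """Return (url, expected_status, actual_status) for every URL that mismatches."""
    mismatches = []
    for url, want in expected.items():
        got = fetch_status(url)
        if got != want:
            mismatches.append((url, want, got))
    return mismatches


# Offline demonstration with a stand-in fetcher; swap in http_status for a live check.
EXPECTED = {
    "https://shop.example.com/shoes": 200,
    "https://shop.example.com/no-such-category": 404,
}
FAKE_SERVER = {
    "https://shop.example.com/shoes": 200,
    "https://shop.example.com/no-such-category": 200,  # a "soft 404" worth catching
}
print(check_responses(EXPECTED, FAKE_SERVER.__getitem__))
```

Passing the fetcher in as a parameter keeps the comparison logic testable without touching the live site.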


8:57 am on May 31, 2013 (gmt 0)

10+ Year Member

So in GWT you asked Google not to crawl any URL parameters?
