How long since you introduced the canonical link tags? Also, are all the duplicate URLs creating exactly the same content or is there some content variation?
The canonical tag is just a suggestion. I use it extensively and WMT still complains about multiple pages with the same title.
I resolved the problem by making all page variants redirect to the canonical page and POOF! All the duplicate page title issues went away.
I don't know your situation, but I have user-generated content, and sometimes the titles really were duplicates that required manual intervention to correct.
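For anyone wanting to try the redirect route, here is a rough Python sketch of the idea: work out the canonical form of the requested URL and issue a 301 whenever the request differs from it. The parameter names in IGNORED_PARAMS and the function names are made up for illustration, and how you hook this into your own server is up to you.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical: parameters that change display or tracking only, not content.
IGNORED_PARAMS = {"sessionid", "sort", "ref"}

def canonical_url(url):
    """Return the URL with a lower-cased host and the ignored parameters removed."""
    parts = urlsplit(url)
    kept = [(name, value)
            for name, value in parse_qsl(parts.query, keep_blank_values=True)
            if name.lower() not in IGNORED_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path,
                       urlencode(kept), ""))

def redirect_for(url):
    """Return (301, target) when the request is a variant, otherwise None."""
    target = canonical_url(url)
    if target != url:
        return (301, target)
    return None
```

Because the rule is computed from the URL itself, it should cover new pages as they are added without any per-URL work.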
About a year. Yep, they are the same.
They are all the same, and the canonical tag tells Google that the first one is the one.
incrediBILL, I thought about redirecting, BUT it's an ecommerce website with 1000s of products, and new products come in weekly. It would be very time consuming to redirect each URL. Unless, of course, there is a way to automate it?
Is it safe to assume that the parameters only change the behavior of the page for the customer, and you really don't want anything indexed besides "example.com/news"?
A really simple way to handle that with automation, if that's truly the situation, is to make any page with parameters that you don't want indexed simply include a META NOINDEX tag in the head of the page.
Google and all the other SEs will drop those redundant pages real fast.
Canonical problem solved, and there's none of the interference with the customer experience that redirects would cause in the URL scenario you presented.
Just make sure you don't accidentally NOINDEX the wrong pages ;)
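As a minimal sketch of that automation, assuming Python on the server side: look at the query string and emit the tag only when one of the parameters you don't want indexed is present. NOINDEX_PARAMS and robots_meta are hypothetical names, and the parameter list is a placeholder for whatever your site actually uses.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical parameter names; substitute the ones your own site uses.
NOINDEX_PARAMS = {"sort", "view", "page"}

def robots_meta(url):
    """Return a META NOINDEX tag for unwanted parameter URLs, else an empty string."""
    query = urlsplit(url).query
    names = {name.lower() for name, _ in parse_qsl(query, keep_blank_values=True)}
    if names & NOINDEX_PARAMS:
        return '<meta name="robots" content="noindex">'
    return ""
```

Keeping the list explicit is one way to avoid tagging the wrong pages: a parameter that is not in the list never triggers NOINDEX.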
Not sure if there could be a problem if the canonical tag contained a mixed case pagename at different times (eg FAQs.asp and faqs.asp).
I've just had to fix a self-inflicted fault on one site and thought of the above whilst doing it. The pagenames and their links that I installed were lower case, and someone else added links in mixed case. I understand Google treats them as different? I have now added a bit of code to lower-case all links and all canonical tags.
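The lower-casing step might look something like the sketch below. This is Python rather than the code actually used on the site, and both function names are invented for the example. It assumes a case-insensitive server; on a case-sensitive one, lower-casing the path could point at a file that doesn't exist.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

def lowercase_url(url):
    """Lower-case the scheme, host and path; leave the query string untouched."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path.lower(), parts.query, parts.fragment))

def canonical_tag(url):
    """Build a canonical link tag that always uses the lower-cased URL."""
    return '<link rel="canonical" href="%s">' % escape(lowercase_url(url), quote=True)
```

The query string is left alone here because parameter values may be case sensitive even when file names are not.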
The problem I was actually fixing may be of interest here:
I accidentally used the wrong domain name in the canonical tag: my-example.com instead of this-example.com. Found it when analyzing why the home page didn't show up, although the other pages did (all pages had the same canonical error).
When forced, Google showed the home page with OUR title and site but with a snippet from the wrong site. Odd: I would have expected it all to be wrong, not mixed like that. I think the other pages showed correctly because the canonical "targets" probably don't exist so my site's pages were parsed instead.
Yahoo/Bing - no home page problem. Could be they ignore cross-site canonicals.
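A wrong-domain slip like this could be caught with a small check that compares the host in the canonical tag against the host the site is meant to be served from. A rough Python sketch, with a made-up function name and the simplifying assumption that the two hosts must match exactly (so www and non-www count as different):

```python
from urllib.parse import urlsplit

def canonical_host_matches(canonical_href, site_host):
    """Return True when the canonical URL points at the expected host."""
    host = urlsplit(canonical_href).hostname or ""
    return host == site_host.lower()
```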
|Not sure if there could be a problem if the canonical tag contained a mixed case pagename at different times (eg FAQs.asp and faqs.asp). |
You must be on IIS because in Linux/Apache you get a "404 not found" if the file name is in the wrong case, as Linux is case sensitive.
I use upper / lower file names like "MyPage.html" but I'm always consistent and canonicalize it that way.
The best part is when stupid scrapers scrape my site. They are usually Windows scrapers that aren't aware the rest of the world is case sensitive, so they sometimes lowercase all the URLs, and then I catch it at the source as people start hitting my site with 404s from the flawed links.
Gotta love it ;)
|Yahoo/Bing - no home page problem. Could be they ignore cross-site canonicals. |
That's correct. Currently only Google supports cross-domain canonical links.
Bill - yep, IIS. Could never decide if casing was good or bad. I've worked on Linux as well as MS and it annoys me when I can't access a folder or filename because I've forgotten the rule or which letters are cased. :(
Tedster - thanks for confirming. :)
|Is it safe to assume that the parameters change the behavior of the page for the customer to use only and you really don't want anything indexed beside "example.com/news"? |
Would you recommend any method other than noindex?
How would you implement noindex on dynamic pages?
To implement NOINDEX on dynamic pages:
Compare your canonical URL for this page with the URL you will be returning. If they are identical, do not output NOINDEX; if they differ, output NOINDEX.
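In code, that comparison is only a few lines. Here is a sketch in Python; noindex_tag_for is a hypothetical name, and it assumes your page template already knows both the requested URL and the canonical URL as strings in the same form (same scheme, host and case).

```python
def noindex_tag_for(requested_url, canonical_url):
    """Return a NOINDEX tag when the requested URL is not the canonical one."""
    if requested_url == canonical_url:
        return ""  # canonical page: leave it indexable
    return '<meta name="robots" content="noindex">'
```

For example, a request for "http://example.com/news?sort=date" with a canonical of "http://example.com/news" gets the tag, while a request for the canonical URL itself gets nothing.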