|Support For Legitimate Cross-Domain Content Duplication|
Support For Legitimate Cross-Domain Content Duplication [googlewebmastercentral.blogspot.com]
|For some sites, there are legitimate reasons to duplicate content across different websites — for instance, to migrate to a new domain name using a web server that cannot create server-side redirects. To help with issues that arise on such sites, we're announcing our support of the cross-domain rel="canonical" link element. |
The above was posted by John Mueller of Google on the Official Google Webmaster Blog on December 15, 2009. Mueller is careful to emphasize that rel="canonical" is not a substitute for a 301 redirect if you are able to do a 301.
|But if a 301 redirect is impossible for some reason, then a rel="canonical" may work for you. |
|While the rel="canonical" link element is seen as a hint and not an absolute directive, we do try to follow it where possible. |
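For anyone who hasn't used it yet, the element itself is a single line in the head of the duplicate page, pointing at the preferred URL on the other domain. Here is a minimal sketch in Python that renders it. The helper name canonical_link and the example.org URL are my own assumptions for illustration, not anything taken from the Google post.

```python
from html import escape


def canonical_link(preferred_url: str) -> str:
    """Render a rel="canonical" link element pointing at the preferred URL.

    Hypothetical helper: the URL is HTML-escaped so that characters such
    as "&" in a query string do not break the attribute value.
    """
    return '<link rel="canonical" href="{}">'.format(
        escape(preferred_url, quote=True)
    )


# Assumed example: the duplicate lives on another domain and the
# preferred copy lives on www.example.org.
tag = canonical_link("http://www.example.org/widgets?id=1&lang=en")
print(tag)
```

The output goes inside the head section of the duplicate page. Since Google treats it as a hint, this only states a preference; it does not move visitors the way a 301 does.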
There's a companion post, from October 06, 2009, about reunifying duplicate content within your website [googlewebmastercentral.blogspot.com], which Mueller recommends reading first.
In both posts, he emphasizes that, in lieu of a 301, the rel="canonical" link is better than blocking the page:
|One item which is missing from this list is disallowing crawling of duplicate content with your robots.txt file. We now recommend not blocking access to duplicate content on your website, whether with a robots.txt file or other methods. |
Read the fine print carefully to see which blocking methods you shouldn't use for the kind of duplication you're fixing. Blocking the page conflicts with the rel="canonical" link element, because a crawler that cannot fetch the page never sees the element.
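One way to spot that conflict on your own site is to check whether any duplicate URL that carries a rel="canonical" is also disallowed in robots.txt. This is a rough sketch using Python's standard urllib.robotparser. The robots.txt rules, the example.com URLs, and the function name blocked_duplicates are assumptions made up for the example.

```python
from urllib.robotparser import RobotFileParser


def blocked_duplicates(robots_txt, duplicate_urls, user_agent="Googlebot"):
    """Return the duplicate URLs that robots.txt forbids user_agent to crawl.

    Hypothetical helper: a URL in the result cannot be fetched, so any
    rel="canonical" placed on that page would never be read.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in duplicate_urls
            if not parser.can_fetch(user_agent, url)]


# Assumed robots.txt that blocks a printer-friendly directory.
robots = """\
User-agent: *
Disallow: /print/
"""

# Assumed duplicate pages, each carrying a rel="canonical" element.
dupes = [
    "http://www.example.com/print/page1.html",
    "http://www.example.com/page1.html?sessionid=42",
]

conflicts = blocked_duplicates(robots, dupes)
print(conflicts)
```

Anything this returns is a page where the block and the canonical hint are working against each other.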
Any comment on whether this opens vulnerabilities for abuse?
... more reason not to allow HTML in user-generated content
It's a certainty that possible exploits are being investigated as we post. We'll soon see if the months of waiting for Google to support the cross-domain canonical tag gave them the time they needed to plug the holes well.
One thing we do know: the canonical redirect must point to a URL with "substantially similar" content in order to kick in. That eliminates a bunch of potential trouble.
|more reason not to allow HTML in user-generated content |
the canonical link element is in the head.
how will ugc affect this?
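To make the head vs. body point concrete, here is a small sketch that separates canonical links found inside the head from ones that show up in the body, which is where user-submitted markup would normally land. It uses Python's standard html.parser. The class name, the sample page, and the URLs are assumptions for illustration, and it relies on explicit head tags being present. How any particular search engine treats a canonical link found outside the head is not something this code can tell you.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect rel="canonical" hrefs, split by whether they sit in <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.head_canonicals = []
        self.body_canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link":
            attr = dict(attrs)
            rels = (attr.get("rel") or "").lower().split()
            if "canonical" in rels and attr.get("href"):
                # File the href by where the element was found.
                if self.in_head:
                    self.head_canonicals.append(attr["href"])
                else:
                    self.body_canonicals.append(attr["href"])

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


# Assumed page: one legitimate canonical in the head, one injected
# through a comment field in the body.
page = """<html><head>
<title>Widgets</title>
<link rel="canonical" href="http://www.example.org/widgets">
</head><body>
<div class="comment"><link rel="canonical" href="http://spam.example.net/"></div>
</body></html>"""

finder = CanonicalFinder()
finder.feed(page)
print(finder.head_canonicals)
print(finder.body_canonicals)
```

A non-empty body_canonicals list on a page that accepts user markup would be worth a closer look.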
|the canonical redirect must point to a URL with "substantially similar" content |
it is important to note that the canonical link element is NOT a redirect.
quoting the linked JM post:
|Keep in mind that we treat rel="canonical" as a hint... |
I think it should be said that if ranking both copies is your goal, run away now.
Are Google making the Microsoft mistake of introducing "features" without consulting with other relevant parties?
I'm still furious at the mess of 404 requests in my access logs caused by Microsoft's ingenious favicon.ico link squatting "feature".