tedster - 3:50 am on Aug 19, 2010 (gmt 0)
In the past week or so I've read several accounts around the web of so-called "canonical disasters." We also had several recent threads here describing problems that seemed to be caused by the canonical link - including one where an accidental ".com.com" meant the canonical tag read example.com.com. And com.com resolves any wildcard subdomain, so that messed with that member's indexing and ranking.
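A typo like that is easy to catch before it ships. Here's a minimal sketch in Python, assuming you can get at the rendered HTML and you know which hostname the page is supposed to live on. The names CanonicalFinder, canonical_host_problems and expected_host are made up for illustration; only the standard library is used.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> found in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel can hold several space-separated tokens; compare case-insensitively.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.hrefs.append(attr["href"])


def canonical_host_problems(html, expected_host):
    """Return canonical hrefs whose hostname differs from expected_host.

    expected_host is an assumption of this sketch: the one hostname you
    intend every canonical link on the site to point at.
    Relative hrefs carry no hostname, so they are not flagged.
    """
    finder = CanonicalFinder()
    finder.feed(html)
    problems = []
    for href in finder.hrefs:
        host = urlsplit(href).hostname
        if host is not None and host.lower() != expected_host.lower():
            problems.append(href)
    return problems


page = '<head><link rel="canonical" href="http://www.example.com.com/page.html"></head>'
print(canonical_host_problems(page, "www.example.com"))
```

Run against a template's output as part of a deploy check, this would flag the ".com.com" case. It says nothing about how Google treats the tag; it only confirms that the tag says what you meant it to say.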
I don't get it. This tag was introduced a year and a half ago and it should be relatively straightforward, if Google follows through on what they originally said:
Is it okay if the canonical is not an exact duplicate of the content?
We allow slight differences, e.g., in the sort order of a table of products. We also recognize that we may crawl the canonical and the duplicate pages at different points in time, so we may occasionally see different versions of your content. All of that is okay with us.
Is rel="canonical" a hint or a directive?
It's a hint that we honor strongly.
So what kind of "slight difference" allows a deep internal page to be canonicalized to the home page - with totally different content? Or to a page on com.com? Google is supposed to treat the tag as a "strong hint" only when the content has slight differences, and IGNORE it otherwise, right?
I've used the canonical link tag with no apparent problems, and in some cases it put an easy band-aid on a nasty infrastructure knot. But now I'm reading some SEO blogs that warn against serving the canonical link on the "original" URL. How could that be a problem? For smaller websites without many infrastructure assets to work with, surely it's an easy way to say "don't let any backlinks mess with this URL by adding query strings, playing around with case, double slashes, etc."
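To make that concrete, here's a rough sketch in Python of the kind of self-referencing canonical I mean: every variant of a URL gets the same tag pointing at one clean form. The helper names canonical_url and canonical_link_tag are hypothetical, and the rules are assumptions that only suit a site where query strings carry no content and paths are case-insensitive. Don't use it as-is on a site where ?id=123 selects the page.

```python
import html
import re
from urllib.parse import urlsplit, urlunsplit


def canonical_url(requested_url, lowercase_path=True):
    """Reduce a requested URL to the single form we want indexed.

    Assumptions of this sketch: the query string and fragment never
    change the content, and the server treats paths case-insensitively.
    """
    parts = urlsplit(requested_url)
    # Collapse runs of slashes in the path ("//widgets" -> "/widgets").
    path = re.sub(r"/{2,}", "/", parts.path) or "/"
    if lowercase_path:
        path = path.lower()
    # Drop the query string and fragment; hostnames are case-insensitive.
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, "", ""))


def canonical_link_tag(requested_url):
    """Build the <link> element to place in the page's <head>."""
    href = html.escape(canonical_url(requested_url), quote=True)
    return '<link rel="canonical" href="%s">' % href


print(canonical_link_tag("http://www.Example.com//Widgets/Blue.html?ref=abc"))
```

The "original" URL run through the same function produces a tag that points at itself, which is exactly the case those blogs warn about.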
Are these articles "crying wolf" when canonical link problems are rare and almost everyone is having smooth sailing? Or are lots of people really getting into trouble with their canonical links?