| This 38 message thread spans 2 pages: < < 38 ( 1  ) || |
|Why Are There "Canonical Disasters" - Is Google Messing Up?|
| 3:50 am on Aug 19, 2010 (gmt 0)|
In the past week or so I've read several accounts around the web of so-called "canonical disasters." We also had several recent threads here describing problems that seemed to be caused by the canonical link - including one where there was an accidental ".com.com" so the canonical tag read example.com.com. And com.com resolves any wildcard subdomains, so that messed with this member's indexing and ranking.
I don't get it. This tag was introduced a year and a half ago and it should be relatively straightforward, if Google follows through on what they originally said:
|Is it okay if the canonical is not an exact duplicate of the content? |
We allow slight differences, e.g., in the sort order of a table of products. We also recognize that we may crawl the canonical and the duplicate pages at different points in time, so we may occasionally see different versions of your content. All of that is okay with us.
Is rel="canonical" a hint or a directive?
It's a hint that we honor strongly.
So what kind of "slight difference" allows a deep internal page to be canonicalized to the home page - with totally different content? Or to a page on com.com? Google is supposed to treat the tag as a "strong hint" only if the content has only slight differences, and IGNORE it otherwise, right?
I've used the canonical link tag with no apparent problems, and in some cases it put an easy band-aid on a nasty infrastructure knot. But now I'm reading some SEO blogs that warn against serving the canonical link on the "original" URL. How could that be a problem? For smaller websites without many infrastructure assets to work with, surely it's an easy way to say "don't let any backlinks mess with this URL by adding query strings, playing around with case, double slashes, etc, etc."
Are these articles "crying wolf" when canonical link problems are rare and most everyone is having smooth sailing? Or are lots of people really getting into trouble with their canonical links?
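For anyone who wants to catch a ".com.com" style slip before a crawler does, here is a minimal sketch in Python. It only compares the host in the canonical link with the host the page is served from. The function name canonical_host_problem and the example.com URLs are made up for illustration, and it assumes you never intend to canonicalize across hosts.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel and attr.get("href"):
            self.hrefs.append(attr["href"])


def canonical_host_problem(page_url, html):
    """Return a warning string, or None if the canonical host looks fine."""
    finder = CanonicalFinder()
    finder.feed(html)
    if not finder.hrefs:
        return None  # no canonical link, nothing to check
    if len(finder.hrefs) > 1:
        return "more than one canonical link"
    page_host = urlsplit(page_url).hostname
    canon_host = urlsplit(finder.hrefs[0]).hostname
    if canon_host is None:
        return None  # relative href, so it stays on the page's own host
    if canon_host != page_host:
        return f"canonical host {canon_host} differs from page host {page_host}"
    return None
```

Run over every page before publishing, a check like this would flag a canonical that reads www.example.com.com on a page served from www.example.com. It says nothing about whether the canonical path is right, only the host.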
| 4:50 am on Aug 24, 2010 (gmt 0)|
|That has nothing to do with my point which was webmasters can make canonical mistakes more than one way... |
mine was also an implied reference back to Sgt_Kickaxe's point in the post preceding yours:
|search engines do not trust webmaster input |
the mistaken signals that directly affect the visitor, such as 301s and internal links, should be treated/trusted differently than mistaken signals that are purely signals for the SE, such as the link rel=canonical.
it's easy to ignore a link element.
it's essentially impossible to ignore a 301 since there is no other possibly overriding signal.
| 6:29 am on Aug 24, 2010 (gmt 0)|
Timely thread, because I've begun the lengthy process of manually adding the canonical tag to every single static page of my website...
I use complicated httpd.conf rules to rewrite/301 to cruft-free URLs. I also implemented measures to 301 double, triple, quadruple, etc. slashes. This is all in addition to the non-www to www 301 that is permanently in place...
Even so, I've recently discovered a way that content can be resolved with a .html fudge... i.e. the same cruft free content can be displayed if I type...
I could probably figure out a way to fix this by further complicating my already complex and resource intensive httpd.conf directives and rules... but in the end, I opted to just make it easy on my server by adding the canonical thingy to each page...
... but it's time consuming, because for every page I'm adding it to, I'm checking what I typed, re-checking, and then checking again.
What a mess the Internet is.
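To make the normalization described above concrete, here is a rough Python sketch of the two rules that were spelled out: collapsing repeated slashes and sending non-www to www. It is not the poster's httpd.conf, just an illustration of the logic. The function name canonical_url is hypothetical, and it assumes every URL passed in belongs to one site with no port or login in the host part.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url):
    """Return the URL that a request for `url` should be 301'd to."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host.startswith("www."):
        host = "www." + host  # non-www to www
    # Collapse double, triple, quadruple... slashes in the path.
    path = re.sub(r"/{2,}", "/", parts.path) or "/"
    # The query string is kept as-is; the fragment is dropped because
    # it is never sent to the server anyway.
    return urlunsplit((parts.scheme, host, path, parts.query, ""))
```

If canonical_url(url) comes back different from the URL requested, that is a request the server rules should be redirecting. The same value is also what the canonical link on the page would carry.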
| 7:29 am on Aug 24, 2010 (gmt 0)|
After many discussions, I feel a lot of canonical link tag problems are coming from two factors that compound each other:
1. Some webmasters are overly concerned about duplicate content and try to avoid any "duplicate content penalty". The essential points that are being missed here:
   a) there is no "duplicate content penalty"
   b) duplicate content is not just a few sentences from one page, repeated on another.
2. Google is NOT requiring that the canonical URL show at most "slight differences" from the URL used to access the page. They seem to be applying the canonical hint far too often, even when the "strong hint" given by the webmaster is ludicrous.
In my opinion, what should have been a simple and elegant solution has been turned into a potential pitfall. That's a damn shame, and Google can fix it by dialing back on how they interpret "slight difference".
| 8:08 am on Aug 24, 2010 (gmt 0)|
@tedster Yes, we saw a big drop in rankings and loss of traffic.
It could of course have been something else.
| 11:15 pm on Aug 24, 2010 (gmt 0)|
I have to agree with you there. I think a lot of people overanalyze their websites a bit and do not realize that the algo gives sites some play on duplicate content. If you have one page or similar pages with 2 or 3 URLs, Google will figure it out in due time if your site has a clear linking structure.
| 10:05 pm on Aug 30, 2010 (gmt 0)|
I think "Canonical Catastrophe" has a better ring :)
| 8:12 am on Sep 2, 2010 (gmt 0)|
|HTML suggestions |
We didn't detect any content issues with your site.
Yes, you can cure the situation and undo a canonical disaster.
Had a major mishap that I didn't notice on a site that caused thousands of canonical issues.
It took lots of redirects and canonical link tags to sort it out, plus almost 2 months of waiting.
After putting all the pieces in place I watched it drop from about 2K problems to 900+, then 400+, then much to my dismay back up to 900+, then down to 200+ errors, then 74, then ZERO.
TBH, there were probably tens of thousands of errors based on the mistake made, but Google's WMT doesn't seem to sort out the HTML errors terribly fast on that large a scale, so by the time it should have found and reported more errors they were already fixed.
I did nothing during that period after implementing the changes, just watched Google do its thing. Google does its thing very slowly, so be patient.
| 8:59 am on Sep 2, 2010 (gmt 0)|
I have added a new page to a hand-coded mini-site - I copied the home page (because of the heading, footer, nav) and changed the copy and meta. I completely missed the canonical tag, which had been copied from the home page and was hence pointing to the home page. I noticed this some time later and corrected the canonical.
However, in the meantime Google crawled this new page, and if I look at its cache, the cache for this page displays the home page. So Google took the canonical as more than just a "strong hint" despite the page content differences.
I am waiting now to see if the page will get cached with its own true content and how long it will take to recover.
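A copied-template slip like this is easy to test for on a small hand-coded site. Below is a minimal Python sketch, assuming every page on the site is meant to be its own canonical (which won't hold for sites that deliberately point duplicates elsewhere). The function name copied_canonicals and the example URLs are hypothetical, and the input is assumed to be a dict you have already built, mapping each page URL to the canonical href found in its HTML.

```python
from urllib.parse import urljoin


def copied_canonicals(pages):
    """List (page_url, canonical_target) pairs where the two differ."""
    suspects = []
    for page_url, href in pages.items():
        # Resolve relative hrefs against the page's own URL.
        target = urljoin(page_url, href)
        if target != page_url:
            suspects.append((page_url, target))
    return suspects


site = {
    "http://www.example.com/": "http://www.example.com/",
    "http://www.example.com/new-page.html": "http://www.example.com/",
}
print(copied_canonicals(site))
```

Here the new page is reported because its canonical still points at the home page. The comparison is a plain string match, so the URLs in the dict need to be written the same way the canonical links are.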