Is there any harm in having a canonical tag on a page if there are no alternative urls?
(Not that I would do this intentionally, but I mean if one is included in error)
2:36 pm on Mar 18, 2011 (gmt 0)
I do this intentionally with no problem - and I think it's a best practice (not at all an error) to return a canonical link, even if it uses the exact URL that was originally requested.
Why would I think it's a best practice? Because there are too many possibilities [webmasterworld.com] for non-canonical URLs and I can't always foresee or control all of them.
Besides, even if the requested url IS the canonical url, what's the harm in confirming that? The canonical element is not EQUAL to a 301 redirect, but it does get treated "like" a 301 redirect when it comes to assigning link equity. So you are not setting up any infinite loop possibility, the way a true 301 redirect would.
For reference, here's a video from Matt Cutts [youtube.com] announcing the canonical element in 2009. At 17 minutes in, Matt confirms that it's fine for a URL to "point to itself" using the canonical element.
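A minimal sketch of the self-referencing approach, in Python. The function name canonical_link is hypothetical, and the rule that query strings and fragments never change the page content is an assumption that will not hold for every site. The point is only that every variant of a URL, including the canonical URL itself, emits the same link element.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link(requested_url: str) -> str:
    """Build a <link rel="canonical"> element for whatever URL was requested."""
    parts = urlsplit(requested_url)
    # Assumption: query strings and fragments (session IDs, tracking
    # parameters) do not change the content, so they are dropped.
    canonical = urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path or "/", "", "")
    )
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}">'


# A stray parameter and the clean URL both yield the same element.
print(canonical_link("http://www.example.com/widgets?sessionid=123"))
print(canonical_link("http://www.example.com/widgets"))
```

When the requested URL is already the canonical one, the element simply points at itself, which is the case Matt confirms is fine.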
[edited by: tedster at 3:04 pm (utc) on Mar 18, 2011]
2:54 pm on Mar 18, 2011 (gmt 0)
Thanks Tedster. That was exactly my thought process; I try to analyze all the possible URL paths, but with thousands of pages it's a big task.
3:26 pm on Mar 18, 2011 (gmt 0)
I've been thinking about implementing this as a best practice as well. We recently had a misconfigured firewall, and a dev version of the site was accidentally exposed to the outside world. Googlebot started crawling it pretty heavily. So bob-dev.example.com was serving a duplicate of www.example.com.
We fixed the firewall and thought about other layers of protection. We considered changing the robots.txt file on the dev servers to "Disallow: /" but rejected that because we don't want that setting to accidentally end up on the live site. Putting a canonical tag on every page would point Googlebot back to the real thing if it did start crawling a copy of the site, and it doesn't seem as risky as a custom robots.txt file.
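One way to sketch that idea in Python: build the canonical URL from a fixed production host rather than from the host that happened to serve the request. The constants and the function name canonical_for are hypothetical, the example path is made up, and keeping the query string is an assumption (some sites would want to drop or filter it).

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the live site is served only from this scheme and host.
CANONICAL_SCHEME = "http"
CANONICAL_HOST = "www.example.com"


def canonical_for(requested_url: str) -> str:
    """Return the canonical URL, ignoring the host that served the request."""
    parts = urlsplit(requested_url)
    # Keep path and query, drop the fragment, and pin scheme and host.
    return urlunsplit(
        (CANONICAL_SCHEME, CANONICAL_HOST, parts.path or "/", parts.query, "")
    )


# A page served from the exposed dev copy still names the live URL.
print(canonical_for("http://bob-dev.example.com/products/blue-widget"))
```

Because the same code runs on dev and live, there is no per-environment setting that could leak onto the live site the way a dev-only robots.txt could.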
3:33 pm on Mar 18, 2011 (gmt 0)
I've been doing this for a while on multiple sites; suffered no ill effects that I can see.
3:59 pm on Mar 18, 2011 (gmt 0)
Would you put canonicals on pages where you also have 'noindex'?
7:22 pm on Mar 18, 2011 (gmt 0)
Can't hurt - so I would.
7:49 pm on Mar 18, 2011 (gmt 0)
So you are not setting up any infinite loop possibility, the way a true 301 redirect would
Not for the visitor, but it seems that you could create what Google would see as a cyclic redirect with incorrectly set canonical tags: [webmasterworld.com]
I am not sure what this would do with regard to indexing of the offending page or its position in the SERPs, though.
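A rough sketch of how such a cycle could be caught in a site audit, in Python. The dictionary of page URL to declared canonical is assumed to come from your own crawl, and find_canonical_loop is a hypothetical helper. It says nothing about how Google itself reacts to a loop.

```python
def find_canonical_loop(canonicals, start):
    """Follow canonical declarations from start; return the loop or None."""
    seen = []
    url = start
    while url in canonicals:
        target = canonicals[url]
        if target == url:
            # A page pointing to itself is a confirmation, not a loop.
            return None
        if url in seen:
            # We have come back to a URL already visited: report the cycle.
            return seen[seen.index(url):]
        seen.append(url)
        url = target
    return None


# Hypothetical crawl result: two pages that name each other as canonical.
crawl = {
    "http://www.example.com/a": "http://www.example.com/b",
    "http://www.example.com/b": "http://www.example.com/a",
    "http://www.example.com/c": "http://www.example.com/c",
}
print(find_canonical_loop(crawl, "http://www.example.com/a"))
print(find_canonical_loop(crawl, "http://www.example.com/c"))
```

The first call reports the two URLs that form the cycle; the second returns None because a self-referencing canonical is the harmless case discussed earlier in the thread.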
12:02 am on Mar 19, 2011 (gmt 0)
I have a question - does anyone suspect that Google, on encountering a canonical in the head section, does not bother with the rest of the page? Or do you think it still goes through the rest of the page and harvests on-page links etc.?
I am wondering because if they treat it internally as a 301 redirect, there could be a possibility that they do not bother past the head section.
I know MC said they reserve the right not to follow the canonical, which is a "strong hint" rather than a directive, but from various reports on testing this tag, it appears they would canonicalise a page even when its content is totally different from the content at the URL set in the canonical tag.
2:53 am on Mar 19, 2011 (gmt 0)
I'm almost 100% certain that they index all the html - and then the algo decides what to do with it.
I agree that they are much too rigid in applying the canonical as a "strong hint", but I have seen a few cases where Google ignored the canonical tag and indexed the content at the original URL. Now that cross-domain canonicals are also supported, I assume that such indexing would be more common in those cases.