Search Engines Agree on "Canonical tag" - Google Search and SEO forum at WebmasterWorld - WebmasterWorld

Forum Moderators: Robert Charlton & goodroi


Search Engines Agree on "Canonical tag"


youfoundjake

3:44 am on Feb 13, 2009 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

Not sure where to put this, but since google search is moderated, a mod will put it where necessary.

Today it was announced that the 3 big search engines have come up with a new tag to help with canonical issues.
Announcements:
[googlewebmastercentral.blogspot.com ]
[ysearchblog.com ]
[blogs.msdn.com ]

Using the new canonical tag
Specify the canonical version using a tag in the head section of the page as follows:
<link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish"/>
That’s it!
You can only use the tag on pages within a single site (subdomains and subfolders are fine).
You can use relative or absolute links, but the search engines recommend absolute links.
This tag will operate in a similar way to a 301 redirect for all URLs that display the page with this tag.
Links to all URLs will be consolidated to the one specified as canonical.
Search engines will consider this URL a “strong hint” as to the one to crawl and index.
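For anyone generating the tag from a script, here is a minimal Python sketch of the recommendation above to use absolute links. The function name canonical_link_tag and the /shop/ page URL are made up for illustration; it only resolves a relative href against the page URL and HTML-escapes the result, nothing more.

```python
from html import escape
from urllib.parse import urljoin


def canonical_link_tag(page_url: str, href: str) -> str:
    """Build a <link rel="canonical"> element with an absolute href.

    page_url is the URL of the page the tag will sit on (hypothetical here);
    href may be relative or absolute and is resolved against page_url.
    """
    absolute = urljoin(page_url, href)
    # Escape &, <, > and quotes so the URL is safe inside an HTML attribute.
    return '<link rel="canonical" href="%s" />' % escape(absolute, quote=True)


print(canonical_link_tag("http://www.example.com/shop/",
                         "/product.php?item=swedish-fish"))
```

Note that an ampersand in a query string comes out as &amp; inside the attribute, which is the correct HTML form.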

< See also Canonical Tag Results: Share the stories - Positive / Negative / No Impact [webmasterworld.com] >

[edited by: tedster at 6:08 pm (utc) on April 2, 2009]

fsmobilez

5:52 pm on Feb 17, 2009 (gmt 0)

10+ Year Member

And yes, an update on my question: I can add separate link tag code in the header for category and post pages.

Receptional Andy

5:55 pm on Feb 17, 2009 (gmt 0)

Incidentally, did anyone else notice that Google's own example "trusted tester" (given in their blog) is showing both the canonical and non-canonical URL in search results? I admit, my confidence in this feature took a bit of a hit on seeing that!

mcavic

7:55 pm on Feb 17, 2009 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

This is going to go right over the head of 99% of webmasters.

What's so hard about it? You have multiple URLs showing similar content. You want the users to have access to the multiple URLs, but you want search engines only to index one of them. So you stick a tag on each page pointing the SE's to the official URL.

tedster

8:19 pm on Feb 17, 2009 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

So should I remove the robots rule and add this new tag?

I would add the new tag as a backup (it will help in other types of canonical issues, too) but definitely would NOT remove the existing robots.txt rule or .htaccess rules that are already in place.

For one thing, you will save google some spidering cycles. For a second, the canonical tag is taken as a "hint" and not as an ironclad rule. Your own server configuration is still a more solid solution than this new tag - especially because it is still new and a bit of an unknown quantity in practice.

pageoneresults

8:41 pm on Feb 17, 2009 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

What's so hard about it? You have multiple URLs showing similar content. You want the users to have access to the multiple URLs, but you want search engines only to index one of them. So you stick a tag on each page pointing the SE's to the official URL.

If we can get everyone participating who has questions to understand that simplistic explanation, maybe we can cut the questions from 50 to 10 or so? :)

signor_john

8:45 pm on Feb 17, 2009 (gmt 0)

Sorry, but it's a lot more fun to speculate about the search engines' hidden motives for providing such a useful tool.

Let's see: The big three search engines came up with this together. Dang! I've figured it out. They're planning a merger! Quick, everybody, to the GOOG, Microsoft, and Yahoo corporate forums! :-)

[edited by: signor_john at 8:48 pm (utc) on Feb. 17, 2009]

Gemini23

8:46 pm on Feb 17, 2009 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

"What's so hard about it? You have multiple URLs showing SIMILAR content....." I have a website about Green Widgets... most of my content is 'similar' :) but each page is also unique, or do we mean essentially "the 'SAME' content"?

[edited by: Gemini23 at 8:48 pm (utc) on Feb. 17, 2009]

mcavic

9:29 pm on Feb 17, 2009 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

You have multiple URLs showing SIMILAR content

By similar, I meant redundant as far as the search engine is concerned. In other words, you know which pages are useful to the SE, and which ones are essentially duplicate content.

drall

9:41 pm on Feb 17, 2009 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

So if I have

http://www.example.com/sub/sub/index.html

and want to make sure this is the one the engines choose instead of

http://www.example.com/sub/sub/

then I insert this

<link rel="canonical" href="http://www.example.com/sub/sub/index.html"/>

into my pages?

aakk9999

11:51 pm on Feb 17, 2009 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

So, if I have URLs

http://www.example.com/
http://www.example.com/Default.aspx

which have identical content, and I want http://www.example.com/ to be the page chosen by the search engines, would I include the forward slash at the end of the URL in the new tag? I.e., would the new tag that I put in the head section of the http://www.example.com/Default.aspx page be:

a) <link rel="canonical" href="http://www.example.com/" />

or

b) <link rel="canonical" href="http://www.example.com" />

Thanks.

tedster

1:01 am on Feb 18, 2009 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

That's what I'd choose, aakk9999.

I've seen poorly executed (keyword fluffing) url rewrite schemes that can also benefit from this. For instance, the server might deliver the same content for these two urls:

example.com/21/categoryname/location/page
example.com/21/anyoldstring/location/page

Also some rewrites are not particular about the order of the virtual folder names, and now they won't need to be:

example.com/21/categoryname/location/page
example.com/21/location/categoryname/page

mcavic

1:39 am on Feb 18, 2009 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

<link rel="canonical" href="http://www.example.com/sub/sub/index.html"/>
a) <link rel="canonical" href="http://www.example.com/" />

Yes to both.

drall

3:34 am on Feb 18, 2009 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Thanks mcavic.

fsmobilez

10:45 am on Feb 18, 2009 (gmt 0)

10+ Year Member

Thanks, tedster, for answering my 2nd question.

But what about my first question: how can I implement this code in my dynamic site?

Let's say my site URLs are like this:

www.example.com/category/post.php?post_id=1&cat=1
www.example.com/category/post.php?post_id=2&cat=1
www.example.com/category/post.php?post_id=1&cat=2
www.example.com/category/post.php?post_id=2&cat=2
www.example.com/category/post.php?post_id=1&cat=100

and so on
www.example.com/category/post.php?category.php?cat_id=1
www.example.com/category/post.php?category.php?cat_id=2

and so on

My site also generates dynamic links; for a post it generates links like this:

www.example.com/category/post.php?post_id=2&cat=1&mostviews
www.example.com/category/post.php?post_id=2&cat=1&random
www.example.com/category/post.php?post_id=2&cat=1&mostemailed

and same for categories

So will you please help me add this code to the header?

Should I add code like this:

<link rel="canonical" href="http://www.example.com/category" />

Or how? Will you please post the exact code I can add to my site header?

tedster

4:43 pm on Feb 18, 2009 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

It sounds like, for each of those three urls, you would want this tag:

<link rel="canonical" href="http://www.example.com/category/post.php?post_id=2&cat=1">

... the href attribute's value begins with the front part of the url but the unwanted parameter is dropped. I say "sounds like" because I'm naturally not familiar with your site, so I can't be sure that these extra parameters are actually duplicating the same content - that would be your job or your team's job to ensure.

By the way, I thought you were blocking those urls in robots.txt anyway. If so, any problem should already be handled and googlebot won't be requesting those urls anyway.
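If the tag is generated by a script, dropping the unwanted parameter could look roughly like the Python sketch below. It assumes that only post_id and cat affect what post.php outputs (that is a guess about the site, as noted above), and canonical_url is a made-up helper name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: only these parameters change the content of post.php.
CONTENT_PARAMS = ("post_id", "cat")


def canonical_url(url: str, keep=CONTENT_PARAMS) -> str:
    """Return url with every query parameter not listed in keep removed."""
    parts = urlsplit(url)
    # keep_blank_values=True so bare flags such as "&mostviews" are parsed
    # as parameters (and then filtered out below).
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(name, value) for name, value in pairs if name in keep]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), ""))


for extra in ("mostviews", "random", "mostemailed"):
    print(canonical_url(
        "http://www.example.com/category/post.php?post_id=2&cat=1&" + extra))
```

All three variants print the same URL, which is the value that would go into the href of the canonical tag on each of those pages.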

kentdavidson

12:03 am on Feb 19, 2009 (gmt 0)

10+ Year Member

Do we all agree that fixing canonical issues "for real" is better than using this tag? I think this is bad news for anyone who has fixed all canonical issues on their site. More SE exposure for the lazy competition.

@Tonearm, @pageoneresults, @wheel: You can never "fix" canonical for real for a site. Take any site URL, and add

http://www.example.com/?utm_source=whatever&ovcpn=whatever

And it delivers the same content as

http://www.example.com/

Many, many PPC and SEO tracking solutions use this type of tracking, and the search engines *may* treat those duplicate inbound links as attempts at duplicate content, since the extra parameters don't change the page (in most cases).

The technical solution which works is the solution that the 3 engines came up with: declare on the page the "parameters" that the page actually pays attention to, i.e. the ones that change its output (e.g. ?article=54).

The solution is simple and easy to implement.

The only way to truly "fix canonical issues" for real (as you like) is to check all incoming query string parameters, and if any invalid ones are found, issue a 404, or a 3xx and redirect to the correct page.

However, that would make most of my search marketing and SEO customers cringe as it kills most of how they track things, including inbound referrers (depending on the 3xx code.)
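For what it's worth, the strict server-side check described above could be sketched in Python like this. It is only an illustration: VALID_PARAMS and check_request are made-up names, and the assumption is that article is the only parameter the page really uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: "article" is the only parameter the page pays attention to.
VALID_PARAMS = {"article"}


def check_request(url: str):
    """Return (status, location) for an incoming request URL.

    200 means serve the page as requested; 301 means redirect to location,
    which is the same URL with the unrecognised parameters stripped.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(name, value) for name, value in pairs if name in VALID_PARAMS]
    if len(kept) == len(pairs):
        return 200, url
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path,
                        urlencode(kept), ""))
    return 301, clean


print(check_request("http://www.example.com/?utm_source=whatever&ovcpn=whatever"))
print(check_request("http://www.example.com/?article=54"))
```

The first request is redirected to the bare URL, which is exactly the step that throws away the tracking parameters.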

Cheers.

Edited: Reply to page one

tootricky

12:21 pm on Feb 19, 2009 (gmt 0)

10+ Year Member

So has anybody else started any tests yet? Google has yet to index the changes to my pages so I'm still waiting for some action.

As far as I can see this technique could be used to spam the hell out of bookmarking sites. Hmmmm I might test that too. :P

fsmobilez

4:20 pm on Feb 19, 2009 (gmt 0)

10+ Year Member

Ok tedster, thanks a lot for your help.

I will go for the robots rule only.

wheel

4:58 pm on Feb 19, 2009 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

@wheel: You can never "fix" canonical for real for a site.

I don't have any canonical problems I need to fix. The search engines may have problems. Thus, they should fix them, not me.

Really, the entire issue is a non-problem. Hardly anyone has these kinds of problems that can't be fixed through either rearranging your site, using robots or htaccess files. And most sites don't even have the need for even any of that. Anyone outside that box, well, we're looking at one in a million. It's a hammer looking for a nail.

Nevertheless, I'm not disputing the technical aspect of it, any more than I dispute the technical aspect of the nofollow tag. What I am arguing is that this is not what webmasters want to be doing long term. It's good for the SE's at the potential long term expense of us. We (well, not me - it's youse guys) are building the next MS monopoly with this kind of behavior.

All that being said, I somehow doubt that this is actually going to help anyone's ranking noticeably, or make enough of a difference that couldn't be 'fixed' by some more backlinks. There's no way this is a panacea for helping people to rank. And if it's not, what the heck, you need more work to do or something :)?

wheel

5:04 pm on Feb 19, 2009 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

And as tootricky notes, I bet there's some very interesting possibilities here. I'm not super technical, but this smells like potential worse problems could be created (well, not for me, for the SE's) by the darker side. Then what are the SE's going to do? I know...change the interpretation/implementation midstream, to suit their best interests, without regard to your best interests.

Haven't we been down this road already?

pageoneresults

5:18 pm on Feb 19, 2009 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

You can never "fix" canonical for real for a site. Take any site URL, and add...
http://www.example.com/?utm_source=whatever&ovcpn=whatever

Wouldn't you just do...

<link rel="canonical" href="http://www.example.com/" />

Ya, there are all sorts of neat things you can add onto a URI string and still have it resolve to the destination page. I've seen all sorts of tricks in this area over the years. ;)

I'll keep me eye on its usage and see how others are doing in the process. It's still way too early for me to jump on the bandwagon. We typically manage ours at the server level so it is not relevant for us. Although I may find myself looking at its use when I need to do something quickly while we get the proper solution in place. :)

Canonical Name
The actual name of a resource.

kentdavidson

7:37 pm on Feb 19, 2009 (gmt 0)

10+ Year Member

@wheel said:

I don't have any canonical problems I need to fix. The search engines may have problems.

I completely agree, actually. I think the sensible approach is to continue going about your business, and consider this option only when necessary. (And by, "when necessary", I mean never.)

And again, from a technical standpoint, the engines can use cryptographic checksums (e.g. md5, sha1) on page content to automatically deduplicate content.

Also, they should be "smart" enough to know that:

http://www.example.com/Default.asp =
http://www.example.com/default.asp =
http://www.example.com/ =
http://www.example.com/DeFaUlT.AsP =

I mean, what does 6 billion in profit get you these days?

Of course, if you have dates/times, or random images, it gets trickier.
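To make the checksum idea concrete, here is a minimal Python sketch. It is not a claim about how any engine actually works; content_fingerprint and group_duplicates are made-up names, and the page bodies are dummy data.

```python
import hashlib


def content_fingerprint(body: bytes) -> str:
    """Return a SHA-1 hex digest of the raw page body."""
    return hashlib.sha1(body).hexdigest()


def group_duplicates(pages: dict) -> list:
    """Group URLs whose bodies are byte-for-byte identical.

    pages maps URL -> body bytes; only groups with more than one URL
    are returned.
    """
    groups = {}
    for url, body in pages.items():
        groups.setdefault(content_fingerprint(body), []).append(url)
    return [urls for urls in groups.values() if len(urls) > 1]


pages = {
    "http://www.example.com/": b"<html>same page</html>",
    "http://www.example.com/Default.asp": b"<html>same page</html>",
    "http://www.example.com/other": b"<html>different page</html>",
}
print(group_duplicates(pages))
```

An exact hash only matches byte-identical output, so a timestamp or a rotating image in the page is enough to defeat it, which is the "trickier" case.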

Receptional Andy

7:47 pm on Feb 19, 2009 (gmt 0)

I mean, what does 6 billion in profit get you these days?

It buys you sophisticated systems for filtering duplicates out of results.

If you serve the same content via multiple URLs, search engines will choose which one to display in results. Sometimes, this results in a choice that webmasters don't like - particularly if search engines don't consolidate things like link popularity to a single URL. So, a site owner can end up with several weak URLs instead of one strong one - so-called duplicate content problems.

Search engines don't really care - it's very rare to see duplicate content in a single result set these days. This option (like redirects) is a way for you to choose a preferred URL, as opposed to letting search engines do it for you - sometimes to your disadvantage. This isn't you helping them out, but the reverse.

Whether this element actually works or is a good implementation is a different question, but this isn't like nofollow.

wheel

7:58 pm on Feb 19, 2009 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

but this isn't like nofollow.

It's exactly like nofollow. Nofollow was for your benefit too. As tedster pointed out in another thread to debunk conspiracy theories :) there's old threads around here where people used to talk about how great nofollow would cut out spam. When was the last time someone mentioned spam and nofollow in the same breath? Nofollow is discussed primarily around paid links and pr sculpting now.

I appreciate I sound like tfh guy and appreciate there's some validity to that perception. But I disagree that this tag can't be distorted exactly like nofollow was. The initial characteristics are identical.

In any event, I seem to be able to rank without using any of this stuff, even on sites that have all sorts of potential canonical issues and duplicate content.

Receptional Andy

9:01 pm on Feb 19, 2009 (gmt 0)

It's exactly like nofollow

I understand where you're coming from. I have a distaste for any of the proprietary elements introduced by search engines. And for sure, SEO-types will try to figure out a way to (ab)use any element that impacts on search engines.

But for me, this is intended to fix a webmaster's problem - whereas nofollow is to fix a search engine's problem.

Tonearm

6:56 pm on Feb 20, 2009 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

The only way to truly "fix canonical issues" for real (as you like) is to check all incoming query string parameters, and if any invalid ones are found, issue a 404, or a 3xx and redirect to the correct page.
However, that would make most of my search marketing and SEO customers cringe as it kills most of how they track things, including inbound referrers (depending on the 3xx code.)

Just check the user agent. I don't think that would be considered cloaking would it?

Receptional Andy

8:11 pm on Feb 20, 2009 (gmt 0)

I don't imagine you would run into problems if you just redirected crawlers with that. But it's easy enough to both redirect and retain tracking - that's what cookies are for. Although not all analytics systems will support this (enterprise ones should!).

Doing so makes tracking more effective too - no bookmarked tracking URLs to skew statistics. One item of content per URL is a good model for reasons other than just search engines ;)
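As a rough Python illustration of redirecting while keeping the tracking data in cookies: the parameter names are taken from the example URL earlier in the thread, and redirect_with_tracking is a made-up helper that only builds the response headers, not a real framework API.

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit

# Assumption: these parameters exist only for tracking and never change content.
TRACKING_PARAMS = {"utm_source", "ovcpn"}


def redirect_with_tracking(url: str):
    """Return (status, headers) for a redirect, or None if none is needed.

    Tracking parameters are moved out of the URL into Set-Cookie headers,
    and the visitor is sent to the same URL without them.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    tracking = [(n, v) for n, v in pairs if n in TRACKING_PARAMS]
    if not tracking:
        return None
    kept = [(n, v) for n, v in pairs if n not in TRACKING_PARAMS]
    location = urlunsplit((parts.scheme, parts.netloc, parts.path,
                           urlencode(kept), ""))
    headers = [("Location", location)]
    for name, value in tracking:
        # Percent-encode the value so it is safe inside a cookie.
        headers.append(("Set-Cookie", "%s=%s; Path=/" % (name, quote(value, safe=""))))
    return 301, headers


print(redirect_with_tracking(
    "http://www.example.com/?utm_source=whatever&ovcpn=whatever"))
```

The analytics system then has to read the cookies on the clean URL, which is the part not every package supports.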

g1smd

2:23 am on Feb 23, 2009 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

www.example.com/category/post.php?post_id=1&cat=1

Don't forget that this URL would also work:

www.example.com/category/post.php?cat=1&post_id=1

g1smd

2:25 am on Feb 23, 2009 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

I see the main use of this tag being:

1. for removing session IDs and affiliate IDs for bots, and

2. removing stray parameters, and forcing a strict order for wanted parameters.

I would still use redirects in .htaccess to fix the majority of canonical issues.
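A small Python sketch of point 2, for illustration only: PARAM_ORDER is an assumed list of the wanted parameters in their one allowed order, and normalize is a made-up name. Anything not on the list (session IDs, affiliate IDs, stray parameters) is dropped.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: the wanted parameters, in the single order we allow.
PARAM_ORDER = ["post_id", "cat"]


def normalize(url: str) -> str:
    """Drop unlisted parameters and emit the rest in PARAM_ORDER."""
    parts = urlsplit(url)
    # If a parameter is repeated, the last value wins.
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    ordered = [(name, params[name]) for name in PARAM_ORDER if name in params]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(ordered), ""))


print(normalize("http://www.example.com/category/post.php?cat=1&post_id=1&sid=abc123"))
print(normalize("http://www.example.com/category/post.php?post_id=1&cat=1"))
```

Both calls print the same URL, so either ordering ends up pointing at one canonical form, whether that value is used in the tag or as a redirect target.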

lfgoal

2:13 pm on Feb 23, 2009 (gmt 0)

10+ Year Member

Top Contributors Of The Month

Could this be used on a blog hosted at blogspot? Years ago, I began a blog and when I began to acquire links I asked linkers to use www (www.somesite.blogspot.com instead of somesite.blogspot.com). Now, of course, most of the spontaneous links to the site are in the non-www format and I sometimes wonder if I goofed myself by doing this, though the blog does quite well in its niche for all its keywords.

This 137 message thread spans 5 pages.
