Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

    
Each page as its own canonical?
shaunm




msg:4520303
 11:05 am on Nov 19, 2012 (gmt 0)

Hi All!

How does using a page as its own canonical sound to you?

For instance:
example.com/product/p1

And the canonical tag is
<link rel="canonical" href="http://example.com/product/p1"/>

I am seeing this on almost all the pages of a website that I recently started working on.

Is there any particular reason for using 'each page as its own canonical'? It sounds so weird to me :D


Thanks for your help! :)

 

kheadley




msg:4520365
 2:07 pm on Nov 19, 2012 (gmt 0)

One benefit of doing this is protection against duplication. There was a post (http://dejanseo.com.au/hijacked/) on hijacking search results, and it said that one of the best defenses is to use rel-canonical to stop people from duplicating whole sites and posting them live.

Interesting read

klark0




msg:4520367
 2:17 pm on Nov 19, 2012 (gmt 0)

Not weird at all. WordPress SEO plugins do it by default. And as kheadley pointed out, it's valuable when a page gets copied or scraped wholesale.

g1smd




msg:4520398
 3:54 pm on Nov 19, 2012 (gmt 0)

<link rel="canonical" href="http://example.com/product/p1"/>

Good insurance.

When the page is requested as example.com/product/p1?somerandomjunk due to linking outside of your control the canonical metadata takes care of things.

deadsea




msg:4520403
 4:08 pm on Nov 19, 2012 (gmt 0)

I use this technique on my site. It really helps on development servers that end up getting crawled by Googlebot. If I have a canonical pointing to the live site on every page, then Google never starts sending traffic to staging servers. (They are usually behind a firewall, too, but sometimes we need to make them accessible for beta testers and such.)
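
A minimal sketch of that idea in Python, not taken from any poster's actual setup: whatever host served the request (staging, localhost), the canonical is rebuilt on the live host. `LIVE_SCHEME`, `LIVE_HOST`, and both function names are assumptions made up for illustration.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Assumed production origin; substitute your own live site.
LIVE_SCHEME = "http"
LIVE_HOST = "example.com"


def canonical_on_live_site(requested_url: str) -> str:
    """Re-root the requested path on the live host.

    Only the path is kept; the requesting host, port, query string,
    and fragment are all discarded.
    """
    parts = urlsplit(requested_url)
    return urlunsplit((LIVE_SCHEME, LIVE_HOST, parts.path or "/", "", ""))


def canonical_link_tag(requested_url: str) -> str:
    """Build the <link> element to place in the page head."""
    href = escape(canonical_on_live_site(requested_url), quote=True)
    return f'<link rel="canonical" href="{href}"/>'
```

Because the host is a fixed setting rather than something read from the request, a copy of the site running on a staging box still announces the live URL.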

It also helps with parameters that get added for no reason. So if http://example.com/page.html has itself as a canonical, and Googlebot finds it as http://example.com/page.html?session=xyzzy, the canonical takes care of the problem without your server even having to realize that there is a useless parameter on the URL.
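
For sites where some parameters do select content, one way to sketch this in Python is with an allow-list: anything not on the list is treated as junk and left out of the canonical. The `CONTENT_PARAMS` set and the function name are hypothetical; a real site would list its own meaningful parameters.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical allow-list of parameters that actually change the content.
CONTENT_PARAMS = {"id"}


def strip_junk_parameters(url: str) -> str:
    """Return the URL with every parameter outside the allow-list removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in CONTENT_PARAMS
    ]
    # The fragment is dropped as well; it never reaches the server anyway.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

With this sketch, both `?session=xyzzy` and a bare `?somerandomjunk` disappear from the canonical, while `?id=456` would survive.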

aristotle




msg:4520531
 8:46 pm on Nov 19, 2012 (gmt 0)

A couple of my sites have some old articles that haven't been touched in any way for at least five years. Since I haven't touched them, they don't have these self-referencing canonical tags. Over the years most of them have been scraped and republished at least a dozen times. Some of them have been republished on blogspot.com and wordpress.com, and a look at the source code of those copies reveals that they reference themselves as the canonicals, evidently because the WordPress and Blogspot software inserts the tags automatically. But despite this, Google still gives the top rankings to my pages, apparently having long ago marked them as the originals. So in this case, the false canonical tags on the scraped copies didn't trick the Google algorithm.

Most likely there are tens of thousands of old articles on the web that don't have these canonical tags, but Google apparently realizes this and takes it into account when trying to determine which pages are original and which are scraped copies.

shaunm




msg:4520650
 6:40 am on Nov 20, 2012 (gmt 0)

@kheadley, klark0

Thank you so much! I was thinking about all the possibilities of using a page as its own canonical, and I never thought about the scrapers and the out-of-control parameters.

@g1smd

Thanks :)

Good insurance.

When the page is requested as example.com/product/p1?somerandomjunk due to linking outside of your control the canonical metadata takes care of things.


Could you please tell me under what circumstances it goes out of control? Also, what if example.com/product/p1?somerandomjunk also has its own canonical tag?

To make it clear, I only have to use <link rel="canonical" href="http://example.com/product/p1"/> on the page example.com/product/p1?somerandomjunk, right? What if the page example.com/product/p1?somerandomjunk instead has the canonical <link rel="canonical" href="http://example.com/product/p1?somerandomjunk"/>?

I am asking this because, as I mentioned above each page acts as its own canonical as per how the CMS is customized or how it is done in the dev process. So in that case, it makes this worse right? I am confused. Thanks

@deadsea
Thanks a lot!

As mentioned in my above post, what if http://example.com/page.html?session=xyzzy has its own canonical as well?

Also, how does Google find it as something other than the actual page?

Thanks,


@aristotle
Thank you for answering my post. Being part of this community is a great learning experience.

Also, do you suggest that I remove all those canonical tags?


Thanks again.

Robert Charlton




msg:4520670
 9:09 am on Nov 20, 2012 (gmt 0)

I am asking this because, as I mentioned above each page acts as its own canonical as per how the CMS is customized or how it is done in the dev process. So in that case, it makes this worse right?

Yes, it would make certain problems much worse. I've run into such a "feature" on a CMS. I noticed that it was using localhost as the hostname when we were testing, and I was horrified at the implications.

Fortunately, there was an option not to use the auto-generated canonical tag... but to enter it manually, and that's what we did. I've got it on my list to discuss this further with the developer of the url module. We spoke briefly when we put the site online, but I don't think he fully got it.

g1smd




msg:4520674
 9:32 am on Nov 20, 2012 (gmt 0)

The canonical tag must point to the canonical URL for that content.

When a page is requested at an incorrect URL or a URL with appended junk, the canonical tag must not point to the requested URL. It must point to the canonical URL.

shaunm




msg:4520680
 10:31 am on Nov 20, 2012 (gmt 0)

Thank you Bob :) and Thank you @g1smd

g1smd




msg:4520685
 10:41 am on Nov 20, 2012 (gmt 0)

I store the canonical page URL as an entry in the database, so it gets pulled at the same time the page content is pulled.

Sometimes the canonical page URL is built from an article or product number (also stored in database) and the page title element (with punctuation stripped, changed to lower case, spaces to hyphens, etc).
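
A rough Python sketch of that second approach, under stated assumptions: the URL layout (`number-slug` under a fixed base) and both function names are made up for illustration, and the exact stripping rules may differ from what is described above.

```python
import re


def slugify(title: str) -> str:
    """Lower-case the title, strip punctuation, and turn spaces into hyphens."""
    lowered = title.lower()
    # Drop everything except ASCII letters, digits, whitespace, and hyphens.
    no_punctuation = re.sub(r"[^a-z0-9\s-]", "", lowered)
    # Collapse runs of whitespace or hyphens into a single hyphen.
    return re.sub(r"[\s-]+", "-", no_punctuation).strip("-")


def build_canonical_url(product_number: int, title: str,
                        base: str = "http://www.example.com") -> str:
    """Combine the stored product number and the slugged title (assumed layout)."""
    return f"{base}/{product_number}-{slugify(title)}"
```

Note that this sketch simply drops non-ASCII characters; titles in other scripts would need transliteration first.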

deadsea




msg:4520701
 11:51 am on Nov 20, 2012 (gmt 0)

http://example.com/product/p1?somerandomjunk should certainly have http://example.com/product/p1 as the canonical, or it might make things worse.

On my site http://example.com/page.html?session=xyzzy would canonicalize to http://example.com/page.html

g1smd




msg:4520702
 11:55 am on Nov 20, 2012 (gmt 0)

Remember too that the canonical URL for the root of the domain should end with a trailing slash.

URLs for folders or the index page in a folder should end with a trailing slash.

URLs for the index page (either root or in a folder) should not include the index file name or the extension.

URLs for pages should not end with a trailing slash and the extension is optional.

Consider too that if parameters are in use, the order is important. This should always be the same.

When there are multiple possible URLs for the same content, Google appears to prefer the shortest one.

Additionally, in order of preference:
www.example.com/p456
www.example.com/page-456.php
www.example.com/page?id=456
www.example.com/page.php?id=456

In particular use
www.example.com/?id=678
and NOT
www.example.com/index.php?id=678
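
Some of these rules can be applied mechanically. The Python sketch below covers three of them: a trailing slash on the root, no index file name, and a fixed parameter order. Sorting parameters alphabetically is just one way to keep the order constant, and the list of index file names in `INDEX_FILE` is an assumption; whether a given path is a folder or a page cannot be decided from the URL alone, so that rule is left out.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed index file names; adjust to whatever the server actually uses.
INDEX_FILE = re.compile(r"/index\.(?:php|html?|aspx)$", re.IGNORECASE)


def normalize_url(url: str) -> str:
    """Apply the root-slash, index-file, and parameter-order rules."""
    parts = urlsplit(url)
    # An empty path means the domain root, which should end with a slash.
    path = parts.path or "/"
    # /folder/index.php becomes /folder/
    path = INDEX_FILE.sub("/", path)
    # Sort parameters so the same set always yields the same URL.
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, query, ""))
```

For example, under these assumptions `http://www.example.com/index.php?id=678` normalizes to `http://www.example.com/?id=678`.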

[edited by: g1smd at 1:02 pm (utc) on Nov 20, 2012]

menntarra 34




msg:4520708
 12:24 pm on Nov 20, 2012 (gmt 0)

well, i prefer to use
link rel="prev" and
link rel="next"

on paginated pages.

shaunm




msg:4520712
 12:38 pm on Nov 20, 2012 (gmt 0)

@deadsea
Thanks!

@g1smd
Thanks!
URLs for the index page (either root or in a folder) should not include the index file name.
Oh is it?

To make myself clear
If there is a page such as example.com/product/index.aspx,
I only have to use the canonical 'example.com/product/', right? Also, is the extension optional?


Cheers!

g1smd




msg:4520713
 12:45 pm on Nov 20, 2012 (gmt 0)

For an index page, www.example.com/product/ is the canonical URL, omitting both page name and extension.

Make sure that is the form that you link to within your site, as well as the form noted in the rel=canonical metadata on the page.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved