
canonical tag on PDF

3:22 pm on Nov 2, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:Aug 10, 2010
posts: 45
votes: 0


Hi,

I know there is a post on the Google Webmaster forums about adding canonical tags to HTTP headers for PDFs, but I have been told this is only a solution when the download duplicates something that is also on the site.

I want to include some downloadable PDF white papers on our site that have already been published on the professor's own personal website.

I don't want to get hit with duplicate content issues.

Can anyone suggest anything?
3:51 pm on Nov 2, 2012 (gmt 0)

Moderator This Forum

WebmasterWorld Administrator ergophobe is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 25, 2002
posts:8438
votes: 210


I think the header solution can be used in your situation. For example, Google says one use case is when you are using a CDN and the hosts are different. Headers should work for you.
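For what it's worth, the mechanics are simple: the server sends a standard Link HTTP header along with the PDF. A minimal sketch, assuming Apache with mod_headers enabled and made-up file names:

# .htaccess - send a rel="canonical" Link header with one PDF (hypothetical paths)
<Files "white-paper.pdf">
Header add Link '<https://www.example.com/downloads/white-paper.pdf>; rel="canonical"'
</Files>

Googlebot fetching the PDF then sees the Link header and treats it like a rel="canonical" tag in an HTML head.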

You could also have the PDF listing page be noindex and exclude the PDFs themselves with robots.txt.
1:42 am on Nov 3, 2012 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10749
votes: 43


you can use the rel="canonical" link header for pdfs, and the canonical can point cross-domain.

Official Google Webmaster Central Blog: Supporting rel="canonical" HTTP Headers:
http://googlewebmastercentral.blogspot.com/2011/06/supporting-relcanonical-http-headers.html

Official Google Webmaster Central Blog: Handling legitimate cross-domain content duplication:
http://googlewebmastercentral.blogspot.com/2009/12/handling-legitimate-cross-domain.html
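putting those together for your case, the header your server sends with its copy of the pdf would point at the professor's original. with made-up urls, the response would carry:

Link: <https://www.professor-example.org/papers/widget-study.pdf>; rel="canonical"

on apache with mod_headers that's sent the same way as the sketch above. keep in mind google has described rel="canonical" as a hint rather than a directive, and the cross-domain post says the content at both urls should be identical or very similar.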


1:43 am on Nov 3, 2012 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10749
votes: 43


noindex and exclude the PDFs with robots.txt

the robots exclusion prevents the noindex from being seen.
7:18 pm on Nov 4, 2012 (gmt 0)

Moderator This Forum

WebmasterWorld Administrator ergophobe is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 25, 2002
posts:8438
votes: 210


the robots exclusion prevents the noindex from being seen.


Read my comment again. The suggestion was to:

- noindex the PDF *index* pages (i.e., the lists of PDFs or whatever s/he has). This keeps titles/snippets out of the SERPs, but the page can still get crawled.

- use robots.txt on the PDFs themselves so they don't get crawled at all.
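Concretely, with made-up paths, that combination looks like:

# robots.txt - keep the PDF files themselves from being crawled (hypothetical directory)
User-agent: *
Disallow: /white-papers/

plus, in the head of the listing page, a robots meta tag so it can be crawled but not indexed:

<meta name="robots" content="noindex">

One caveat: this is an alternative to the canonical header approach, not a companion to it. If the PDFs are blocked in robots.txt, Googlebot never fetches them and so never sees any Link header they would have sent.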
10:45 pm on Nov 4, 2012 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10749
votes: 43


my bad - i read that too fast - missed a couple words.
12:03 pm on Nov 5, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:Aug 10, 2010
posts: 45
votes: 0


Thanks :)
4:19 pm on Nov 5, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member jimbeetle is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Oct 26, 2002
posts:3295
votes: 9


I want to include some downloadable PDF white papers on our site that have already been published on the professor's own personal website.

Why do you see a need to do this? Why not just link to the professor's website?
 
