
Canonical issues re dupe content in PDFs?

     

Hannahness

2:02 pm on Oct 14, 2010 (gmt 0)

5+ Year Member



I'm currently working on a site that has a landing page for each article. Each landing page carries an abstract and a link to the full article, which is a PDF uploaded to the site. The abstracts are generally the first paragraphs of the full article. Since Google indexes PDFs, I assumed that duplicate content issues are as relevant to them as to any other web page. Is this incorrect? I'm thinking of implementing a canonical link element on the articles themselves. What are your thoughts on this?
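A PDF has no HTML head to hold a canonical link element, so the closest equivalent is an HTTP `Link` response header with `rel="canonical"` sent along with the PDF. Below is a minimal sketch of building that header value. The function name and the example.com URL are made up for illustration, and whether a given search engine honours the header for PDFs is something to confirm in its own documentation.

```python
from urllib.parse import urlsplit


def canonical_link_header(canonical_url: str) -> str:
    """Build the value of an HTTP Link header that names canonical_url
    as the canonical version of the resource being served."""
    parts = urlsplit(canonical_url)
    # A canonical target should be an absolute http(s) URL.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("canonical URL must be an absolute http(s) URL")
    return f'<{canonical_url}>; rel="canonical"'


# Hypothetical landing page for one article; not a URL from the thread.
value = canonical_link_header(
    "https://www.example.com/articles/sample-article.html"
)
print("Link: " + value)
```

The server would send this header with the PDF response, pointing at whichever URL should be treated as the preferred version.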

Thanks, all!

tedster

9:18 pm on Oct 14, 2010 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



To be clear, do the HTML pages contain only an abstract and not the full article?

Hannahness

9:03 am on Oct 15, 2010 (gmt 0)

5+ Year Member



Exactly.

g1smd

10:03 am on Oct 15, 2010 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



I don't think it is a major concern.

However, I don't like visitors arriving directly at a PDF because there is no obvious navigation back to the rest of the site. They view one file and leave.

I would disallow the PDF URLs in robots.txt (all the PDFs would be in one folder or folder tree), post all the text as an HTML page, and prominently link to the PDF version from there.
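As a rough sketch of the robots.txt approach, the snippet below holds a two-line rule set and checks it with Python's standard `urllib.robotparser`. The `/pdf/` folder name, the example.com URLs, and the `is_crawlable` helper are assumptions for illustration; the thread does not say where the PDFs actually live.

```python
from urllib.robotparser import RobotFileParser

# Assumed layout: every PDF sits under /pdf/ (hypothetical folder name).
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdf/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url: str, user_agent: str = "*") -> bool:
    """Return True if the rules above allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


# The PDF is blocked; the HTML version of the article stays crawlable.
print(is_crawlable("https://www.example.com/pdf/sample-article.pdf"))         # False
print(is_crawlable("https://www.example.com/articles/sample-article.html"))  # True
```

Keeping all the PDFs under one folder means a single `Disallow` line covers them, which is the point of the one-folder suggestion. Note that a disallow rule stops crawling of those URLs; it is not by itself a guarantee that the URLs never appear in an index.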
 
