
Google SEO News and Discussion Forum

Canonical issues re dupe content in PDFs?

 2:02 pm on Oct 14, 2010 (gmt 0)

I'm currently working on a site that has a landing page for each article. Each landing page contains an abstract and a link to the full article, which is a PDF uploaded to the site. The abstracts are generally the first paragraphs of the full article. Since Google indexes PDFs, I assumed that duplicate content issues are as relevant for them as for any other web page. Is this incorrect? I'm thinking of implementing a canonical link element on the articles themselves. What are your thoughts on this?

Thanks, all!



 9:18 pm on Oct 14, 2010 (gmt 0)

To be clear, do the HTML pages contain only an abstract and not the full article?


 9:03 am on Oct 15, 2010 (gmt 0)



 10:03 am on Oct 15, 2010 (gmt 0)

I don't think it is a major concern.

However, I don't like visitors arriving directly at a PDF because there is no obvious navigation back to the rest of the site. They view one file and leave.

I would disallow the PDF version URLs in robots.txt (all the PDFs will be in one folder, or folder tree), post the full text as an HTML page, and prominently link to the PDF version from there.
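
As a rough sketch of that robots.txt rule, here is one way to check it before uploading, using Python's standard `urllib.robotparser`. The `/pdfs/` and `/articles/` folder names and the `www.example.com` URLs are assumptions for illustration only; substitute whatever folder actually holds the PDFs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical layout: every PDF sits under /pdfs/ and each article
# also has an HTML page under /articles/. Neither path is from the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdfs/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text in memory, without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_crawlable(parser: RobotFileParser, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# The PDF copy should be blocked, the HTML copy should stay crawlable.
print(is_crawlable(parser, "https://www.example.com/pdfs/article-1.pdf"))
print(is_crawlable(parser, "https://www.example.com/articles/article-1.html"))
```

Bear in mind that a Disallow rule only stops compliant crawlers from fetching the files. It does not stop visitors from opening the PDF through the link on the HTML page.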
