I know there is a post on the Google webmaster forums about adding canonical tags to HTTP headers for PDFs, but I have been told this is only a solution for a download of something that is also on the site.
I want to include some downloadable PDF white papers on our site that have already been published on the Prof's own personal website.
I don't want to get hit with duplicate content issues.
I think the header solution can be used in your situation. For example, Google says one use case is when you are using a CDN and the hosts are different, so a canonical that points to another host is allowed. Headers should work for you.
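To make the header idea concrete, here is a minimal sketch of the `Link` header your server would send along with each PDF. The function name and the URL are made up for illustration; only the `<url>; rel="canonical"` format is the part Google documents. How you attach the header depends on your server or framework, which this sketch does not cover.

```python
from typing import Tuple


def canonical_link_header(canonical_url: str) -> Tuple[str, str]:
    """Build the (name, value) pair for an HTTP Link header that
    declares canonical_url as the canonical copy of the response."""
    return ("Link", f'<{canonical_url}>; rel="canonical"')


# Hypothetical URL standing in for the professor's original copy.
name, value = canonical_link_header(
    "https://prof.example.edu/papers/white-paper.pdf"
)
print(f"{name}: {value}")
```

The copy on your site would send this header, pointing at the original on the professor's site.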
You could also make the PDF listing page noindex and exclude the PDFs with robots.txt.
The robots exclusion prevents the noindex from being seen.
Read my comment again. The suggestion was to:
- noindex the PDF *index* pages (i.e. the lists of PDFs or whatever s/he has). This is to keep titles/snippets out of the SERPs, but the page can still get crawled.
- use robots.txt on the PDFs themselves so they don't get crawled at all.
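The two-part setup above can be sketched and sanity-checked as follows. The directory layout (`/whitepapers/` for the listing page, `/whitepapers/pdf/` for the files) and the domain are assumptions for illustration only. Note that Python's standard `urllib.robotparser` does plain prefix matching, so the sketch disallows a directory rather than relying on wildcard patterns.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block crawling of the PDF directory only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /whitepapers/pdf/
"""

# Sent with the listing page only. That page must stay crawlable,
# otherwise the noindex directive is never seen.
LISTING_PAGE_HEADER = ("X-Robots-Tag", "noindex")


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt lets any crawler fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("*", url)


# Listing page: crawlable, so its noindex header can be read.
print(is_crawlable("https://www.example.com/whitepapers/"))
# PDF itself: blocked from crawling.
print(is_crawlable("https://www.example.com/whitepapers/pdf/paper.pdf"))
```

The point of the check is that the noindex and the robots.txt exclusion apply to different URLs, so neither hides the other.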