Google spiders PDF files. You only need to look at how many of them appear in the results to see that Google likes them, so there is definitely value in them.
So you think it would be worth getting some PDFs built?
Does it read the content of the PDF? If so, does it possibly count this as duplicate content?
|I'm sure I remember reading and speaking with people about how Google rated sites better that offered informative PDF files |
That's a new one to me. Why would Google prefer a PDF to an HTML page?
I'm curious about PDFs too. I've just sent for the software so I can write my own. I think they are excellent for information, diagrams or anything else that needs to print out at an exact size.
It makes more sense to build PDFs because they meet the needs of you and your visitors than just to rate better with Google.
I am curious though if you could be penalized if you put duplicate information on a PDF. Sometimes it's appropriate to give visitors the option of being able to save or print out the PDF version. Some visitors will just want to glance at the HTML version, but others who want to file the info for future reference will prefer the PDF.
> I am curious though if you could be penalized
> if you put duplicate information on a PDF.
There is at least one instance where it's reasonably common to have the HTML page and, although formatted slightly differently, nearly the same textual info in multiple formats for download: DOC, PDF, DOSTXT.
The TXT is quite helpful to the lower-speed dial-up visitors.
The for-download files have no menu bar, no duplication of graphics from the HTML page, no HTML coding, an extra URL or two reminding the reader where the printout came from, and some other minor changes not relevant to search engines but useful to the visitors and the organization.
This practice extends back near on two years and if it's ever caused a ding, it hasn't caused an obvious one.
Gonna give it a try.
|Why would Google prefer a PDF to an HTML page? |
In my opinion white papers stand less chance of being spammy. It is raw content that has not been tweaked to improve rank or positioning. Raw content.
As far as a duplicate-content slap on the hand goes, I don’t think it applies to PDFs either, because duplicate content is usually an HTML-side tactic to steal traffic from a competitor, increase clicks on advertisements, etc. A PDF is just that: raw content, usually meant for one purpose and one purpose only.
I have a client that has thousands of original text documents [white papers] in PDF, and a lot of those documents rank very well. So do their HTML pages, though…
As long as your website content matches the content theme within your PDF documents, you stand a better chance of ranking them well. On the flip side, it is important to optimize your PDF for conversion and for getting the reader to click through to your website. New PDFs we are developing have active forms in them, and I am curious how they will generate leads/contacts.
Usability and why I hate PDFs!
PDFs are so not user friendly! Users generally hate to download a PDF because it uses an external application to view the document. Usability experts agree that it interrupts the user's flow of information.
Interesting thread… Has got me thinking now. ;)
Is it possible to do a noindex nofollow tag on PDFs? I'd prefer to have visitors come in through an HTML page on my site, then open the PDF. If they just find the PDF through a search engine they would be less likely to find and bookmark the site, and there would be no ad click or affiliate revenue. I put a lot of work into this site and would like to get a little back.
|Is it possible to do a noindex nofollow tag on PDFs? |
If you don't want your PDF files indexed, you can protect a directory via .htaccess or robots.txt.
If you are using PDFs to gain exposure in the SERPs, a well optimized PDF that is well branded and offers interactive links back to your main website will do the trick.
As far as noindex nofollow goes, I don't think it is possible, or at least not that I can see in the most recent version of Acrobat Pro.
The new changes certainly indicate that Google's algo has created a niche for important PDFs at the top of the SERPs.
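For anyone who wants to try the robots.txt route, here is a minimal sketch of how you might check a rule before uploading it. It assumes the PDFs live in a directory called /pdf/ and uses example.com URLs; the directory name, the URLs and the helper `is_crawlable` are all made up for illustration, and the check only uses Python's standard `urllib.robotparser`, not anything Google-specific.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that asks all crawlers to stay out of /pdf/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdf/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Illustrative URLs only; swap in your own paths.
    for url in (
        "https://www.example.com/pdf/whitepaper.pdf",
        "https://www.example.com/articles/whitepaper.html",
    ):
        print(url, "->", "crawlable" if is_crawlable(ROBOTS_TXT, url) else "blocked")
```

With a rule like this the HTML page stays open to crawlers while the PDF directory is off limits. Keep in mind that robots.txt is only a request that well-behaved spiders honor; it is not access control the way an .htaccess password is.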