Forum Moderators: Robert Charlton & goodroi

What Is The Current Thinking On Google & PDFs?

RedBar

3:23 pm on Mar 9, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



We've been using PDFs for decades, packed with intensive widget information and images, for both email and web page information.

Do you feel that G spiders them as an equal to a web page?

Can a pdf outrank the exact same web page?

How often does G search update a pdf, any ideas?

Have you found there are any specific dos and don'ts when creating a pdf for a website?

The only information I have found seems to be several years old. Is there any more recent guidance available?

Any other advice or questions?

Andy Langton

4:32 pm on Mar 9, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



My experience is that PDFs are worse in almost every respect, e.g. lower clickthrough rate, no title or meta data, worse user experience (especially as a direct visit from search and on mobile devices), and often untracked in traditional analytics.

I think this is to be expected, as PDFs are not a format designed for browsing on the web. Whenever I see a PDF that ranks decently well, I immediately recommend that a web-friendly version be created (typically with a link to the PDF as a "download" option).

>> How often does G search update a pdf, any ideas?

I haven't noticed any particular difference between PDFs and web pages in terms of crawl frequency - depends on links and how often the content changes, mostly.

Dimitri

6:49 pm on Mar 9, 2020 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month



From a "web user" point of view, I find PDFs really unfriendly. If I see two links, one to a web page and another to a PDF, I'll go with the web page.

As for indexing, I guess that PDFs might be harder for search engines to parse, again because of the more advanced page layout system. They do parse them, but I suspect it's a lot more resource intensive, so crawlers might not put the priority on indexing them.

iamlost

10:24 pm on Mar 9, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



While my sites are not as technical as I suspect RedBar’s are, I do have topics that, if they do not actually require them, certainly accrue value from primary sources such as research papers, which are often in PDF format.

Yes, I could, and do, link out to such; however, I prefer where possible to link in. Many such, at least in my niches, are willingly shared. Indeed the mere fact of my interest is often enough for years of continuous contact and further resources.

What I have always done is write the topic/page for the target audience (define as appropriate) as an overarching story, with quotes from and links to supporting PDFs et al. The anchor text describes the type and style of file, so those who want to dive deep can and those who shudder can ignore.

Initially I had such PDF files open to crawling, but over the years most if not all have been excluded via robots.txt. A business requirements decision.

It allows the best of both worlds: normal, easily digested conversational/informational/marketing HTML content out front and the more dense/technical PDF content in solid support.
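
For anyone wanting to try the robots.txt exclusion described above, here is a minimal sketch using Python's standard urllib.robotparser to check which URLs a crawler would be allowed to fetch. The /pdf/ directory, the example.com URLs and the is_crawlable helper are all assumptions made up for illustration; they are not taken from any site in this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: assumes all supporting PDFs live under /pdf/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdf/
"""


def is_crawlable(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the sample rules above allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the robots.txt content as an iterable of lines.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Illustrative URLs only.
    for url in (
        "https://example.com/widgets/overview.html",
        "https://example.com/pdf/research-paper.pdf",
    ):
        status = "crawlable" if is_crawlable(url) else "excluded"
        print(f"{url}: {status}")
```

The sketch blocks a whole directory rather than a file extension because urllib.robotparser only does simple path-prefix matching, so keeping the PDFs in one folder makes the rule easy to test locally.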

iamlost

10:36 pm on Mar 9, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I see that I did not actually answer the topic title question...

In my experience, both as webdev and searcher, Google has little if any problem with PDFs. In my query returns I frequently see appropriate PDFs; also images from PDFs are shown in image searches.

Rather, my concern is the visitor/customer, as each browser has an idiosyncratic approach to PDF rendering, which can be interesting, even problematic, especially on mobile.

tangor

6:13 am on Mar 10, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Both formats work for me ... both have slightly different purposes ... and both are enjoyed by the USERS that get the difference (and appreciate it).

PDF has no apparent indexing problem(s).

G, some time back, said they could index PDF ... I ceased to worry about it from that time to present. If things have changed I haven't heard about it.