Forum Moderators: Robert Charlton & goodroi


Google indexing .pdf content vs. .aspx content


y0z2a

10:10 am on Sep 24, 2008 (gmt 0)

10+ Year Member



All,

I am after a little advice or thoughts on my question below:

After some initial work on a client's site that had not been indexed in Google for a few months, it has now been indexed and is starting to rank competitively under its search terms.

The objective is to get the content indexed and ranking as competitively as possible, as the content is unique and highly relevant.

Will a .pdf perform as well as an .aspx page?

My thoughts:

My understanding is that a .pdf 'should' perform just as well as an .aspx page if the content is good. The reality is that I don't see this in my day-to-day browsing, research and searching online. Everything but PDFs tops the SERPs on all but the most specific of searches (the exception is normally scientific or industrial material).

My thought process is:
a) To convert the copy from the .pdf into .aspx (or equivalent) content pages
b) To then give the user the opportunity to download the content as a .pdf at the end of each .aspx page
c) To possibly restrict access to the .pdf through either the sitemap or robots.txt. This then raises the question - does Google penalise the "duplication" of content between an .aspx (or other) page and a downloadable .pdf version of the same content?

What are people's views on the course of action I have suggested here? I can see a number of pitfalls in this approach - mainly that there is likely to be an awful lot of work in converting 50 .pdfs into 50 .aspx pages that will not necessarily perform any better than the .pdfs did. Then again, the PDF content could perform very well independently regardless, giving a two-pronged attack? *EDIT* And the content may be seen as duplicated between the .aspx and the .pdf?

All the best,

/y0z2a

rainborick

1:36 pm on Sep 24, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



.pdf files rarely perform well in the search engines because they lack much of the semantic mark-up and document linking capabilities that are available in X/HTML documents - which is what .aspx pages output. Your idea to use .aspx/HTML pages and make the .pdf versions available for download sounds good, and it is a good idea to store them in a directory that's blocked in your robots.txt file just for insurance. The truly paranoid part of me would also suggest adding rel="nofollow" to any links to those .pdf documents, to further ensure that if one of those .pdf files is ever accidentally indexed (or happens to be indexed already), it will never be chosen as the canonical version.
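In case a sketch helps, the setup described above could look something like this - the /downloads/ directory and whitepaper.pdf filename are just placeholders, not anything from the original site:

```
# robots.txt - keep crawlers out of the directory holding the .pdf copies
User-agent: *
Disallow: /downloads/
```

And the download link on the .aspx page, with the nofollow belt-and-braces:

```html
<!-- rel="nofollow" as extra insurance against the .pdf being indexed -->
<a href="/downloads/whitepaper.pdf" rel="nofollow">Download this article as a PDF</a>
```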