Is it valid HTTP-wise to redirect a PDF to an HTML file?


1script - 9:44 pm on Feb 27, 2013 (gmt 0)


Thanks, Andy. No, I would also redirect Google. I have no desire to have the PDF files rank by themselves - only the pages they are linked from. In fact, I suspect that having indexable PDFs is now hurting the rankings of the HTML pages they are linked from - the keywords the PDFs rank for are almost invariably the PDF's title, which is also the anchor text used to link to it. In other words, since the text with that title is actually on the HTML page but sits inside a link pointing elsewhere, Google must be "thinking" that elsewhere is better than right here.

Anyhow, the issue of ranking aside, like I said, I don't really want them to rank, and if I only returned X-Robots-Tag: noindex, they could still be downloaded in a drive-by fashion. I want to eliminate the whole notion of using these PDF files as standalone sources of information. In other words, there should be only one way to get them - visit the HTML page and download by clicking the link on it.
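For reference, here is a rough sketch of the two approaches being weighed, written for an .htaccess file in the document root. It is only an illustration under assumptions: the host example.com, the docs/ directory, and the idea that each PDF has an HTML page with the same base name are all hypothetical, and mod_headers and mod_rewrite are assumed to be enabled.

```apache
# Option 1: keep serving the PDFs but ask search engines not to index them.
# Requires mod_headers.
<FilesMatch "\.pdf$">
    Header set X-Robots-Tag "noindex"
</FilesMatch>

# Option 2: redirect PDF requests that do not come from the site's own pages.
# Requires mod_rewrite. An empty Referer also fails the pattern below,
# so no-referrer requests are redirected as well.
RewriteEngine On
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
# Hypothetical mapping: /docs/name.pdf -> /docs/name.html
RewriteRule ^docs/(.+)\.pdf$ /docs/$1.html [R=301,L]
```

One caveat with the second option: some legitimate visitors send no Referer at all (privacy settings, some proxies), so they would be redirected back to the HTML page even after clicking the link on it.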

I was actually thinking specifically of the technical issue that you alluded to: the PDFs are served as application/pdf and not text/html. In fact, I think I should check whether I have Apache set up with "ForceType application/pdf", because their main use was for printing and I thought it wouldn't make much sense to have them open in the browser.
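A minimal sketch of what such a setup might look like, assuming it lives in an .htaccess file or a directory block and that mod_headers is available. Note that ForceType only sets the Content-Type; the Content-Disposition header is what usually asks the browser to download rather than display, so it is included here as an assumption about the intent, not as something known to be in the existing config.

```apache
<FilesMatch "\.pdf$">
    # Serve every matching file as a PDF regardless of other type mappings.
    ForceType application/pdf
    # Ask the browser to save the file instead of opening it inline.
    # Requires mod_headers.
    Header set Content-Disposition "attachment"
</FilesMatch>
```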

Perhaps my caffeine is taking way longer than usual to kick in today, but I am thinking a browser would "expect" a binary file - how will it handle text/html being served instead? Actually, no, it would be served a 301 redirect HTTP header and will probably not follow it. Just like you cannot really redirect an image to a page - all the browser will do is show an error within the square defined for the picture (if the size is known).

Anyway, like I said, I'm at the beginning of my research on the subject, and I'm probably not making much sense at this point, so I would appreciate it if you pointed out any holes in my logic.

Oh, and on the subject of converting from one way of handling no-referrer PDF requests to another, I should probably take that to the Google forum - I do want to proceed with extreme caution and would like to avoid bringing down the house in the process.

Thank you for your input!


Thread source: http://www.webmasterworld.com/apache/4549593.htm