Forum Moderators: Robert Charlton & goodroi
If you want to publish documents on the web, put them in a web format.
Which means you can bet that Adobe is trying to find ways to do exactly what the OP asks; it's just a matter of time.
Meanwhile, the only hope for the rest of us is that Google buys Adobe and makes it free - before M$ buys Adobe and succeeds in making it 100% restricted!
[edited by: Quadrille at 1:54 pm (utc) on July 23, 2007]
If the "View as HTML" version looks bad, but the content seems useful enough to be worth viewing as a PDF, I will probably open it in Adobe. It's so slow to load, though, that it's a rare day when that happens.
About the only thing I use PDFs for is emailing certain documents (invoices, resumes, contracts, etc.).
[edited by: Gibble at 2:12 pm (utc) on July 23, 2007]
That would be a major contribution to information distribution on the Internet!
A little background for my question. The idea is to offer an eBook for free on the website. It's got more than 50 pages, therefore I thought a pdf might be more appropriate than an html document (ok, I might split it into 50 html documents, but who will read 50 html pages? IMO potential readers will want to print it).
The whole story behind this ebook is that it's supposed to act as linkbait (well, reading your comments, I might have to reconsider this ;).
The site owner is OK with giving this ebook away for free, but in return he wants to get traffic to his website. So my idea was to let search engines index the PDF, but redirect any human visitor who tries to download it from outside the website, especially from the Google SERPs, to the page of the website that offers the download (well, that's kind of cloaking, but I think it's no issue since it's not deceptive for the user).
That's also why I thought it could be a good idea to kill the "View as HTML" link in the Google SERPs, which would be a killer for the website's traffic. Am I totally wrong?
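For what it's worth, the redirect rule described above could be sketched roughly like this in Python. Everything named here is an assumption for illustration: the host `www.example.com`, the landing page `/ebook.html`, and the list of crawler tokens are all hypothetical, and checking the User-Agent and Referer headers is only a heuristic, since both can be missing or spoofed.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse

# Hypothetical values, for illustration only.
SITE_HOST = "www.example.com"
LANDING_PAGE = "/ebook.html"
CRAWLER_TOKENS = ("googlebot", "slurp", "msnbot")


def is_crawler(user_agent: str) -> bool:
    """Heuristic check: does the User-Agent contain a known crawler token?"""
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def response_for_pdf_request(
    user_agent: str, referer: Optional[str]
) -> Tuple[int, Optional[str]]:
    """Return (status, redirect_location) for a request to the PDF.

    Crawlers and visitors coming from the site itself get the PDF (200);
    everyone else is redirected (302) to the download page.
    """
    if is_crawler(user_agent):
        return 200, None
    referer_host = urlparse(referer).hostname if referer else None
    if referer_host == SITE_HOST:
        return 200, None
    return 302, LANDING_PAGE
```

Note that a visitor with no Referer header at all (a bookmark, a link in an email, a privacy proxy) is treated as external in this sketch and gets redirected too.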
If that is the motivation, then I would suggest providing it as HTML pages (50 of them if necessary) and having a link (appropriately excluded from robots) to the PDF for printing purposes.
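As a rough illustration of "excluded from robots", here is a small Python check using the standard library's `urllib.robotparser`. The robots.txt content and the URLs (`/downloads/ebook.pdf`, `/ebook/chapter-1.html` on `www.example.com`) are made up for the example; it only shows that a rule like this blocks the PDF for well-behaved crawlers while leaving the HTML pages crawlable.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the site, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /downloads/ebook.pdf
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The PDF is excluded, the HTML chapter pages are not.
pdf_allowed = parser.can_fetch(
    "Googlebot", "https://www.example.com/downloads/ebook.pdf"
)
html_allowed = parser.can_fetch(
    "Googlebot", "https://www.example.com/ebook/chapter-1.html"
)
print(pdf_allowed, html_allowed)
```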
It looks like you have already decided (and it seems a viable solution), but I would also consider providing a page or two of HTML as a 'teaser' and allowing the full PDF to be downloaded to read / print the remaining portion, because I'm also a card-carrying member of the 'not a chance I'm going to read 50 html pages' crowd.
Justin
I don't have one at home...well, not one that's hooked up. And we have one at work, that is only used by the boss. I can safely say, I haven't printed ANYTHING in 5 months.
But back on topic, I'd use a good print CSS stylesheet and let users print from their browser rather than maintain two files (one HTML, one PDF).
[edited by: Gibble at 2:16 pm (utc) on July 25, 2007]
jd01, I like the teaser idea. Moreover, after thinking twice, a big drawback I see in the "50 html pages" idea is that Google will send traffic to any page of the document (unless I cloak), whereas it has to be read in the right order to make sense.
On the other hand, 50 html pages is probably good for the site's SE ranking.
I'm a bit confused. I should probably look for the best option under the "make sites for humans, not for search engines" philosophy. But which is it?
That's the million dollar question...
You have a few options I can see:
1. Put two (or the desired amount of) PDF pages on a single longer html page and allow visitors to download the PDF from there.
2. Go ahead with the 50 page idea and 'noindex,follow' all pages, except the first one.
3. Go ahead with the 50 page idea and use 'header tags' to indicate the 'starting point' document in the collection of documents to search engines.
Justin
...it has to be read in the right order to make sense.
All you need is a clear navigational structure so that a user can arrive at any page and have an intuitive sense of what's going on and where they are. And if there are certain HTML pages you don't want indexed, just use a NOINDEX meta tag. Cloaking is not required.
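To make the meta tag suggestion concrete, here is a minimal Python sketch of a helper that a page template might call. The function name `robots_meta` and the idea of numbering pages from 1 are assumptions for the example; it simply keeps the first page indexable and marks the rest 'noindex,follow', as discussed earlier in the thread.

```python
def robots_meta(page_number: int) -> str:
    """Return the robots meta tag for a page of the multi-page document.

    Hypothetical helper: page 1 stays indexable; every other page is
    marked noindex,follow so crawlers still follow its links.
    """
    content = "index,follow" if page_number == 1 else "noindex,follow"
    return f'<meta name="robots" content="{content}">'


first_page_tag = robots_meta(1)
later_page_tag = robots_meta(17)
```

The returned string would go in the `<head>` of each generated page.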
On the other hand, 50 html pages is probably good for the site's SE ranking.
More pages are not necessarily better. More pages help if they make your site more granular and get you inbound links for phrases people search for.
For SEO purposes, I'd break the content up if it logically breaks into topical areas that you can target for search... say, as chapters in a book. If done properly, this also helps by providing readable chunks for your users.