Making an AJAX Site Machine Readable: Screen Readers, Google, PDFs?

         

TheMadScientist

10:37 pm on Jan 29, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Is there a reason I shouldn't just detect screen readers and dump a PDF or a single page of HTML on them or GoogleBot (and other SEs bots)? I haven't done my homework yet, so I'm asking and I'll probably look to make sure myself, but it seems like instead of HTML files I should be able to just use PHP and output a PDF or the entire information as a single page of HTML fairly easily, and then insert anchor links in the HTML if people feel like surfing the page... Yeah, it might slow down the load speed, but the info is all there on a single page, so it's sure fast to use after...

Hmmmmm, if I let you cache it (a single HTML web page with all the information on it, or maybe a 'section' or 'chapter' or the 'toc' of a site) until a new version was posted, then after your first visit you could load it from your cache? Interesting...
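
A minimal sketch of that "cache it until a new version is posted" idea, using an HTTP ETag and a conditional GET. It is written in Python purely for illustration (the same logic ports to PHP); the function names `make_etag` and `respond` and the hash-based tag scheme are assumptions, not anything from an existing site.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a validator from the page content (hypothetical scheme).

    The tag changes whenever the content changes, so a newly posted
    version automatically invalidates the copies browsers hold.
    """
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET request.

    if_none_match is the value of the request's If-None-Match header,
    or None when the browser has no cached copy.
    """
    etag = make_etag(body)
    # "no-cache" lets the browser store the page but makes it revalidate
    # before reuse, so it never shows a stale version.
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match is not None:
        # Simplification: treats the header as a comma-separated list
        # and ignores "*" and weak (W/) validators.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if etag in candidates:
            return 304, headers, b""  # unchanged: browser reuses its copy
    return 200, headers, body
```

On a repeat visit the browser sends back the tag it received; if the page is unchanged the server answers 304 with an empty body, so only the headers cross the wire.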

Would it maybe be better to make a single page of HTML links to PDFs of the individual content pages?

Seriously, what's the best way to take a site built in AJAX and make it available to everyone?
PDF, Big HTML File, maybe an HTML file with links to PDFs or just plain html?

Here's a shorter, different version of the question: Which technology is best for content that will be read to a visitor by a screen reader? A PDF with all the information, full HTML, some combination of HTML and PDF?

I keep thinking: if PDF technology is better for screen readers, or better for download, browse, read, then why not noscript a page full of links to PDFs with the information, or something similar?

I think it could even be done on sales sites if you had the PDF version of the page insert a phone number to call or include a link in full text to the 'place an order' page or something? I keep thinking 'download' a section of the catalog as a PDF for screen readers should be possible and fairly simple if they can easily be read as a PDF...

Thoughts, ideas, suggestions?

BTW: I don't care about the site rankings right now, I want to know the best way to send the information to the visitor, so I really don't care as long as Google can read it, because, well I'm stubborn and I really do try to cater to the real visitors, so if a PDF is better for a visitor and worse for rankings, then I'll go with a PDF, but don't mind hearing about ranking impact, just know it's got a low priority on my er, uh, 'algo' (for lack of a better word) right now...

tedster

3:18 am on Jan 30, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's really going to depend on what functions the AJAX performs. And beyond that, a major question would be how big the final file size is going to be.

I'm not familiar enough with assistive technologies to know how they deal with PDF files, but certainly you can create a vanilla HTML file that is fully accessible. Google can and does index either type of file. And if there is enough PR (not all that much), then either one can stay in the index and rank OK, at least for its title.

Robert Charlton

4:00 am on Jan 30, 2010 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Is there a reason I shouldn't just detect screen readers and dump a PDF or a single page of HTML on them or GoogleBot (and other SEs bots)?

I'd think you'd want a default static html page served to Google, etc, and to have javascript serve dynamic content to users. Google's made it clear that they like this approach with SWF files, as long as you don't take liberties with the text.

...then insert anchor links in the HTML if people feel like surfing the page

Both Google and Bing are indexing page fragment identifiers and linking, and I think both intend to take it further. Google is displaying the links to anchors as inline mini-sitelinks in the serps. I've had Bing simply take me to a deep section on the page with no mini-sitelinks displayed.
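
To make the anchor-link idea concrete, here is a rough sketch of generating a single HTML page whose sections carry `id` attributes, plus a table of contents linking to them by fragment identifier. It is in Python for illustration only; `slugify`, `build_page` and the `(title, html_body)` input shape are hypothetical names and choices, and section bodies are assumed to be trusted HTML already.

```python
import re
from html import escape
from typing import List, Tuple


def slugify(title: str) -> str:
    """Turn a heading into a URL-safe fragment identifier."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "section"


def build_page(sections: List[Tuple[str, str]]) -> str:
    """Build a TOC plus sections from (title, html_body) pairs."""
    toc, body, used = [], [], set()
    for title, content in sections:
        base = slugify(title)
        anchor, n = base, 2
        # ids must be unique within a page, so suffix any duplicates
        while anchor in used:
            anchor = f"{base}-{n}"
            n += 1
        used.add(anchor)
        toc.append(f'<li><a href="#{anchor}">{escape(title)}</a></li>')
        body.append(f'<h2 id="{anchor}">{escape(title)}</h2>\n{content}')
    return "<ul>\n" + "\n".join(toc) + "\n</ul>\n" + "\n".join(body)
```

Because each section is a real heading with its own id, the same markup serves people jumping around the page, screen reader users navigating by heading, and any search engine that indexes fragments.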

...if a PDF is better for a visitor and worse for rankings, then I'll go with a PDF, but don't mind hearing about ranking impact

I'm not a big fan of pdfs in serps. Pdfs were created as a document exchange format, and though I think Adobe has come a long way in improving its reader, I'd much rather navigate html pages.

Though there are some techniques for optimizing pdfs, I've run some primitive tests, and I feel that html pages have some overwhelming structural advantages when it comes to optimization.

Also, I don't see how a pdf could be the default page for the spider, the way an html page could, so for that reason too I'd serve up html as the default.

LiamMcGee

11:22 am on Feb 1, 2010 (gmt 0)

10+ Year Member



Screen readers are sophisticated pieces of kit nowadays - don't think that JAWS, HAL and chums can't get to grips with scripting and Ajax, because they can... as long as the scripting is well put together.

For some reading around the subject, try [w3.org...]

I'm happy to try to explain anything that is too weird/insubstantial (WAI-ARIA is a work in progress, so bear with us)

PDFs - again - modern screen readers handle them with comparative ease IF they have been put together properly. Just like HTML, it's all about getting some extra meaning in... in the case of PDFs, this would usually be specifying headings, subheadings, sections etc. If you use 'Styles' in Word or OpenOffice, the PDFs generated will usually retain those styles, giving plenty of navigation cues for screen reader users (and don't forget us keyboard-preferent sighted users, hey?)

For some reading on that, try [adobe.com...]

But I agree with Robert, HTML has some huge structural advantages over PDF as a format, and I'd stick with HTML for anything that wasn't primarily intended for the printed page.

Anything you do to make a filetype more meaningful is going to help any automated agent - assistive tech or bot. Whether Googlebot is paying much attention is hard to say, but I would if I were Googlebot. Anecdotally, at least, whenever we sort out a site's accessibility problems, we tend to see a marked improvement in organic search.

LiamMcGee

11:30 am on Feb 1, 2010 (gmt 0)

10+ Year Member



Oh, and to be specific to the original post, any self respecting accessibility type would always serve well marked up, accessible HTML to a screen reader user in preference to a PDF.

Similarly, serve smaller, well-chunked HTML pages (with headings) over a single huge HTML page. Nothing is more disheartening for a screen reader user than to hear the software announce "Page has three hundred and forty links"... apart from possibly "Page has three hundred and forty links and twenty-two tables and nine frames".
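
As a quick sanity check before involving real users, something like the following sketch can tally the elements that tend to show up in that kind of page summary. It uses Python's standard `html.parser`; the class name `ElementCounter`, the function `summarise` and the choice of what to count are assumptions, and real screen readers decide for themselves what they announce.

```python
from collections import Counter
from html.parser import HTMLParser
from typing import Dict

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class ElementCounter(HTMLParser):
    """Tally links, tables, frames and headings in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.counts: Counter = Counter()

    def handle_starttag(self, tag, attrs):
        # Only anchors with an href are links; <a name="..."> is a target.
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.counts["links"] += 1
        elif tag == "table":
            self.counts["tables"] += 1
        elif tag in ("frame", "iframe"):
            self.counts["frames"] += 1
        elif tag in HEADINGS:
            self.counts["headings"] += 1


def summarise(html_text: str) -> Dict[str, int]:
    """Return counts for the element types found in html_text."""
    parser = ElementCounter()
    parser.feed(html_text)
    parser.close()
    return dict(parser.counts)
```

A page that reports hundreds of links and hardly any headings is a good candidate for splitting into smaller, well-headed pages.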

If you want to make sure that your site works well for screen reader users, there is no substitute for getting someone with a screenreader to tell you. Get hold of a demo version (checking the license carefully, of course) and check yourself, or, even better, hang out where all the cool screen reader kids hang out and see if someone will give you a bit of free feedback.

Or spend 5k on proper usability testing with real live users with disabilities... if you have budget for this, I would heartily recommend it. As does the W3C.

TheMadScientist

6:39 pm on Feb 1, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@ LiamMcGee

Great Posts.
Thanks for sharing.