I'm trying to load the contents of a URL in JavaScript and get back a fully functioning DOM document. Things are easy if one is loading XML or XHTML, but I need to load HTML.
- I've tried with XMLHttpRequest:
  - Overriding the content type with text/html and hoping responseXML would have a usable DOM
  - Inserting responseText into a newly created iframe element as innerHTML
  - Setting responseText as innerHTML on a DocumentFragment
  - Removing <html>, <head>, and <body> from responseText, and setting the result as innerHTML on a newly created div element
- And with iframes (which I'd rather avoid):
  - Setting the URL as src on an unattached iframe element and trying to read out its document's DOM
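For concreteness, the div-based attempt in the list can be sketched roughly as follows. This is only an illustration, not the exact code used: `stripDocumentWrappers` and `htmlToDiv` are hypothetical names, the regexes are a crude approximation rather than a real HTML parser, and dropping the whole `<head>` element (contents included) is an assumption about what "removing" means here.

```javascript
// Hypothetical helper: strip the document-level wrappers so that the rest of
// the markup can be assigned to innerHTML on a <div>. Regex-based, so it is a
// rough sketch and will not cope with every real-world page.
function stripDocumentWrappers(html) {
  return html
    // drop the doctype declaration, if any
    .replace(/<!DOCTYPE[^>]*>/gi, "")
    // drop the whole <head> element, contents included (an assumption)
    .replace(/<head\b[^>]*>[\s\S]*?<\/head\s*>/gi, "")
    // drop the opening and closing <html> and <body> tags, keeping their contents
    .replace(/<\/?(?:html|body)\b[^>]*>/gi, "");
}

// Hypothetical helper: put the stripped markup into a detached <div>.
// `doc` is assumed to be the document the extension script is running in.
function htmlToDiv(doc, html) {
  var div = doc.createElement("div");
  div.innerHTML = stripDocumentWrappers(html);
  return div;
}
```

With an XMLHttpRequest object `req` that has finished loading, the call would look like `htmlToDiv(document, req.responseText)`. The result is a detached element rather than a full document, so anything that depends on `<head>` content is lost.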
On the few occasions I've been able to coax out a DOM, XPath queries fail completely against it for some reason. The nodes are there, but I can't get a match. I'm no XPath expert by any means, and it's quite possible that I'm not getting all the contexts right.
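To show which contexts are involved, here is a minimal sketch of an XPath query against such a DOM using the standard `evaluate` call. The names `xpathNodes` and `xhtmlResolver` and the prefix `x` are hypothetical, chosen only for illustration. The sketch assumes the query is evaluated on the document that owns the context node, and that a namespace resolver is needed only when the elements sit in the XHTML namespace (for example in an XML-parsed document), since unprefixed names in XPath 1.0 match only elements in no namespace.

```javascript
// Hypothetical helper: run an XPath expression relative to `contextNode`
// and return the matching nodes as a plain array.
function xpathNodes(contextNode, expression, resolver) {
  // A Document node has ownerDocument === null, so fall back to the node itself.
  var doc = contextNode.ownerDocument || contextNode;
  // Numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE
  var ORDERED_NODE_SNAPSHOT_TYPE = 7;
  var result = doc.evaluate(expression, contextNode, resolver || null,
                            ORDERED_NODE_SNAPSHOT_TYPE, null);
  var nodes = [];
  for (var i = 0; i < result.snapshotLength; i++) {
    nodes.push(result.snapshotItem(i));
  }
  return nodes;
}

// Hypothetical resolver: maps the prefix "x" to the XHTML namespace so that
// expressions such as "//x:a" can match namespaced elements.
function xhtmlResolver(prefix) {
  return prefix === "x" ? "http://www.w3.org/1999/xhtml" : null;
}
```

For a detached div the call might be `xpathNodes(div, ".//a[@href]")`, with the leading `.` keeping the query relative to the div. For an XML-parsed document it might be `xpathNodes(xmlDoc, "//x:a[@href]", xhtmlResolver)`.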
Has anybody pulled this off in Mozilla? It seems like this should be simple, since, you know, the main thing the browser does is load HTML.
I don't need any cross-platform advice, thanks; this is for an (unprivileged) Mozilla extension.