<?xml version="1.0" ?>
<element>content</element>

I serve the file with a MIME type text/xml over HTTP and view the result with Firefox. I have not specified any character encoding either via HTTP or within the document and there is no BOM. What should the charset be? On reading RFC 3023 [ietf.org], I get the impression it should be US-ASCII. Only if I serve the file as application/xml should it be UTF-8. However Firefox considers it to be UTF-8 even with text/xml. So who is wrong, Firefox or me?
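To make the reading of RFC 3023 in the question concrete, here is a minimal Python sketch of the precedence as described above: an HTTP charset parameter wins, text/xml without one falls back to US-ASCII, and application/xml without one falls back to the document's own BOM or encoding declaration and then to UTF-8. The function name rfc3023_charset is made up for illustration, and the BOM and declaration sniffing is deliberately simplified (no UTF-32, no BOM-less UTF-16), so treat it as an illustration of the rule rather than a conforming implementation.

```python
import re

def rfc3023_charset(content_type, body):
    """Charset an RFC 3023 reader would assume (hypothetical helper, simplified)."""
    media_type, _, params = content_type.partition(";")
    media_type = media_type.strip().lower()

    # 1. An explicit HTTP charset parameter is authoritative for both types.
    m = re.search(r'charset\s*=\s*"?([^\s";]+)', params, re.I)
    if m:
        return m.group(1).lower()

    # 2. text/xml without charset: US-ASCII, whatever the document declares.
    if media_type == "text/xml":
        return "us-ascii"

    # 3. application/xml without charset: look inside the document.
    if media_type == "application/xml":
        if body.startswith(b"\xef\xbb\xbf"):
            return "utf-8"
        if body.startswith((b"\xff\xfe", b"\xfe\xff")):
            return "utf-16"
        decl = re.match(
            rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']',
            body,
        )
        if decl:
            return decl.group(1).decode("ascii").lower()
        return "utf-8"

    raise ValueError("not an XML media type covered by this sketch")

doc = b'<?xml version="1.0" ?>\n<element>content</element>'
print(rfc3023_charset("text/xml", doc))
print(rfc3023_charset("application/xml", doc))
```

Under this reading, the document in the question comes out as us-ascii when served as text/xml and as utf-8 when served as application/xml, which is the discrepancy with Firefox being asked about.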
text/xml document as UTF-8 was accidental.
This is for a very practical 'real world' reason: US-ASCII is a very small part of the real world. Consider KOI8-R (or Big5 or UTF-16 or ...) encoded XML sent via HTTP as text/xml but without a charset: an RFC 3023 compliant parser would have to treat it as US-ASCII and serve up glop. These products simply decided that a UTF-8 default would cause fewer complaints than US-ASCII and so informally 'revised' the standard.
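A quick way to see why a UTF-8 default draws fewer complaints is to try decoding a few payloads under each assumed default. The sketch below is only an illustration with made-up sample documents; decodes_as is a hypothetical helper, not part of any parser.

```python
def decodes_as(raw, encoding):
    """True if the bytes are valid in the given encoding (hypothetical helper)."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False

# Made-up sample payloads, each encoded the way its author intended.
samples = {
    "plain ASCII": "<element>content</element>".encode("ascii"),
    "UTF-8": "<element>caf\u00e9</element>".encode("utf-8"),
    "KOI8-R": "<element>\u041f\u0440\u0438\u0432\u0435\u0442</element>".encode("koi8_r"),
}

# For each payload: (valid under a US-ASCII default, valid under a UTF-8 default)
results = {
    name: (decodes_as(raw, "ascii"), decodes_as(raw, "utf-8"))
    for name, raw in samples.items()
}

for name, (as_ascii, as_utf8) in results.items():
    print(f"{name}: us-ascii={as_ascii} utf-8={as_utf8}")
```

Plain ASCII survives either default, the UTF-8 payload survives only the UTF-8 default, and the KOI8-R payload survives neither, so it still needs an explicit charset. A UTF-8 default therefore accepts everything a US-ASCII default does, plus more.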
I do not know if FF behaves this way or for this reason but if so it would certainly be with the majority.
I found a few Bugzilla conversations and a few submitted patches, so I get the impression that later Firefox versions are going to implement RFC 3023 more strictly.
I found the best answer to my question in an XML.com article, aptly titled XML on the Web Has Failed [xml.com]. RFC 3023 dictates the primacy of HTTP over an internally declared charset, but real-world implementations have to ignore that primacy because of the sheer number of feeds that would be considered ill-formed if RFC 3023 were followed to the letter. A real eye-opener.
I was aware of the strong recommendations never to use text/xml for anything, but the reasons why are much clearer to me now!