| 4:38 am on Jan 28, 2006 (gmt 0)|
Anne van Kesteren has an explanation on his weblog. Search Google for the following text: "text/xml is seriously broken over HTTP". Be sure to read the comments.
| 4:56 pm on Jan 28, 2006 (gmt 0)|
Thanks confuzed2, I'm aware of that article, but I can't find any meaningful specification or explanation for Firefox's behavior. Usually Mozilla prides itself on being standards-compliant, especially for its XML parser, so I would be surprised if the choice of serving a text/xml document as UTF-8 was accidental.
| 6:41 pm on Jan 28, 2006 (gmt 0)|
If the RFC 3023 and text/xml discussions in Bugzilla don't help, I'm at a complete loss. Please let us know what you find out.
| 9:31 pm on Jan 28, 2006 (gmt 0)|
RFC 3023 is one of the most widely ignored or violated RFCs in Internet software. Many implementations (perhaps most) ignore the HTTP headers and look directly at the encoding declared inside the XML document, defaulting to UTF-8 if none is declared.
This is for a very practical 'real world' reason: US-ASCII is a very small part of the real world. Consider KOI8-R (or Big5, or UTF-16, or ...) encoded XML sent over HTTP as text/xml but without a charset parameter: an RFC 3023 compliant parser would treat it as US-ASCII and serve up glop. These products simply decided that a UTF-8 default would cause fewer complaints than US-ASCII, and so informally 'revised' the standard.
I do not know if FF behaves this way or for this reason but if so it would certainly be with the majority.
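To make the difference concrete, here is a rough Python sketch of the two ways of picking a charset. The function names (strict_charset, lenient_charset, and the two helpers) are made up for illustration and do not come from any browser or library. It also skips BOM sniffing and UTF-16 detection, so treat it as a simplification of the rules rather than a real implementation.

```python
import re
from typing import Optional

# Matches the encoding pseudo-attribute of an XML declaration, e.g.
# <?xml version="1.0" encoding="koi8-r"?>
# Simplification: only works for ASCII-compatible encodings, no BOM handling.
_XML_DECL = re.compile(
    rb'''^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']'''
)


def declared_encoding(body: bytes) -> Optional[str]:
    """Encoding named inside the document itself, or None (hypothetical helper)."""
    m = _XML_DECL.match(body)
    return m.group(1).decode("ascii").lower() if m else None


def header_charset(content_type: str) -> Optional[str]:
    """charset parameter of a Content-Type value, or None (hypothetical helper)."""
    m = re.search(r'charset\s*=\s*"?([^";\s]+)"?', content_type, re.IGNORECASE)
    return m.group(1).lower() if m else None


def strict_charset(content_type: str, body: bytes) -> str:
    """RFC 3023 reading: the HTTP charset wins; text/xml without one is US-ASCII."""
    charset = header_charset(content_type)
    if charset:
        return charset
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "text/xml":
        return "us-ascii"
    # application/xml without charset: fall back to the document's own rules.
    return declared_encoding(body) or "utf-8"


def lenient_charset(content_type: str, body: bytes) -> str:
    """The informal 'revision': ignore the header, default to UTF-8."""
    return declared_encoding(body) or "utf-8"


body = '<?xml version="1.0" encoding="koi8-r"?><t>привет</t>'.encode("koi8-r")
print(strict_charset("text/xml", body))   # what RFC 3023 asks for
print(lenient_charset("text/xml", body))  # what many products actually do
```

For a KOI8-R document served as plain text/xml, the strict function picks US-ASCII while the lenient one picks the encoding the document declares, which is exactly the mismatch described above.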
| 1:44 am on Jan 31, 2006 (gmt 0)|
|I do not know if FF behaves this way or for this reason but if so it would certainly be with the majority. |
I found a few Bugzilla conversations and a few submitted patches, so I get the impression that later Firefox versions are going to implement RFC 3023 more strictly.
I found the best answer to my question in an XML.com article, aptly titled "XML on the Web Has Failed" [xml.com]. RFC 3023 dictates that the charset given over HTTP takes precedence over an internally declared one, but in practice implementations have to ignore that precedence, simply because of the sheer number of feeds that would be considered ill-formed if RFC 3023 were followed to the letter. A real eye-opener.
I was aware of the strong recommendations never to use text/xml for anything, but the reasons why are much clearer to me now!
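For anyone who wants to see the "ill-formed" part in action, a small Python sketch follows. The feed content and the helper name decodes_cleanly are invented for the example; it only checks whether the bytes are valid in a given charset, which is a stand-in for what a real XML parser would complain about.

```python
def decodes_cleanly(body: bytes, charset: str) -> bool:
    """Return True if every byte of body is valid in the given charset."""
    try:
        body.decode(charset)
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical feed: declares KOI8-R internally, but is served as
# text/xml with no charset parameter in the Content-Type header.
feed = '<?xml version="1.0" encoding="koi8-r"?><title>Новости</title>'.encode("koi8-r")

# Strict RFC 3023 default for text/xml without a charset parameter.
print(decodes_cleanly(feed, "us-ascii"))

# The encoding the document itself declares.
print(decodes_cleanly(feed, "koi8-r"))
```

The first check fails because the Cyrillic bytes fall outside the 7-bit US-ASCII range, so a parser following RFC 3023 to the letter would have to reject a feed that decodes fine under its own declaration.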