

Firefox Browser Usage and Support Forum

    
MIME types, text/xml and Firefox
Am I misunderstanding the spec?
encyclo




msg:1589144
 3:28 am on Jan 28, 2006 (gmt 0)

OK, so I have a simple XML document:

<?xml version="1.0" ?>
<element>content</element>

I serve the file with a MIME type text/xml over HTTP and view the result with Firefox. I have not specified any character encoding either via HTTP or within the document and there is no BOM.

What should the charset be? From my reading of RFC 3023 [ietf.org], it should be US-ASCII, and only if I serve the file as application/xml should it be UTF-8. However, Firefox treats it as UTF-8 even with text/xml. So who is wrong, Firefox or me?
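
To make the question concrete, here is a minimal sketch of the defaulting rules as I read them in RFC 3023. The function name `rfc3023_charset` and the simplified encoding-declaration regex are my own assumptions for illustration, not anything taken from Firefox or from the RFC text itself.

```python
import re
from email.message import Message

# Simplified match for an encoding pseudo-attribute in the XML declaration.
# Assumption: the document is in an ASCII-compatible encoding, so the
# declaration can be read as raw bytes.
_XML_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

def rfc3023_charset(content_type: str, body: bytes) -> str:
    """Return the charset a strict RFC 3023 reader would assume (sketch)."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")
    if charset:
        # An explicit HTTP charset parameter is authoritative.
        return str(charset).lower()
    if msg.get_content_type().startswith("text/"):
        # text/xml with no charset parameter: the default is US-ASCII,
        # whatever the document itself declares.
        return "us-ascii"
    # application/xml with no charset: fall back to the XML rules
    # (byte order mark, then encoding declaration, then UTF-8).
    if body.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    if body.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16"
    match = _XML_DECL.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"

doc = b'<?xml version="1.0" ?>\n<element>content</element>'
print(rfc3023_charset("text/xml", doc))
print(rfc3023_charset("application/xml", doc))
```

Under that reading, the same bytes get a different charset depending only on the media type, which is exactly the behavior I expected and did not see in Firefox.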

 

confuzed2




msg:1589145
 4:38 am on Jan 28, 2006 (gmt 0)

Anne van Kesteren has an explanation on his weblog. Do a Google search for the following text: "text/xml is seriously broken over HTTP". Be sure to read the comments.

HTH,
CK

encyclo




msg:1589146
 4:56 pm on Jan 28, 2006 (gmt 0)

Thanks confuzed2, I'm aware of that article, but I can't find any meaningful specification or explanation for Firefox's behavior. Usually Mozilla prides itself on being standards-compliant, especially for its XML parser, so I would be surprised if the choice of treating a text/xml document as UTF-8 was accidental.

confuzed2




msg:1589147
 6:41 pm on Jan 28, 2006 (gmt 0)

If the 3023 and text/xml discussions in Bugzilla don't help, I'm at a complete loss. Please let us know what you find out.

Thanks,
CK

iamlost




msg:1589148
 9:31 pm on Jan 28, 2006 (gmt 0)

RFC 3023 is one of the most ignored and violated RFCs in Internet software. Many (most?) implementations ignore the HTTP headers and look directly at the encoding declared inside the XML, defaulting to UTF-8 if there is none.

This is for a very practical 'real world' reason: US-ASCII is a very small part of the real world. Consider KOI8-R (or Big5 or UTF-16 or...) encoded XML sent via HTTP as text/xml but without a charset: an RFC 3023 compliant parser would serve up US-ASCII glop. These products simply decided that a UTF-8 default would cause fewer complaints than US-ASCII and so informally 'revised' the standard.
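
A small sketch of that 'glop', assuming a KOI8-R document that declares its own encoding but arrives as text/xml with no charset parameter. The helper names `decode_strict` and `decode_declared` are made up for this example and do not describe any particular product.

```python
import re

# "Privet" (Russian for "hello") written with escapes so the source stays ASCII.
word = "\u041f\u0440\u0438\u0432\u0435\u0442"
doc = f'<?xml version="1.0" encoding="KOI8-R"?><e>{word}</e>'.encode("koi8-r")

def decode_strict(body: bytes) -> str:
    """Strict reading: no HTTP charset on text/xml means US-ASCII.

    Bytes outside ASCII cannot be decoded; they are replaced with
    U+FFFD here so the damage is visible instead of raising an error.
    """
    return body.decode("us-ascii", errors="replace")

def decode_declared(body: bytes, fallback: str = "utf-8") -> str:
    """Lenient reading: trust the declaration inside the document,
    falling back to UTF-8 when there is none."""
    match = re.match(rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']', body)
    encoding = match.group(1).decode("ascii") if match else fallback
    return body.decode(encoding)

print(decode_strict(doc))    # the Cyrillic text is lost
print(decode_declared(doc))  # the Cyrillic text survives
```

The lenient path is the one that keeps real-world documents readable, which is presumably why so much software takes it.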

I do not know if FF behaves this way or for this reason but if so it would certainly be with the majority.

encyclo




msg:1589149
 1:44 am on Jan 31, 2006 (gmt 0)

iamlost wrote: "I do not know if FF behaves this way or for this reason but if so it would certainly be with the majority."

I found a few Bugzilla conversations and a few submitted patches, so I get the impression that later Firefox versions are going to implement RFC 3023 more strictly.

I found the best answer to my question in an article from XML.com, aptly titled XML on the Web Has Failed [xml.com]. RFC 3023 dictates the primacy of the HTTP charset over an internally declared one, but real-world implementations have to ignore that primacy because of the sheer number of feeds which would be considered ill-formed if RFC 3023 were followed to the letter. A real eye-opener.

I was aware of the strong recommendations never to use text/xml for anything, but the reasons why are much clearer to me now!
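
For anyone landing here later, this is roughly what I would do on the serving side to avoid the ambiguity: use application/xml and state the charset both in the HTTP header and in the XML declaration, so that strict and lenient readers agree. It is only a sketch; `xml_response` is a hypothetical helper, not part of any framework.

```python
def xml_response(content: str, encoding: str = "utf-8"):
    """Build headers and body for an XML response (illustrative helper).

    The same encoding name is written into the XML declaration and the
    Content-Type header, so neither source can contradict the other.
    """
    body = f'<?xml version="1.0" encoding="{encoding}"?>\n{content}'.encode(encoding)
    headers = {
        "Content-Type": f"application/xml; charset={encoding}",
        "Content-Length": str(len(body)),
    }
    return headers, body

headers, body = xml_response("<element>content</element>")
print(headers["Content-Type"])
```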

