I've been doing a bit of reading on HTML 4.01 and XHTML 1.0 and the use of content types.
However, I have a problem with XHTML 1.0 and application/xhtml+xml, and with what content type the browser sees (Firefox 3 for this test).
I have this page:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>Base xHTML 1.0 Strict</title>
<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8" />
</head>
<body>
<h1>Test</h1>
</body>
</html>
If I look at the page info in Firefox, it reports the type as text/html, even though a meta tag declares application/xhtml+xml as the content type. If I examine the HTTP headers using a Firefox extension, the page is reported as being served as text/html, and if I make the request using PuTTY the content type is again text/html. The W3C validator also reports it as text/html. I can force application/xhtml+xml using HTTP headers in IIS. Does IIS do something by default? Why isn't application/xhtml+xml used just by specifying it in the document?
Please help me understand this behaviour.
It is also possible (conjecture here, not tested) that the on-page meta-tag plays some part in IE's determination of document type when no other 'signals' are present. To digress, IE seems to ignore the Content-Type header returned by servers, and instead uses some kind of 'inference engine' to determine the document type. This behavior is not in compliance with the HTTP protocol, but that's how IE does it...
So, back to the server-provided HTTP Content-Type header: Most servers are configured with default settings to return proper Content-Type headers for common HTML documents and images. However, if a file having an unrecognized file extension is served, it's likely that the server will default to sending a text/html Content-Type header.
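As a rough illustration of the extension lookup described above, here is a minimal Python sketch. The table and the function name content_type_for are hypothetical, and the text/html fallback simply mirrors the behaviour described in this post; each real server has its own table and its own default.

```python
import os

# Hypothetical extension-to-type table, standing in for a server's MIME map.
MIME_MAP = {
    ".htm": "text/html",
    ".html": "text/html",
    ".png": "image/png",
}


def content_type_for(filename, mime_map=MIME_MAP, default="text/html"):
    """Return the Content-Type a server using this table would send."""
    ext = os.path.splitext(filename)[1].lower()
    # Unknown extensions fall through to the default (an assumption here).
    return mime_map.get(ext, default)


# Adding an entry plays the role of registering a new MIME type on the server.
custom_map = dict(MIME_MAP)
custom_map[".xhtml"] = "application/xhtml+xml"

print(content_type_for("page.htm"))
print(content_type_for("page.xhtml"))
print(content_type_for("page.xhtml", custom_map))
```

The point of the sketch is that the lookup is keyed on the file extension alone; nothing in the document body is consulted.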
Many servers are not set up to send correct Content-Type headers for 'newer' document types such as xml, xhtml+xml, or even gzip-compressed files. Some are not set up for media files such as mp3, mpeg, avi, wmv, flv, etc., or even for png. It's pretty much up to the Webmaster to examine the headers for each filetype on the server, and verify or correct the Content-Type header configuration for each.
So, your proposal to add this content type (as well as any others which may not yet be defined) to your IIS configuration is the correct approach.
Jim
Thanks for the response. It's good to know that it's IIS that decides which content type goes along with a served document (type).
It does puzzle me that a lot of sites I have read suggest including the meta content-type as a crucial part of document structure. I'm not saying it shouldn't be, but if it doesn't get used as part of the request/response cycle, where does its real benefit come in?
David
[added] This is true with *any* meta-tag given as "HTTP-equiv" -- That phrase has a very specific meaning, in that the attributes so marked are to be taken as the equivalent of HTTP headers (which should be present in the server response when the document is served by a server). However, the 'priority' and scope of these equivalent headers is poorly defined, and various browsers interpret and scope them in different ways. [/added]
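To see the difference between the real header and the on-page equivalent, here is a small self-contained Python sketch. It is only a local demonstration under stated assumptions: the throwaway server, the Handler class, and the fetch_types helper are all made up for illustration, and the server is hard-coded to send text/html regardless of what the page's meta element claims.

```python
import http.client
import re
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# A cut-down version of the test page, with the same meta element.
PAGE = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Test</title>\n'
    b'<meta http-equiv="Content-Type" '
    b'content="application/xhtml+xml; charset=UTF-8" />\n'
    b'</head><body><h1>Test</h1></body></html>\n'
)


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        # The server chooses this header; the meta element in PAGE plays no part.
        self.send_header("Content-Type", "text/html; charset=UTF-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_types():
    """Return (type from the HTTP header, type claimed by the meta element)."""
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
        conn.request("GET", "/")
        resp = conn.getresponse()
        header_type = resp.headers.get_content_type()
        body = resp.read().decode("utf-8")
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    meta = re.search(r'http-equiv="Content-Type"\s+content="([^";]+)', body)
    return header_type, meta.group(1)


header_type, meta_type = fetch_types()
print("HTTP header: ", header_type)
print("meta element:", meta_type)
```

The two values disagree, which matches what the page info, the header extension, and the validator were reporting earlier in the thread.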
Jim
[edited by: jdMorgan at 2:00 pm (utc) on Oct. 2, 2008]
As IE, some other older browsers, and most automated bots (such as Googlebot) do not support application/xhtml+xml, you should stick to text/html at all times, including for XHTML.
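If you want to check whether a particular client even advertises support, the Accept request header is where it would say so. The following Python sketch is an illustration only: the function names are hypothetical, the parsing is deliberately simplified, and switching the response type per client is a separate technique from the text/html-everywhere advice above, not a replacement for it.

```python
def accepts_xhtml(accept_header):
    """Return True if the Accept header lists application/xhtml+xml with q > 0."""
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        q = 1.0  # per HTTP, a missing q parameter means q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        # Wildcards such as */* are deliberately not counted as support.
        if media_type == "application/xhtml+xml" and q > 0:
            return True
    return False


def choose_content_type(accept_header):
    """Pick a response type, falling back to text/html when in doubt."""
    if accepts_xhtml(accept_header):
        return "application/xhtml+xml"
    return "text/html"


# Example header values; the first lists the type explicitly, the second does not.
print(choose_content_type("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
print(choose_content_type("*/*"))
```

A client that sends only a wildcard gets text/html here, which keeps the fallback on the safe side.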
My understanding would be that if I serve an XHTML 1.0 Strict document from IIS (with .htm being served as application/xhtml+xml) to IE6, it shouldn't display. I'm serving such a page just now, and it does.
Leaving xmlns out of the html element causes the page not to show properly in Firefox (as expected), but in IE6 it will display and be styled.
If I then had an XHTML 1.1 page (which still has an .htm extension), wouldn't that mean it was served as HTML and not XML, which is technically not something that should happen? Not that I would likely mix and match pages, but to get XHTML 1.1 with IIS you need to specifically alter the HTTP headers.
If your document does not declare at least one proprietary namespace that you have defined (or have specially selected) to add functionality beyond HTML capabilities, then the document probably does not need to be XML.
As a simple example, the recent upgrade of Google's SiteMap format added a <mobile:> tag and a namespace to define the name of that tag and its allowable attribute values.
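To make the namespace idea concrete, here is a short Python sketch that reads a tiny sitemap with an extra namespace. The namespace URIs and the empty mobile:mobile element are written from memory of the sitemap formats and should be treated as assumptions, and example.com is a placeholder URL.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; check them against the current sitemap documentation.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "mobile": "http://www.google.com/schemas/sitemap-mobile/1.0",
}

SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:mobile="http://www.google.com/schemas/sitemap-mobile/1.0">
  <url>
    <loc>http://example.com/m/index.html</loc>
    <mobile:mobile/>
  </url>
  <url>
    <loc>http://example.com/index.html</loc>
  </url>
</urlset>
"""

root = ET.fromstring(SITEMAP)

# Collect the URLs whose entry carries the extra, namespaced element.
mobile_urls = [
    url.find("sm:loc", NS).text
    for url in root.findall("sm:url", NS)
    if url.find("mobile:mobile", NS) is not None
]

print(root.tag)      # the parser qualifies each tag with its namespace URI
print(mobile_urls)
```

An XML parser tells the added element apart from the base vocabulary purely by its namespace, which is the kind of extension that plain HTML has no mechanism for.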
Jim