Perhaps I should admit that I don't follow my own guidelines [webmasterworld.com] for my own sites. :) For years I have used the following:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
It's shorter than the recommended HTML 4.01 Strict doctype, it is still valid, and it triggers standards-compliance mode in all modern browsers with the exception of IE5.x Mac (this is the only functional difference from the HTML-strict-with-DTD-URL version). You don't need to follow exactly what the W3C says. HTML5 development is going to eliminate doctypes (apart from the vestige mentioned above, which is only needed to cater for the standards-mode switch in browsers), and that just shows that the DTD is unnecessary.
I've coded carefully in 1.1 strict, with CSS 2.1... If I used a DTD such as yours, would I need to worry about my code being "backward-compatible"? (Is 4.1 just XHTML in disguise and by a different name?)
By 1.1 strict do you mean XHTML 1.1, or XHTML 1.0 Strict, or something else? (There are no Strict or Transitional versions of XHTML 1.1.)
In any case, apart from the trailing slashes for empty elements, there is no fundamental difference when moving from XHTML 1.x to HTML 4.01. The browser doesn't consider the doctype, only the MIME type, when handling the page.
Should you swap? The issue is all very theoretical, apart from the effect on the W3C servers.
Embedded, mobile etc. devices need to pull the DTDs every single time from the source.
They shouldn't need to pull them at all. Browsers use an internal DTD; I don't know of one that behaves any differently depending on doctype, apart from the "doctype switch", where certain specific blacklisted doctypes (or the absence of a doctype) trigger quirks mode. IE, Firefox, Opera, Konqueror, Safari: none of these browsers download DTDs for anything served as text/html. As the W3C team said, it's not a link, it's a reference. I'm convinced the problem is rogue bots, nothing to do with browsers.
What the w3c forgets is that not everyone has the space to cache these DTDs. Embedded, mobile etc. devices need to pull the DTDs every single time from the source.
Browsers do *not* use DTDs (Document Type Definitions), and that goes especially for mobile browsers. Browsers are not based on validating parsers, and in any case DTDs cannot fully describe HTML syntax. From [w3.org...]
The HTML 4.01 specification includes additional syntactic constraints that cannot be expressed within the DTDs.
Browsers *do* use the Document Type Declaration (the line at the top of your HTML page that currently refers to a Document Type Definition), but only as a switch to trigger standards mode/almost standards mode.
Browsers do not have separate layout engines for HTML 2.0, HTML 3.2 and HTML 4.01; they just have one layout engine for "HTML" with quirks, almost-standards and standards modes (some have an additional XML-based layout engine too).
Document Type Definitions are a relic left over from HTML's SGML origins. They're currently only useful for checking your page with a validator, and triggering a standards mode. Browsers do not (could not) use DTDs to parse web pages, and HTML5 (rightly) removes the Document Type Definition altogether.
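To make the "doctype as a switch" idea concrete, here is a rough Python sketch of the decision described above. It is not how any particular browser is implemented: the function name rendering_mode and the two-entry prefix list are my own illustration, the real tables of quirks-triggering identifiers are far longer, and almost-standards mode is left out entirely.

```python
import re

# Illustrative and deliberately incomplete: real browsers carry a much longer
# table of public identifiers that trigger quirks mode.
QUIRKS_PUBLIC_ID_PREFIXES = (
    "-//ietf//dtd html 2.0",
    "-//w3c//dtd html 3.2",
)

# Matches a doctype declaration at the start of a document and captures the
# public identifier when one is present.
DOCTYPE_RE = re.compile(
    r'^\s*<!DOCTYPE\s+html(?:\s+PUBLIC\s+"([^"]*)")?', re.IGNORECASE
)


def rendering_mode(document_start: str) -> str:
    """Return "quirks" or "standards" for the start of an HTML document."""
    match = DOCTYPE_RE.match(document_start)
    if match is None:
        # No doctype at all: quirks mode.
        return "quirks"
    public_id = (match.group(1) or "").lower()
    if public_id.startswith(QUIRKS_PUBLIC_ID_PREFIXES):
        # A "blacklisted" legacy doctype: quirks mode.
        return "quirks"
    # Anything else is treated as standards mode in this sketch.
    return "standards"
```

Note that the DTD itself is never fetched or read here; only the text of the declaration is compared against a built-in list.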
I've coded carefully in 1.1 strict, with CSS 2.1... If I used a DTD such as yours, would I need to worry about my code being "backward-compatible"? (Is 4.1 just XHTML in disguise and by a different name?)
Your XHTML 1.1 doctype declaration does not switch the browser into XHTML mode. The only way to get a browser to treat XHTML as XHTML is to use the XHTML MIME type "application/xhtml+xml".
But IE doesn't support this MIME type, so don't do this, and there are other disadvantages too (failing on the first error, and turning off incremental rendering). As XHTML 1.1 must be served with the XHTML MIME type, you should not be using XHTML 1.1 at all.
As you currently serve your XHTML 1.1 document with the HTML MIME type "text/html", browsers are just treating your XHTML as malformed HTML. All your doctype declaration is doing is switching the browser into standards mode.
So, if you want to continue using XHTML syntax rules, just switch to an XHTML 1.0 Strict or Transitional doctype declaration, but bear in mind browsers will just treat your page as HTML with lots of closing-slash errors in it.
Alternatively you can switch to an HTML 4.01 Strict or Transitional doctype declaration. Browsers will continue to parse your content in the same way*, but you would have to remove your extra closing slashes to validate your pages. (Note that HTML5 allows the closing-slash syntax in its HTML version.)
*I'd recommend reading encyclo's Choosing the best doctype [webmasterworld.com] referred to above if you haven't already.
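If you do go the HTML 4.01 route, removing the closing slashes can be roughed out with a regular expression. This is only a naive Python sketch with a made-up helper name (strip_closing_slashes): it rewrites every "/>" it finds, including any that happen to sit inside scripts, comments or attribute values, so check the result in the validator afterwards.

```python
import re

# Matches the XHTML-style end of an empty element, e.g. ' />' or '/>'.
CLOSING_SLASH_RE = re.compile(r"\s*/>")


def strip_closing_slashes(markup: str) -> str:
    """Turn '<br />' style tags into '<br>' style tags (naive text rewrite)."""
    return CLOSING_SLASH_RE.sub(">", markup)
```

For example, strip_closing_slashes('<br />') gives '<br>'.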
(1) my websites are rather simple-minded (a mirror?), so that probably helps, and...
(2) possibly I don't test on ENOUGH computers
So, thanks for your deep chain of links, which I'll study. They're quite different opinions from, e.g., some of the O'Reilly books I've memorized over the years.
Oh, and, in the code snippet I posted above, I omitted:
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
...so, as I'm seeing so far in the links, I'm already NOT serving it up as XHTML, as recommended... so I guess it's not really XHTML 1.1 anyway, in spite of the validator's message that it is?
An XML parsing app in an embedded device or mobile should not have to resolve an external DTD reference. There's a good discussion of this linked on the other thread - DTDs Don’t Work on the Web [hsivonen.iki.fi]
albo, the "best practice" advice on using XHTML or HTML has changed over the years. You're definitely approaching web development with the right mindset by making sure you browser-test your documents, and validation is a useful QA/debugging tool, whatever syntax you choose.
Added: the validator only checks XHTML 1.1 syntax (not the MIME type). Browsers will treat your XHTML page as malformed HTML unless you use the XHTML MIME type. Browsers *only* use the MIME type in the HTTP header (not the meta http-equiv tag) to determine whether a document is HTML or XHTML.
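As a rough illustration of that last point, here is a small Python sketch of the decision. The function name choose_parser is hypothetical, and it only covers the two MIME types discussed in this thread; the markup argument is there purely to show that the doctype and meta tag inside the page play no part in the choice.

```python
def choose_parser(content_type_header: str, markup: str) -> str:
    """Pick a parser from the HTTP Content-Type header alone.

    The markup is accepted but deliberately never inspected: neither the
    doctype nor a meta http-equiv tag inside it changes the result.
    """
    # Drop parameters such as "; charset=UTF-8" and normalise case.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    if media_type == "application/xhtml+xml":
        return "xml"
    if media_type == "text/html":
        return "html"
    # Other media types are outside the scope of this sketch.
    return "other"
```

So a page with an XHTML 1.1 doctype sent as "text/html; charset=UTF-8" comes out as "html", and the same bytes sent as "application/xhtml+xml" come out as "xml".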