Converting to xhtml, part 2

What did I gain?

         

Mohamed_E

7:10 pm on Mar 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I had no intention of converting from HTML 4.01 Strict to xhtml any time soon, since I felt that I still had a lot of learning to do. Still, reading about xhtml made it seem rather simple: convert all tags and attributes to lower case, close all tags, and that's about it. It looked too simple to be true.

Just to test it out I used html-tidy to convert one page to xhtml 1.0 Strict; it converted easily and validated! Converted a couple more just to see. They validated.

OK, changed the DOCTYPE to xhtml 1.1. Here I ran into two problems.

  1. I have many tables with a <thead></thead> section, followed by the rows of the table proper, without the enclosing <tbody></tbody>. That validates fine with HTML or xhtml 1.0, Strict in both cases, but not with xhtml 1.1. Easily fixed.
  2. My pages are large, usually a few screenfuls, so on-page navigation is essential. xhtml 1.1 does not allow <a name="abcd">, and NN4 does not understand <a id="abcd">. So I decided to stay with 1.0 for another year or so until NN4.x really and truly dies.
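
For anyone facing the first problem across many files, the fix can be scripted. What follows is only a rough Python sketch, not the script used for this site: the function name wrap_rows_in_tbody is made up for illustration, and the sample table is assumed to be well-formed markup with no XML namespace and no <tfoot>.

```python
import xml.etree.ElementTree as ET

def wrap_rows_in_tbody(table: ET.Element) -> None:
    """Move <tr> elements that are direct children of <table> into a new <tbody>."""
    loose_rows = [child for child in table if child.tag == "tr"]
    if not loose_rows:
        return  # nothing to fix: rows are already inside a section element
    # Put the new <tbody> where the first loose row currently sits.
    position = list(table).index(loose_rows[0])
    tbody = ET.Element("tbody")
    for row in loose_rows:
        table.remove(row)
        tbody.append(row)
    table.insert(position, tbody)

# Hypothetical sample: a <thead> followed directly by rows, as described above.
table = ET.fromstring(
    "<table><thead><tr><th>Name</th></tr></thead>"
    "<tr><td>a</td></tr><tr><td>b</td></tr></table>"
)
wrap_rows_in_tbody(table)
print(ET.tostring(table, encoding="unicode"))
```

Real pages served with the XHTML namespace would need namespace-qualified tag names, so treat this as a starting point rather than a drop-in tool.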

One thing leads to another, and soon I had converted all the files. There was a lot of annoying work to do because of the way I maintain my site, with lots of include files. Two or three hours of editing, writing scripts, and occasionally cursing. Then I had to change the html-mode.el file that my editor (emacs) uses, from an old one with upper case tags and unclosed <P> and <LI> to a newer one with lower case and closed tags.

Now for the question: What did I gain by doing so? I understand the benefits of valid Strict HTML; what are those of xhtml?

Note: I will be checking the files on my computer for a couple of days before exporting them!

asquithea

8:17 pm on Mar 16, 2004 (gmt 0)

10+ Year Member



If you've previously been working with Strict HTML 4.01, then you haven't gained a lot, especially if you're going to carry on supporting old browsers that don't even understand XHTML. Bear in mind that if you're not serving up the XHTML mime-type, the browser isn't reading it as XHTML. This goes even for newer browsers like Mozilla.

I'd say the minor benefits include:
* Better consistency amongst pages
* Potentially faster page parsing on newer browsers

The major benefit (should it affect you) is that, because XHTML is a dialect of XML, you can create a site with XSLT transforms and an XML data backbone with comparatively little effort. Increasingly we're seeing integrated XML support in scripting languages (cf. PHP 5) and in databases. With XSLT and XHTML it's simple to use the same back-end to serve your data up to both browsers and web services.

Beyond that, there's not a lot. The key differences between HTML and XHTML are that the latter is easier to parse and is interoperable with data formats.

Bonusbana

8:23 pm on Mar 16, 2004 (gmt 0)

10+ Year Member



Well, let me first say that I am far from an expert in the area; actually, I am something of a novice. The main reason I got into xhtml is that I realised how strong CSS design was. I love the idea of separating content from design, and I will never use tables again except when I really need to.

I would say that in your situation there is really no good reason why you should go further than xhtml 1.0 strict if you are not using xml for structure. Actually I find xhtml transitional to be a good in-between language if you want to be fully compliant and still use the good old features.

david

encyclo

8:35 pm on Mar 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The benefits of XHTML over HTML 4.01? That's a thorny issue, as there are the pro and anti-XHTML camps out there.

As for me... I'm anti-XHTML, for a couple of reasons - in no particular order:

1) The mime-type issue - XHTML should be served with the mime-type application/xhtml+xml rather than HTML's text/html. All well and good, but Internet Explorer does not support that mime-type. Even if it did, you get another issue - if you use application/xhtml+xml, you're sending XML - which means that if there is one tiny validation or well-formedness error, the site will not display. You would lose the flexibility that HTML offers, in that errors are not fatal. You would lose 99% of the web if every page had to validate - and when you're managing a large, dynamically-driven site, you'd have to be insane to go down the application/xhtml+xml route.

2) The fallacy of the compatibility of XHTML and HTML - constructs such as trailing slashes (like <br />, etc.) have a different meaning in HTML, and the irony is that you are depending on the browser's error handling to parse them. Similarly with the XHTML doctypes and xmlns attributes. Because you are forced to use text/html, the browser must parse it as HTML not XHTML - so the XHTML-specific code is ignored in the same way any other invalid HTML attribute would be.

3) The myth of forward compatibility: assuming that there comes a time when all browsers can read application/xhtml+xml, if you try to simply change the mime type of your existing pages, you are very likely to have problems - CSS and javascript are both handled differently, so your layout may well be affected. What's more, as the vast majority of XHTML documents are invalid, they couldn't be served with the correct mime type anyway.
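
To make the "errors are fatal" part of point 1 concrete, here is a minimal Python sketch. It is an illustration only: it uses the standard library's XML parser as a stand-in for what a browser does with application/xhtml+xml, and the helper name is_well_formed and both sample strings are invented for the example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

# XHTML-style markup: every element is closed, so an XML parser accepts it.
good = ('<html xmlns="http://www.w3.org/1999/xhtml">'
        '<body><p>Hello<br /></p></body></html>')

# The same page with an HTML-style unclosed <br>: fine for a tag-soup HTML
# parser, but a fatal error for an XML parser.
bad = ('<html xmlns="http://www.w3.org/1999/xhtml">'
       '<body><p>Hello<br></p></body></html>')

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False
```

A check along these lines could run over generated pages before they are ever served as XML, which is one way to catch the single stray tag that would otherwise take a page down.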

That's a lot of waffle, but the basic fact is that XML is a very useful server-side tool, but is far too inflexible in its error-handling to be used properly on the client-side. One of the basic principles of the web is its ease of use and fault-tolerant environment. XHTML fails in this regard.

All this is better explained here:

[hixie.ch...]

asquithea

9:17 pm on Mar 16, 2004 (gmt 0)

10+ Year Member



Just to follow up on a few of the above points:

I see no reason why XHTML support won't be in the next major release of all commonly used browsers. As far as I know, they're all capable of reading XML to some degree -- even MS IE. The length of time you'll be supporting older browsers, of course, is down to your target audience. Personally, I haven't seen a browser older than IE 5.5 at my site lately.

XHTML mime type support can be sniffed without needing to track individual browser versions, so you can at least serve the mime-type correctly to browsers that support it.
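
As a rough sketch of what that sniffing might look like (an assumption about one way to do it, not a description of any particular server setup): browsers that handle XHTML list application/xhtml+xml in the HTTP Accept request header, so the server can key off that. The function name choose_content_type is hypothetical.

```python
def choose_content_type(accept_header: str) -> str:
    """Pick the response MIME type from the request's Accept header.

    Returns application/xhtml+xml only when the client lists it explicitly
    with a non-zero quality value; otherwise falls back to text/html.
    """
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != "application/xhtml+xml":
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q value as "not accepted"
        if q > 0:
            return "application/xhtml+xml"
    return "text/html"

# A header that lists the XHTML type explicitly.
print(choose_content_type(
    "text/xml,application/xhtml+xml,text/html;q=0.9,*/*;q=0.5"))
# A header that only offers a wildcard.
print(choose_content_type("image/gif, image/jpeg, */*"))
```

The sketch deliberately ignores the */* wildcard, since a client can send it without understanding XHTML. If the response varies this way, it should also carry a "Vary: Accept" header so caches don't hand the XML version to a client that can't read it.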

If the behaviour of your CSS or Javascript changes when you turn on strict rendering, then it's broken already. By turning on strict rendering, you greatly improve your chances of cross-browser consistency, which makes your life easier in the medium and long run.

Strict validation of web-pages in all cases may actually be a desirable goal. In the programming context, a compiled language provides greater reliability than an interpreted language. Indeed, language evolution has shown that the stricter the compiler, the more productive a programmer can be. I believe that the same is true on the web. Again, on my 10000+ page site, error hunting would have been practically impossible without strict validation.

It's certainly true to say that depending on a legacy browser's interpretive ability to parse the malformed HTML (actually XHTML) is not ideal, but to go on to argue that strict validation is a bad idea for the future seems a little perverse :-P

--

All that stuff given, I don't think that there's a large benefit to be had by the switch unless you're responsible for a data-driven site (but I said all that in my first post anyway).

I'm not really pro or anti. More, "right tool for the job". (I couldn't raise the advocacy site)

mattur

2:20 pm on Mar 17, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The main benefit is that well-formed xhtml can be processed with an xml parser. Good html can also be easily parsed, via other methods (e.g. Perl's HTML::TreeBuilder) or by making an intermediate step to convert good html -> xhtml before using an xml parser.
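
For illustration, this is roughly what "processed with an xml parser" means in practice. It's a small Python sketch using only the standard library; the sample page and the function name extract_links are made up for the example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

def extract_links(xhtml: str) -> list[tuple[str, str]]:
    """Return (href, link text) pairs for every <a href="..."> in the page."""
    root = ET.fromstring(xhtml)  # raises ParseError if the page is not well-formed
    links = []
    for a in root.iter(f"{{{XHTML_NS}}}a"):
        href = a.get("href")
        if href is not None:  # skip anchors that are only on-page targets
            links.append((href, "".join(a.itertext()).strip()))
    return links

# Hypothetical sample page.
page = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body>
    <p><a href="/one">First <em>link</em></a></p>
    <p><a id="top">No href here</a> and <a href="/two">Second</a></p>
  </body>
</html>"""

print(extract_links(page))  # [('/one', 'First link'), ('/two', 'Second')]
```

One caveat: a plain XML parser like this one doesn't know HTML's named entities such as &nbsp; unless they are declared, so real pages may need some extra handling.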

For database-driven sites this is irrelevant - there's no need to use a presentation level, semantically-poor format as the data interface.

The other benefit is xhtml is currently perceived by many web designers as cool and trendy.

Some other benefits are often cited including accessibility, "future compatibility" and mobile device support, but these points do not stand up to scrutiny.

The drawbacks are: marginally longer pages, some legacy workarounds may be needed, dependence on error handling to render in IE, and brittleness on xhtml-accepting browsers (if served with correct mime type).

Disclaimer: I've yet to need to use xhtml, and consequently have a biased viewpoint about its utility (or lack thereof)... ;)

Purple Martin

10:43 pm on Mar 17, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"As far as I know, they're all capable of reading XML to some degree -- even MS IE." (asquithea)

XML is supported by MS IE 5.0+ and by Mozilla 1.0+ and by Netscape 6.0+.

It is not supported by Opera or other browsers.

(I once wrote some JavaScript that reads and parses an XML file for use in dynamically building page content.)