XHTML 2 vs HTML 5: let'em clash!
What do webmasters have to say about the future of the Web?
Herenvardo
msg:3730084
 11:49 pm on Aug 24, 2008 (gmt 0)

Straight into the topic: within the W3C there is work in progress on two separate successors to HTML 4.x / XHTML 1.x:
XHTML 2.0 [w3.org] bids for a revolutionary, entirely breaking approach, redefining the language from scratch as an XML application.
HTML 5 [whatwg.org], on the other hand, takes an incremental approach, aiming for backwards compatibility and homogeneous error handling.

If you ask Google [google.com], it spits out roughly 80k answers on the issue in less than half a second (fortunately, you can check only those that appeal to you, and you can take as long as you want to review them ;) )

Whatever happens there, it will surely affect us, the webmasters. And what better place to discuss the topic than at Webmaster World?

Basic guidelines: this is mostly common sense, but I think some points should be stated explicitly.

This is a debate, not some sort of exam: feel free to say what you think and, especially, why you think so. There is no goal of achieving any "consensus" or common ground; the only goal is to gather as many opinions as possible, so we all end up with a bigger picture of the whole topic.

This is a debate, not a battlefield: this topic is already polarizing in several sectors, and a debate like this might start heating up as soon as there are a few members on each side. I think it's worth recalling point 14 of the ToS [webmasterworld.com]:
14. Please keep your language clean and decent. This include personal inflammatory language as well as obscenities.

As a general suggestion: if the discussion is too heated, preview your posts and read them; then decide whether to hit "Submit" or go take a cold shower and rewrite the post afterwards.

This is not a competition: it's unlikely that we will change anything from this thread, so don't get obsessed. There are no winners and losers in this debate: if we end up with a deeper understanding of the goings-on of the process, then we all win; if we end up with everybody upset or angry, or we manage to get this thread locked due to uncivilized behaviour, then we all lose.

In summary, try to exercise your freedom of opinion and speech, and do your best to respect those freedoms in others.

Ready? GO!

 

Herenvardo
msg:3779961
 3:01 pm on Nov 4, 2008 (gmt 0)

The namespace clash? I must admit I didn't understand your point

Then let me give a simple example. Assume a perfectly valid, 100% compliant XHTML 1.0 Transitional document including something like this:
<p>This is a valid paragraph with <b>some bold text</b>.</p>
According to the current spec:
When a document is transmitted with an XML MIME type, such as application/xhtml+xml, then it is processed by an XML processor by Web browsers, and treated as an "XHTML5" document.

and
To ease migration from HTML to XHTML, UAs conforming to this specification will place elements in HTML in the [w3.org...] namespace

So an X/HTML5-compliant browser must treat it as an XHTML5 document. Which means that HTML5's elements will override XHTML1's in the [w3.org...] namespace used by our example XHTML1 document. Even if non-HTML5 elements (such as <b>) are left untouched, the content model for other elements is changed. This means that the new <p> element defined in that namespace no longer allows <b> elements (because HTML5's <p> doesn't allow <b>). So the perfectly valid old page miserably breaks with an XML validation error in new, HTML5-compliant browsers.
IMO, requiring compliant UAs to break older, entirely valid documents is a good example of backwards incompatibility. And it's not just a "theological", theoretical, or academic example: it's a quite real-world example describing how the spec requires new UAs to break thousands, or even millions, of well-formed pages (while, at the same time, requiring them to perfectly process markup aberrations that were never conformant, which is quite an inconsistency, but that's a separate issue).
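To make the scenario concrete, here is a minimal sketch of the kind of document being discussed (the title is made up; the point is only the doctype, the namespace, and the MIME type it would be served with):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!-- Served as: Content-Type: application/xhtml+xml -->
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>A valid XHTML 1.0 Transitional page</title></head>
  <body>
    <!-- Perfectly valid XHTML 1.0 Transitional; the argument above is
         that an X/HTML5-compliant UA would reinterpret this namespace
         as XHTML5's and apply HTML5's content models to it -->
    <p>This is a valid paragraph with <b>some bold text</b>.</p>
  </body>
</html>
```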

I think well-formedness is an issue with most pages on the web today - especially, but not exclusively, with the HTML ones

Is it? (Non-X)HTML pages are not affected at all by this, at least from the browser's perspective: the browser will figure out what that mess of tags is supposed to mean, and do its best to render it.
OTOH, if all browsers said something like "this is not a webpage, but a crappy nonsense of markup" and refused to render them, I'd be quite happy with that. I already mentioned my POV on lenient error handling, and the use cases I think it's appropriate for (embedding user-provided content that can't be verified, as on blog or forum sites). Authors being too lazy to check what the hell they are doing is not such a use case: a site from such an author probably doesn't deserve the effort UAs will put into rendering it, nor the effort users will put into finding something useful or relevant among the mess of badly sized spacer GIFs, table cells without rows or even tables, and stuff like that.
Furthermore, if the document includes a doctype or XML declaration, not rendering it is the best thing a UA could do for the user: such a page claims to care about standards and interoperability, but it actually doesn't; if a page is so blatantly lying, then it shouldn't be trusted at all.
XHTML2 doesn't impose any requirement on existing content. On the contrary, the spec makes sure that only content explicitly marked as XHTML2 will be treated as such. If XHTML2 were adopted, legacy content would be treated as it was before XHTML2 came onto the scene, as would any new content using older doctypes (or no doctype at all).

Again, if well-formedness is an issue for a given document, the best thing the author can do is to not include any kind of doctype or otherwise claim to adhere to a specification the document doesn't really comply with.

The problem with the fail on error cost in XHTML is it falls on the wrong person - the visitor, not the author. The author can be completely unaware of the error for some time - that's the problem.

Once again: IMO, authors who don't even bother to check their pages don't deserve to have their pages viewable at all. And still, as long as they don't claim some fake standards compliance, their pages are viewable anyway.

Well, whether you like them or not, these are my opinions. I'll respect yours, but I'm not too likely to change my mind.

EDIT: woops, forgot to mention that part:
Adding an XForm to a page that has a simple search form in the page header at the top. This is a quite common requirement. <input> is not namespaced in XHTML1 or XForms(?)

So, assuming that the page is mostly XHTML1, you should add something like this to your <html> tag:
xmlns:xforms="http://www.w3.org/2002/xforms"
and then use <xforms:input> and the like to refer to XForms-specific elements. Of course, you can make the prefix even shorter, such as "xf", and save some typing. I'm not really sure how you should declare the DTD, but I'll try to find out and post an example :P.
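For what it's worth, a hedged sketch of how the declaration and the prefixed elements might fit together (the <xf:model> contents and the submission name are invented for illustration, and this is not tested against any XForms processor):

```xml
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xf="http://www.w3.org/2002/xforms">
  <head>
    <title>Page with a search form in the header</title>
    <!-- an <xf:model> element, defining the data instance and the
         "search" submission, would normally go here -->
  </head>
  <body>
    <div id="header">
      <!-- XForms controls live in their own namespace, so they don't
           clash with XHTML1's un-namespaced <input> -->
      <xf:input ref="query">
        <xf:label>Search:</xf:label>
      </xf:input>
      <xf:submit submission="search">
        <xf:label>Go</xf:label>
      </xf:submit>
    </div>
  </body>
</html>
```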

[edited by: Herenvardo at 3:12 pm (utc) on Nov. 4, 2008]

[edited by: eelixduppy at 3:13 pm (utc) on Nov. 5, 2008]
[edit reason] disabled smileys [/edit]

mattur
msg:3780057
 5:20 pm on Nov 4, 2008 (gmt 0)

because HTML5's <p> doesn't allow <b>

I'm not sure how you've arrived at this conclusion. I still don't fully understand - you're saying XHTML5 makes <p><b>...</b></p> non-conformant?!

If I've understood what you're saying about XHTML2 in XHTML1, it is theoretically possible to include XHTML2 functionality in existing well-formed XHTML1 pages delivered as application/xhtml+xml without having to completely re-write them(?)

So that would make upgrading to XHTML2 much easier than I originally stated for sites meeting these conditions, but I don't think this really qualifies as backward compatibility.

But all this is just academic. XHTML2 is a dead parrot, or at least is doing a very good impression of one. Most sites use HTML, not XHTML, and of those that do use XHTML, most are not delivered as application/xhtml+xml and would not parse as XML anyway. Resilience is more useful than brittleness.

HTML5 is not as backwards-compatible as it might seem at first glance - a valid HTML4 page should be conformant HTML5, but some presentational attributes have been removed. But we won't have to re-write markup or change mime-types to add new HTML5 features to existing pages.

How effective the HTML5 effort is at capturing browser innovations and standardising them across different browsers remains to be seen.

a site from such an author [using non-conformant markup] probably doesn't deserve the effort UAs will put in rendering it [...] if all browsers said something like "this is not a webpage, but a crappy non-sense of markup" and refused to render them, I'd be quite happy with that.

Well that's certainly an opinion! Of course it would mean you wouldn't actually be able to participate in this discussion, use Google, Slashdot, BBC...

IMO, authors who don't even bother to check their pages don't deserve their pages to be viewable at all.

Thought Experiment [diveintomark.org]

Herenvardo
msg:3780775
 6:41 pm on Nov 5, 2008 (gmt 0)

I'm not sure how you've arrived at this conclusion. I still don't fully understand - you're saying XHTML5 makes <p><b>...</b></p> non-conformant?!

Ouch! My mistake here: I've just checked the spec and the <b> element has got in (it wasn't there the last time I checked), which makes the previous example completely invalid :(.
Anyway, just replace <b> with <big> in the example and it becomes valid again (until/unless HTML5 ends up including <big> as well).
Keeping in mind that
This document is changing on a daily if not hourly basis in response to comments and as a general part of its development process.

no examples of this kind can be considered "stable".
The whole idea of the example is that an XHTML1 document including any of the Transitional elements that are not included in (X)HTML5 is not strictly conformant. If the page is served as application/xhtml+xml, which isn't an issue as long as the document is XHTML1-conformant, then it will miserably break when parsed as XHTML5; and X/HTML5-compliant UAs must parse such a document as XHTML5, because the spec overrides the XHTML1 namespace to be XHTML5's instead.
The spec goes even further, "attempting" to break every existing XML document (I know this is not intentional, but rather wrong wording; still):
When a document is transmitted with an XML MIME type, such as application/xhtml+xml, then it is processed by an XML processor by Web browsers, and treated as an "XHTML5" document.

This literally means that if I serve some kind of XML (MathML, SVG, etc.) with a generic XML MIME type, HTML5 UAs should parse it as XHTML5, which would lead to the obvious disaster.
I'm afraid this isn't the best example of backwards compatibility; is it?
Maybe the spec aims to be backwards compatible, but that doesn't mean it really is. There is a lot of work required if that goal is to be achieved.
OTOH, I've just noticed the "Controversial" annotations on these parts of the spec. Maybe my feedback on these issues has even been considered, despite the absolute lack of replies? Or maybe browser vendors aren't happy with wording that requires them to miserably break on content they can currently render properly. Who knows. Whatever the reason, I hope it leads to these issues being properly fixed.

If I've understood what you're saying about XHTML2 in XHTML1, it is theoretically possible to include XHTML2 functionality in existing well-formed XHTML1 pages delivered as application/xhtml+xml without having to completely re-write them(?)

That's exactly the whole point of the Modularization effort. Don't get me wrong, however: there is some work required. Namely, you'd need to define a specific DTD for the "combo" of dialects you plan to use, which should be just a stream of "includes" for each module you are using. There is an appendix in the XHTML2 draft dealing with this at [w3.org ]; but the actual DTDs aren't there yet, and they probably won't be added until the spec enters the Last Call stage (it would be really messy to keep updating them every time something in the spec changes). This approach has already been used sporadically by some sites to serve XHTML1+MathML and XHTML1+XForms, and picking an arbitrary XHTML2 module would work exactly the same way (once the DTD module definitions are published, of course).
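The XHTML1+MathML combo mentioned above already works exactly this way: the W3C publishes a combined DTD, and a document opts into the "combo" with a single doctype declaration:

```xml
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN"
    "http://www.w3.org/Math/DTD/mathml2/xhtml-math11-f.dtd">
```

A future XHTML2 "combo" DTD would presumably be used the same way, once the module DTDs are published.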

But all this is just academic. XHTML2 is a dead parrot, or at least is doing a very good impression of one.

Well, that's a wrong impression, although quite a justified one, because the fact that browser vendors do not plan to implement XHTML2 overshadows the fact that there is not that much for browser vendors to implement. Actually, the things required from browser vendors are:

  • XML awareness: most browsers (actually, all major desktop browsers, including Trident (IE's), Mozilla (FF's), WebKit (Safari's and Chrome's), and Opera's rendering engines) already have it.
  • Default rendering for each element, as defined by [w3.org ]. Even if browsers refuse to implement this (which would be trivial anyway), documents can work around it by adding an <?xml-stylesheet ...?> processing instruction to apply that styling, which would do the job in any CSS2-compliant browser (and, with IE8 entering the scene, this already includes all major players).
  • Specific functionalities such as XForms, XML Events, or Ruby: these are quite tougher, and are the only point where real work by browser vendors is actually involved. OTOH, it's possible (although probably not easy) to trick those as well, via XSLT, transforming some stuff:

  • XForms stuff would be transformed into "classic" forms implementing the layout, plus some JavaScript implementing the validation and data-model parts, as well as forcing the submission to be XML (XForms are supposed to be submitted as XML documents).
  • All the XML Events stuff would be handled by JavaScript, which is the way events are currently handled.
  • Other stuff, such as Ruby, could be laid out using some "ugly" markup (such as tables for layout).

So, if you can deal with the "well-formedness" requirement, you can manage to render an XHTML2 document perfectly in current browsers (IE users would need IE8 beta for decent CSS2 support). OTOH, the last time I visited a page using the HTML5ish <canvas> (it was just last week, a JavaScript-based 8080 emulator to be more precise), my recently updated FF 3.0.3 displayed a comment like "you need to use [unstable] nightly builds of Firefox or WebKit that support the <canvas> element to view this page." So I'd rather say that, from the authoring-usability PoV, XHTML2 is quite a bit more alive than HTML5 ;).
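As a sketch of the second bullet's workaround: a document can carry its own default rendering via a processing instruction, so no built-in UA styling is needed (the stylesheet name and sample elements here are illustrative, and the namespace URI is the one used in the XHTML2 drafts of the period):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="xhtml2-default.css"?>
<html xmlns="http://www.w3.org/2002/06/xhtml2/">
  <body>
    <!-- The browser knows nothing about these elements; the linked
         CSS supplies the default rendering the spec would define -->
    <section>
      <h>A heading, XHTML2-style</h>
      <p>Some content.</p>
    </section>
  </body>
</html>
```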

Resilience is more useful than brittleness.

I guess you are assuming the well-formedness requirement automatically leads to brittleness. If that's the case, I can only say that such an assumption would be wrong: although it applies in many cases (probably the vast majority), that's simply a misuse of XML. I serve HTML4 Transitional pages as "text/html"; but when I'm testing them I feed my server scripts a testing=1 argument in the URL that makes them go out as XHTML 1.1 (unlike 1.0, 1.1 has only the "strict" flavor), served as "application/xhtml+xml"; and this allows me to catch most issues before my visitors have a chance to notice them. I top this up by running both versions (the X and soup ones) and the CSS through the W3C validators, running a few automated checks that tell me if I'm using some CSS property or HTML attribute not supported by any major browser, and finally testing the pages in my five browsers for "visual confirmation".
So, although misused XML leads to brittleness, I'm using it to add an extra layer of resilience :P
Anyway, I definitely agree with the actual fact that resilience is more useful than brittleness ;).

How effective the HTML5 effort is at capturing browser innovations and standardising them across different browsers remains to be seen.

I really think HTML5 will be quite good at that. What I'm wondering is how good it will be at solving web authors' needs; and I don't have much hope on this front anymore :(.

Well that's certainly an opinion! Of course it would mean you wouldn't actually be able to participate in this discussion, use Google, Slashdot, BBC...

You are taking my comments completely out of context. And, if you are really doing so inadvertently, then either my communication skills or your understanding (most probably the former) need a "major version upgrade".
So let me try to "upgrade" my communication skills and clarify the points:
One of the points was that if an author doesn't even care to test her/his pages, then such pages don't deserve to be seen. If browsers were able to detect such blatant laziness and carelessness and prevent users from stumbling upon it, I'd be really happy.
The other point was that, for sites that need to include external, unverifiable content, I think tag-soup mode is appropriate and definitely better than draconian strictness. IMHO, the ideal approach is combining XHTML, quine-ish XSLT, and CDATA blocks (containing the "dangerous" content), so draconian error handling applies only to your own content (which is really useful for testing), and the content is then XSL-transformed into "tag-soup mode" for actual rendering; but I'm OK with plainly going for lenient HTML as a simplification of this, as long as you do some basic testing on your content templates.
Most of the examples you mentioned, plus the "Thought Experiment" you linked, fall into the category of "need to include external unverifiable content". As for these boards' highly presentational markup: I bet that if browsers didn't render it, it would be fixed by the people running this place, probably before we even noticed the issue.

To put some order into the topic, this is a summary about my opinion of XML on the Web, and of the XHTML2 vs X/HTML5 clash:

  1. Creating strict markup specifications (XHTML and, to some degree, HTML's Strict doctype), was a good idea to help the Web evolve towards well-formedness, standard-compliance, and, most important, interoperability.
  2. To keep lenient specifications (HTML4, and the Transitional doctypes) was a good idea to ensure this evolution could happen gradually and smoothly.
  3. Dropping error tolerance with XHTML2 was, IMHO, based on the assumption that the web is entirely ready for strict validity requirements, which is a wrong assumption: only a small fraction of the web is ready for that; so the choice was simply bad (in other words, XHTML2 should have provided a "tag-soup" equivalent, and maybe even "Transitional" doctypes).
  4. Deafly ignoring the feedback about the above issue for years was definitely a fatal mistake. The initial work on "WebForms 2" and "WebControls 1.0" (the drafts that finally led to HTML5) was a quite reasonable response.
  5. The close-minded, browser-centric, XHTML2-phobic approach of the WHATWG (the folks behind HTML5) is, at best, irrational.
  6. The everything-in-one format of the HTML5 spec, tossing together markup, DOM, error recovery, and some other aspects, is not useful: separate issues should be solved through separate specs, so each one can focus on its own problem. Obviously there needs to be some coordination between them, and I won't deny that; but they are tied together to a degree that becomes counter-productive.
  7. The fact that XHTML2 sucks in some respects doesn't alter the fact that HTML5 sucks in so many others. It doesn't help that very few people on either side admit these facts.
  8. I'm strongly convinced that, if the XHTML WG and the WHATWG agreed to work together, we'd end up with great specs that would help the Web evolve towards its full potential (which is probably infinite). I've even thought of locking them all in a room until they decide to speak to each other... and tossing in some hungry cats if they take too long to settle; but I couldn't do that to the poor kittens :(.

Well, I guess that's been quite enough for a single post :P

mattur
msg:3781240
 1:59 pm on Nov 6, 2008 (gmt 0)

The whole idea of the example is that an XHTML1 document including any of the Transitional elements that are not included in (X)HTML5 is not strictly conformant. If the page is served as application/xhtml+xml, which isn't an issue as long as the XHTML1 document is XHTML1-conformant, then it will miserably break when parsed as XHTML5;

Using legacy elements that are not in HTML5 will not break the page because browsers do not use validating parsers. Only well-formedness errors cause XHTML pages to break miserably when delivered as XML.
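The distinction is worth spelling out with two hypothetical fragments: a validity error sails through a browser's non-validating XML parser, while a well-formedness error is fatal by definition:

```xml
<!-- Validity error only: suppose <big> is absent from (X)HTML5's
     vocabulary; this XML is still well-formed, so a non-validating
     parser renders it without complaint -->
<p>Some <big>large</big> text.</p>

<!-- Well-formedness error: the <big> element is never closed; any
     conforming XML processor must stop with a fatal error here -->
<p>Some <big>large text.</p>
```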

Well, that's a wrong impression

The war over the next version of HTML markup is over. Rightly or wrongly, the next version of (X)HTML will be (X)HTML5.

the last time I visited a page using the HTML5ish <canvas>... my recently updated FF 3.0.3 displayed a comment like "you need to use [unstable] nightly builds of Firefox or Webkit that support the <canvas> element to view this page." So, I'd rather say that, from the authoring usability PoV, XHTML2 is quite more alive than HTML5

Firefox currently offers partial support for <canvas>. Of course, we can't expect new features to work in browsers until they are implemented(!) and this applies as much to new XHTML2 features as it does to new HTML5 features.

Browsers are slowly implementing HTML5. They're not implementing XHTML2. How does this make XHTML2 "more alive"?

So, if you can deal with the "well-formedness" requirement, you can manage to perfectly render an XHTML2 document on current browsers (IE users would need IE8beta, for decent CSS2 support)

That's a strange definition of current browsers ;)

Herenvardo
msg:3781302
 3:19 pm on Nov 6, 2008 (gmt 0)

Using legacy elements that are not in HTML5 will not break the page because browsers do not use validating parsers. Only well-formedness errors cause XHTML pages to break miserably when delivered as XML.

Oops. I just checked, and I must admit that you are right (at least in FF). So I guess browsers not fully complying with the XML specs is a good thing.
However, a spec relying on browsers' non-compliance in order to work is rather a bad idea.

The war over the next version of HTML markup is over. Rightly or wrongly, the next version of (X)HTML will be (X)HTML5.

Nope, it isn't. Even better: with browser vendors so determined to implement HTML5, the final choice will fall upon us, webmasters, since we don't need them to implement XHTML2 in order to use it.
So, IMHO, the war is more heated than ever.

Firefox currently offers partial support for <canvas>. Of course, we can't expect new features to work in browsers until they are implemented(!) and this applies as much to new XHTML2 features as it does to new HTML5 features.

No, it doesn't apply to XHTML2 as much as it does to HTML5. Surely, some collaboration on the browsers' side would make things easier; but these things are already possible. An XHTML2 document (that doesn't use stylesheets) could be rendered perfectly with just a bit of XSLT trickery in IE6, FF1, WebKit (not sure from which version; I'm not too familiar with Safari or Konqueror), and Opera 7.

Browsers are slowly implementing HTML5. They're not implementing XHTML2. How does this make XHTML2 "more alive"?

Quite simply: browsers are slowly implementing HTML5. They already implemented XHTML2 years ago, without even knowing it, and before XHTML2 even existed, when they implemented XML and XSLT.

So, if you can deal with the "well-formedness" requirement, you can manage to perfectly render an XHTML2 document on current browsers (IE users would need IE8beta, for decent CSS2 support)

That's a strange definition of current browsers ;)


Definitely, I worded that horribly. As mentioned above, you can make it work even in IE6 (IE5.5 was so "eager" to implement XSLT that it shipped before XSLT became a standard, and there are some incompatibilities between that version and the final spec... it's probably possible to make the sheet work in that IE as well, but I don't think the benefit, if any, is worth the headache). IE8 would only be needed if you want to use CSS2 stylesheets. You'd need IE8 anyway to be able to use most of CSS2 with current doctypes, such as HTML4 or XHTML1. If you can stay with just CSS1, then you don't need IE8 at all; IE6 is quite enough.
So, do you like "IE6+, FF1+, WK, and O7+" better as a definition of "current browsers"? Of course, you need a browser that supports CSS2 if you want to use CSS2 itself with the XHTML2 pages, but XSLT support is enough to render the XHTML2 documents themselves.
Comparing this with the HTML5 spec: there is no sane way in which <canvas> could be made to work in current/older browsers, for example (maybe you could fake most of it via JavaScript, but making such a thing work cross-browser is definitely not "sane"). And some stuff, like seeking within <audio> and <video>, is plainly impossible. You need to wait for browser support to use HTML5, but you don't need to wait to use XHTML2 (and, since browsers aren't planning to support it anyway, it wouldn't make much sense to wait, would it?).

Bert36
msg:3781349
 4:15 pm on Nov 6, 2008 (gmt 0)

LOL Herenvardo, I just realized you are right. There is very little in the XHTML2 spec (if anything at all; I can't think of anything) that doesn't work in today's browsers.
ROFL, *what* are we waiting for indeed?!

I feel like a little kid waiting for the teacher to tell him to go ahead.

C'mon people, download that spec and use it! LOL

mattur
msg:3781414
 6:03 pm on Nov 6, 2008 (gmt 0)

the final choice will fall upon us, webmasters; since we don't need [browsers] to implement XHTML2 to use it....

...as XHTML1.

They already implemented XHTML2 years ago, without even knowing it, and before XHTML2 even existed, when they implemented XML and XSLT.

You listed in your previous post what browser vendors would be required to do to support XHTML2.

It is possible to use XSLT to transform an XHTML2 document into an XHTML1 document. It would be a completely pointless waste of time, since you're not getting anything new that isn't already in XHTML1, but it is possible [w3future.com].

Support for XML and XSLT does not mean browsers magically support rendering SVG, or handling XForms, or rendering and interacting with XHTML2 pages. New functionality, even if XML-based, has to be implemented.

there is no sane way in which <canvas> could be made to work in current/older browsers. You need to wait for browser support to use HTML5, but you don't need to wait to use XHTML2

You're not comparing like with like. You're comparing the lack of support for entirely new functionality in HTML5, with the presence of support for existing (XHTML1) functionality in XHTML2 when it is transformed into XHTML1 i.e. with no new functionality.

By the same measure, we can use HTML5 now if we only use HTML4 (i.e. currently supported) features. No re-writing, no XSL, just change the doctype. This too, of course, would be completely pointless.

mattur
msg:3781450
 6:47 pm on Nov 6, 2008 (gmt 0)

There is very little in the XHTML2 spec (if any at all, I can't think of anything) that doesn't work in todays browsers. ROFL, *what* are we waiting for indeed?!

The *new* functionality. Anything new that isn't currently supported in HTML4/XHTML1 in current browsers won't work. Minor things like XForms...

[Repeatedly bangs head on desk...]

C'mon people, download that spec and use it!

Don't do this.

Bert36
msg:3781453
 6:50 pm on Nov 6, 2008 (gmt 0)

Oh sure...XForms... Would be nice. But frankly I hardly ever use forms on my sites. So you suggest I wait for something I hardly use?

mattur
msg:3781460
 7:10 pm on Nov 6, 2008 (gmt 0)

Hi Bert,

You can either publish XHTML2 pages as XML with CSS, and use XBL and proprietary browser extensions to make anything interactive work (including links).

Or you can publish your pages as XHTML2 with an XSL transform to XHTML1, which of course means you're just delivering XHTML1 in a pointlessly complicated way.

Both methods offer a range of drawbacks and I can't see any benefits. What's stopping you? ;)

Bert36
msg:3781489
 7:58 pm on Nov 6, 2008 (gmt 0)

I agree that there are drawbacks, and I can't see any benefits either. But that was not why I suggested (jokingly) that we should start using it.
The point is, we can use it. (How, why, and how browsers deal with it is arbitrary from a designer's point of view; developers may see it differently.)

Herenvardo
msg:3781543
 9:15 pm on Nov 6, 2008 (gmt 0)

Support for XML and XSLT does not mean browsers magically support rendering SVG, or handling XForms, or rendering and interacting with XHTML2 pages. New functionality, even if XML-based, has to be implemented.

Yes, of course. New functionality has to be implemented. But this doesn't mean that it has to be implemented by the browser.
XForms and XML Events, even if not trivially, can be implemented by authors using XSLT. And this does provide new functionality, such as defining what a form looks like and how it is sent to the server separately, or being able to use href anywhere you want.
As for SVG, I'll just say that it's not really part of XHTML; although there are plugins to render it for most browsers, and some are even beginning to support it natively.
Also, keep in mind that the XSLT would only need to be written once; after that, you'd save the pain of all the legacy markup ugliness on every new page you write. Depending on how many pages someone plans to write, staying with XHTML1 or HTML4 may become the pointlessly complicated way.
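As a sketch of the "href anywhere" point: a small XSLT stylesheet could rewrite any element carrying an href into legacy markup with a nested classic link (illustrative only; a real sheet would also rename the XHTML2 elements into HTML ones):

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Any element with an href gets its content wrapped in a
       classic <a> that legacy browsers understand -->
  <xsl:template match="*[@href]">
    <xsl:copy>
      <a href="{@href}"><xsl:apply-templates/></a>
    </xsl:copy>
  </xsl:template>
  <!-- Identity rule: copy everything else through unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```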
The point is, we can use it.

That is the point. Or, more specifically, the point is that authors do have a choice. And, BTW, quite a broader one than you might expect: the same way XHTML2 can be "implemented" via XSLT to render seamlessly in browsers with no knowledge at all of XHTML2 itself, the same can be done for any arbitrary XML dialect. For example, if I were going to implement such a stylesheet for my own use, I'd probably get rid of <h1>...<h6>, <a>, and other elements that don't make any sense in XHTML2; and maybe add my own: after all, I don't need browsers to understand it, because I'll be giving them a "dictionary" to translate it into something they already understand.
IMHO, this is exactly what the "extensible" in "Extensible Markup Language" has always stood for.
I said some days ago that we wouldn't have a choice about what to use because browsers aren't going to implement XHTML2. Definitely, I was wrong: since browsers implicitly implement stuff without even realizing it, we have a really open choice :D
Oh, and realizing all this, I'm getting quite excited about HTML5! As soon as the <canvas> element becomes widespread, we'll be able to "implement" SVG, and even X3D, on our own, without having to rely on plugins or on browsers supporting those formats. :)
I think I'm starting to love XML more than ever.

mattur
msg:3781596
 11:03 pm on Nov 6, 2008 (gmt 0)

XForms and XMLEvents, even if not trivially, can be implemented by authors using XSLT.

XSLT can't implement behaviour.

The point is, we can use [XHTML2].

How? Why?

Herenvardo
msg:3781628
 12:31 am on Nov 7, 2008 (gmt 0)

This is becoming an endless loop... almost everything I'm answering in my latest posts was already mentioned in earlier ones. Anyway:
XSLT can't implement behaviour.

Of course not by itself. XSLT can only implement transformations. But that's enough, because we can transform an XHTML2 document into an HTML4 one that uses classic <input> elements to implement XForms' layout and <script> elements to deal with XForms' data model and XMLEvents' behavior. So, while it is true that "XSLT can't implement behavior", it can transform an XHTML2 document into an HTML+JavaScript based implementation (or emulation, if you prefer) of XHTML2.
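To make the idea concrete, here is a rough, untested sketch of what one such transform rule might look like. The helper function name is invented, standing in for the JavaScript library that would emulate the XForms data model:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xforms="http://www.w3.org/2002/xforms">
  <!-- map an XForms input control to a classic HTML text field -->
  <xsl:template match="xforms:input">
    <input type="text" name="{@ref}"
           onchange="xformsUpdate('{@ref}', this.value)"/>
    <!-- xformsUpdate() is a hypothetical helper in the emulation script
         that would keep the XForms data model in sync -->
  </xsl:template>
</xsl:stylesheet>
```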
How?

See the comment above, and my previous posts.
Why?

The actual reasons don't really matter that much: the point is that, if someone has good enough reasons, then that someone has the option to use it. It's all about being able to choose; the actual reasons are up to each person who decides to use it. Some examples of such reasons could be a cleaner content model, richer semantics, better fallback for images, a saner representation for complex and/or multi-stage forms, etc. Of course, there is the substantial drawback of draconian error handling, as well as some potential issues with SEs; so it's all a matter of weighing the benefits against the drawbacks on a case-by-case basis, and choosing the approach that best solves a site's needs. Again, it is all a matter of choice. As long as we have a choice, the spec that wins this "war" will do so on its own merits. As I see it, there are many things that could happen, including (in no particular order):
  • XHTML2 wins: most of the web migrates to it, and at a certain point browsers will be pressed by the market to natively support it.
  • HTML5 wins: most of the web adopts it, and XHTML2 sinks into oblivion.
  • Both specs win: different sectors adopt one or the other based on their different needs. If both "sides" are significantly big, browsers will be pressed to support both specs. (IMHO, this is the most likely to happen.)
  • Both specs lose: authors don't see any real benefit in one or the other, and stick to HTML4/XHTML1, sporadically stepping into some feature of these new specs. This is not too different from what is already happening, with some XHTML1+XForms pages out there on one side, and some "experimental" and "demo" pages already tinkering with early implementations of <canvas> on the other.
  • Something else not included in this list (maybe an HTML6 or XHTML3 is defined before any of these two get adopted, for example). Who knows?
  • Some mix of all the above.

mattur




msg:3781707
 3:55 am on Nov 7, 2008 (gmt 0)

we can transform an XHTML2 document into a HTML4 one that uses classic <input> to implement XForms's layout and <script> elements to deal with XForms's data model and XMLEvents's behavior

So, after writing our XHTML2 content and the relevant transforms, then writing some Javascript libraries to translate XHTML2, XForms and XMLEvents into HTML4, we now have all the features of HTML4 at our disposal?

Couldn't we just extend HTML4 and call it something like, erm... HTML5?

Herenvardo




msg:3781962
 2:30 pm on Nov 7, 2008 (gmt 0)

So, after writing our XHTML2 content and the relevant transforms, then writing some Javascript libraries to translate XHTML2, XForms and XMLEvents into HTML4, we now have all the features of HTML4 at our disposal?

No. After doing that we have the features of XHTML2 at our disposal. To get the features of HTML4, it'd be much simpler to just use HTML4.
The idea is to use what browsers already implement (like the buggy flavor of HTML4 in common use, the inconsistent JavaScript, and CSS1 plus a subset of CSS2) to implement the new stuff of XHTML2 on top of it.

Following the PoV you are showing in that post (if I understood it correctly), C and C++ would be completely useless: after all, they don't give us anything we can't do by hacking machine code through a hex editor, do they? Furthermore, compiling a C or C++ source is exactly that: converting it to its machine-code equivalent. In that sense these languages don't provide anything new, yet they are far more widely used than hand-typed binary code.
In line with this (quite close) metaphor, I hope you can see that the idea I'm talking about is nothing other than an XSLT-based XHTML2 compiler.
Couldn't we just extend HTML4 and call it something like, erm... HTML5?

:o OMG! By all means, please, no. Don't extend any more! If we really need some change to the markup language itself, it's a clean-up, not more bloating. We need to clean up all the "extensions" browsers have been putting into HTML since the first browser wars began; we don't need browser makers consolidating such extensions into a standard.
I know, I know, we will never get rid of all the presentational aberrations 3.2 brought in: there are some pages out there with stuff like

<blink><marquee><p><b><b>Look at meeee!</p></blink></b>

that are no longer maintained, and will stay there for many, many years, without anyone taking care to make them more accessible or less eye-bleeding... and I'm OK with browsers, SEs, and other UAs trying to make sense of such stuff. But we don't want new content to be written like that. Really.
Maybe the HTML5 group was right in that the web needs evolution rather than revolution. But then we need actual evolution. And evolution should go forwards, not backwards or sideways.
Let me give an example: I really like the idea of defining a decent DOM that works cross-browser (well, the W3C has already done that a few times; the difference is that maybe now browsers will implement it, since they are the ones defining it). I also like the idea of consistent error-recovery mechanisms. And I like the <canvas> element: its addition is an example of evolution of the web.
But take a closer look: what's the point of redefining <i>, <b>, <small>, and all the other presentational stuff? Does the group really believe that defining new semantics for them and saying "using this only for presentation is non-conformant" will prevent authors from using them presentationally? Come on! <blink> and <marquee> were never conformant, and some authors still use them. And, in addition, the spec defines down to the deepest detail how non-conformant documents are to be handled. Saying "doing [name any arbitrary aberration here] is non-conformant" won't help at all: those of us who care about sane and accessible markup wouldn't do it anyway, and those who don't care will do it anyway; after all, the error-handling part removes conformance from their list of worries. Furthermore, it will even backfire: some authors who don't care about conformance but didn't know about these elements will now find out about them, and when testing them in browsers they will see they get the rendering they want...
Another example: what are <video> and <audio> for? What do they provide that the <object> element (with an appropriate @type) doesn't? Nonsense bloating.
And, of course, I have to award the big prize for the most pointless element in HTML's history to <mark>. What is marking something supposed to mean in a markup language? Can't anybody see that everything in an HTML document is marked up? That's the whole point of HTML. Maybe something like <highlight>, or <hl> for short (OK, bad idea, it looks identical to <h1> in some fonts), would have made sense; but <mark> as currently defined is less semantic than even <span>.

Yes, sure, there are some good points behind the HTML5 work. But they are messing things up so much that I simply can't believe they are taking it seriously enough.

pageoneresults




msg:3781966
 2:36 pm on Nov 7, 2008 (gmt 0)

there are some pages out there with stuff like <blink><p>Look at meeee!</p></blink>

Watch it! I use that <blink> element when appropriate. Can you believe Firefox still supports it? Yeah, I use it because I can, and mainly because it annoys the heck out of certain visitors. ;)

<blink>Click here!</blink>

That <blink> element rocks! :)

Herenvardo




msg:3782021
 3:38 pm on Nov 7, 2008 (gmt 0)

At least your code snippet is well-formed :P
And, actually, if you are using it with the deliberate intention of annoying users (even if just some subset of them), then you are using it appropriately.
The main issue is those sites that put <blink>s and <marquee>s around everything, over a background of a tiled animated GIF flashing between high-contrast colors, topped off with a <bgsound>; whose authors are convinced that this is exactly what makes their content-less, useless standalone HTML file the "best website out there"; and who would rather bleed to death through their eyes looking at it than admit their "site" causes eye-bleeding. And this is taken from real-world examples (with the only exception of the "bleed to death" part, which is a bit of an exaggeration of the real "watch until the headache becomes unbearable").

Do we want the whole WWW to become something like that?

Bert36




msg:3782061
 4:35 pm on Nov 7, 2008 (gmt 0)

All the eye-bleeding aside - my, what a mess ;-) - I think people have forgotten what the internet is actually for. That is not to say that the internet has a specific meaning in life, but I would argue it at least exists *despite* UAs (and not because of them).

When TV was introduced to the world it was -at first- seen as an educational tool. And when the world-wide network, currently known as "the internet", was introduced to the public, it was seen as a tool to store human knowledge and retrieve it at any time and any place. Some even thought it might bring equality amongst nations.

One would think by now we'd know it is human nature to abuse our own inventions. Unfortunately, it is also human nature not to admit that. The main problem we are discussing here, IMO, is the fact that people "specialise" and group together, each group defining its own goals and visions.
On the whatwg list I argued that browser vendors should not concern themselves with the naming and defining of markup elements; that this should in fact be the job of linguists, typographers, authors, and archivists. The only response so far offers but one argument: historic precedence. And that is then quickly associated with "backwards compatibility".

I will not pretend to have technical knowledge of browsers; in my time we programmed everything in assembly and did not have to deal with OOP soup (subroutines were good enough for us ;-) ).
But I do know when I see people getting swept along by the tides of enthusiasm. I am sure it was a good idea at the time to try and unite the browsers, but in their enthusiasm they gave themselves the right to "lay down the law", so to speak, on subjects that are only tangentially related to the business of programming browsers.

HTML has been derived from SGML, and that was here long before anybody had ever heard of a browser. We are talking about a markup language here, but it is being turned into a guideline on how to build a browser. The two subjects are related only because of their mutual dependence.

Bottom line is, the X/HTML5 spec will be nothing more than a set of "rules" that tell us only one thing: "this is what browsers can do, and even if they could do more, they won't". And that is, I think, sad. Because it means the internet is being chained. From now on, browsers will determine the technical and creative limitations of the net.

[edited by: Bert36 at 4:38 pm (utc) on Nov. 7, 2008]

mattur




msg:3782157
 6:54 pm on Nov 7, 2008 (gmt 0)

a XSLT-based XHTML2 compiler

Right, so now we know exactly what we would need to be able to use XHTML2.

Browsers don't natively support XHTML2, and we can't just send XML+CSS, because it doesn't give us behaviour. XBL could give us behaviour but it isn't supported in IE, so we can't currently use this as a solution.

We can transform XHTML2 into HTML4 in the browser using XSLT. We can write Javascript libraries to extend browser functionality.

So if it's possible to write a XHTML2/XForms/XMLEvents/CSS/Xetc to HTML4/Javascript/CSS compiler in XSLT, and someone writes one, then we could use XHTML2 in current browsers.

mattur




msg:3782158
 6:56 pm on Nov 7, 2008 (gmt 0)

Browsers will determine from now on, what the technical and creative limitations on the [web] are.

This has always been the case.

Bert36




msg:3782207
 8:00 pm on Nov 7, 2008 (gmt 0)


This has always been the case.

Yes, but at least there was the potential for it not being the case. That is no more.

Herenvardo




msg:3782247
 9:00 pm on Nov 7, 2008 (gmt 0)

Well, I won't get into the topic of what "the internet" is or what it's supposed to be, nor the topic of human nature: these are definitely worth discussing, but due to their extreme complexity they'd deserve their own threads, or even their very own forums.
Going deeper into specifics and into the main topic:
On the whatwg list I argued that browser vendors should not concern themselves with the naming and defining of markup elements; that this should in fact be the job of linguists, typographers, authors, and archivists. The only response so far offers but one argument: historic precedence. And that is then quickly associated with "backwards compatibility".

I must say that I have closely watched that topic in the lists, and I'd like to point out some details. But before that, I acknowledge that not everybody participating in this discussion is also participating in the lists; so, to help put this in context, here is the archived thread from the mailing lists: [lists.whatwg.org ] (use the "previous message" and "next message" links to navigate through the thread).
First, backwards compatibility is a legitimate concern: there are billions of documents already on the WWW, and browsers simply can't afford to stop rendering them.
Second, I'm afraid you aren't keeping up with the latest messages on the list. Philipp Serafin made a reply worth reading a few hours later, which I'll try to summarize:
Basically, it comes down to the fact that something like the "<reference class="abbreviation" ...>" originally suggested (as a rough example of a broader idea, if I understood your whole message) doesn't raise any compatibility issue, simply because if you put it into an HTML4 document and style it with CSS, current browsers will handle it without issues. There is actually no implementation required from browsers beyond the semantics; and browsers currently don't care at all about semantics, so that's irrelevant. He also made a comment on the suggestion's usefulness, which I'll quote rather than paraphrase:
And yes, I do believe <reference class="abbreviation" ...> would be
easier [than <abbr>], because then web authors wouldn't need to remember which
semantic content can be described by tags and which needs custom
classes.

Most of that message makes, IMHO, quite good points. The original thread received several replies during Nov 5th (I've counted up to 6 messages within less than 10 hours); but after Philipp's reply the activity suddenly stopped: not a single reply in more than 48 hours :o
Coincidence? I'd think so if it had just happened once or twice; but it's happening much more often. Since I joined the lists in August, this has happened every time someone made a good point for changing the spec that was aimed at something other than making it easier for browsers to implement.
This is, in the best case, frustrating.
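Philipp's compatibility claim is easy to check for oneself. A minimal sketch (the element and class names are from his example; everything else is illustrative) — Gecko, WebKit, and Opera style unknown elements like this out of the box, while older IE needs a small scripting workaround:

```html
<style type="text/css">
  /* unknown elements get no default styling, so supply everything */
  reference.abbreviation { border-bottom: 1px dotted; cursor: help; }
</style>
<p>The <reference class="abbreviation" title="World Wide Web">WWW</reference>
keeps growing.</p>
```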

HTML has been derived from SGML, and that was here long before anybody had ever heard of a browser.

This is, strictly speaking, true. But if HTML had been defined by the W3C under their current standardization process (which was revised after the CSS2 failure), it wouldn't have reached "Recommendation" status even now, mostly due to the "at least two independent, interoperable, publicly available implementations" requirement: the only publicly available implementation that actually treats HTML as an SGML application is the W3C's markup validator.
And it's not just a matter of implementations (read: browsers), but also of usage: if there were a browser that strictly applied SGML rules, at least 90% (and that is a very conservative estimate) of the pages currently on the WWW would break to some degree, with disastrous effects on XHTML pages served as "text/html". This means that authors do not treat HTML as an SGML application either (the empty-tag syntax from XML is interpreted in an incompatible way by SGML's "shorthand tag" rules).
In addition, although SGML is indeed older than browsers, HTML isn't. Actually, the first drafts of HTML were based on common usage and, especially, on NCSA Mosaic's implementation of the then-unstandardized format (according to Wikipedia [en.wikipedia.org]). So it was actually a mistake to define it as SGML, because the implementation on which it was based didn't treat it as such.
If an author tried to use SGML-specific features, such as shorthand tags (against which even the validator raises warnings), s/he would end up frustrated by the fact that this SGML stuff doesn't work anywhere.
After 15 years, it is about time to fix this original mistake: any further specification of HTML should not define it as SGML, because it is neither treated as such by implementations nor used as such by authors. On this point, the HTML5 group has got it right: the spec actually requires something like <br /> to be treated as <br> rather than as some obscure shorthand.
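For anyone wondering what that "obscure shorthand" would actually do: under HTML4's SGML declaration (SHORTTAG YES), the slash starts a null end tag, with results no mainstream browser honours:

```html
<br/>        <!-- strict SGML: <br> followed by a literal ">" character -->
<em/shout/   <!-- strict SGML null end tag: equivalent to <em>shout</em> -->
<br />       <!-- HTML5: simply <br>, as browsers have always treated it -->
```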

We are talking about a markup language here, but it is being turned into a guideline on how to build a browser.

100% agreed. This is one of the major flaws in the HTML5 spec. Sure, such a "guideline on how to build a browser" should be defined somewhere, but an HTML spec is simply the wrong place for it. Any kind of formal specification needs to address all the involved sectors equally and neutrally. In the HTML case, this means using wording that is neutral towards browsers, content authors, search engines, assistive tools, authoring tools, validators, and so on. For example, syntaxes should be defined in declarative terms, in contrast with the "parsing algorithm" approach currently taken: why on earth would an author or a WYSIWYG editor care about a 30-step parsing algorithm when dealing with floating-point numbers? Yet most syntaxes are only defined in terms of a parsing algorithm.

Browsers will determine from now on, what the technical and creative limitations on the [web] are.

This has always been the case.


Not really. Although browsers have played an undeniably important role, it's been more of a push and pull between content authors and browser vendors: especially during the first browser wars, each "best viewed in Netscape" logo forced IE to implement something, and each "best viewed in IE" forced Netscape to do the same. There are also other factors, like SEO affecting what authors do, and the market affecting what each browser implements.

So if it's possible to write a XHTML2/XForms/XMLEvents/CSS/Xetc to HTML4/Javascript/CSS compiler in XSLT, and someone writes one, then we could use XHTML2 in current browsers.

Well, that's far more than what you can get from HTML5: even if the most optimistic forecasts are fulfilled, by 2022 we will have a couple of browsers that can cope with HTML5 documents. Currently we have four major rendering engines (IE's Trident, Mozilla's Gecko, WebKit, and Opera's Presto); more might pop up during this timeframe; and normally we have to deal with people using older versions of browsers. Expecting HTML5 to be reliably usable for mainstream web authoring before 2030 or so would be, in the best case, day-dreaming. And, seeing how the WWW has evolved within 10 years, if we have to wait over two decades to solve today's needs, we are doomed. And that's exactly what the WHATWG is aiming for. So, facing such doom, stuff like XSLT-based compilers seems quite an attractive alternative to me :P

BTW, I just did some counting throughout the HTML5 spec:
It defines 103 elements and 240 attributes (34 of which are event hooks; another 8 are "common", i.e. defined for all elements; and among the rest, some are a single attribute defined multiple times, often with different meanings). That's, IMHO, too bloated.
It also defines over 350 explicit algorithms, plus many implicit ones embedded in the prose of the spec, split into more than 1,600 steps.
That document is not a markup language definition. It is a flawed browser written in pseudo-code, and that's useless to anyone other than flawed-browser vendors.

mattur




msg:3782364
 2:12 am on Nov 8, 2008 (gmt 0)

Sure, such a "guideline on how to build a browser" [i.e. parse HTML] should be defined somewhere

In the HTML *specifications* perhaps? Previous HTML specs have lots of holes and grey areas, so it's possible for two standard-compliant browsers to be incompatible with each other. This is why so much effort is going into defining how HTML should be parsed by browsers in the HTML5 spec.

Well, that's far more of what you can get from HTML5:

We haven't actually established that it is possible to write the mother-of-all-shims to render XHTML2 et al. in HTML4, or that it is possible to write an XHTML2/XForms/XMLEvents/CSS/Xetc to HTML4/Javascript/CSS compiler in XSLT. At least some of it is feasible, but it's a non-trivial task, isn't it? And, assuming for a moment that all this is possible, we haven't established that anyone is actually working on implementing it.

Plus we can also write shims to implement new HTML5 functionality [code.google.com] in older browsers, *temporarily*, too.
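The trick behind such shims is worth noting: once document.createElement() has been called for an unknown element name, older IE recognises it for parsing and styling. A rough sketch of the idea (the element list is illustrative; this must run in the <head>, before the body is parsed):

```html
<script type="text/javascript">
  // Teach older IE about the new HTML5 elements before the body is parsed
  var tags = ['section', 'article', 'nav', 'header', 'footer'];
  for (var i = 0; i < tags.length; i++) {
    document.createElement(tags[i]);
  }
</script>
<style type="text/css">
  /* the new elements have no default rendering in old browsers */
  section, article, nav, header, footer { display: block; }
</style>
```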

even if the most optimistic forecasts are fulfilled, by 2022 we will have a couple of browsers that can cope with HTML5 documents.

This is a common misconception. From the WHATWG FAQ: [wiki.whatwg.org]

You do not need to wait till HTML5 becomes a recommendation, because that can't happen until after [a minimum of 2] implementations are completely finished.

The 2022 date is Ian Hickson's guesstimate [blogs.techrepublic.com.com] for two fully-compliant HTML5 browsers, feature complete, no bugs, and a full test-suite to measure compliance. The most optimistic forecast for HTML5 REC status is 2010 [w3.org] (as you previously posted).

This "two fully-compliant implementations" is the W3C's standard requirement for a W3C Candidate Recommendation to become a W3C Recommendation. We haven't got two fully-compliant implementations of, or full test suites for, HTML4 or CSS2 yet (and almost certainly never will), but they are both in widespread use.

Bert36




msg:3782481
 10:36 am on Nov 8, 2008 (gmt 0)

In the HTML *specifications* perhaps? Previous HTML specs have lots of holes and grey areas, so it's possible for two standard-compliant browsers to be incompatible with each other.

First of all, why do browsers need to be compatible with each other? As long as they are compatible with HTML/CSS, that should suffice. Unless I am missing something here.
The fact that HTML has lots of holes and grey areas I am not going to dispute. I wholeheartedly agree we need a new HTML spec.

BUT:

This is why so much effort is going into defining how HTML should be parsed by browsers in the HTML5 spec.

Which is the point I am trying to make: they are not only defining how HTML should be parsed, they are also defining what HTML is. It is like having all paint manufacturers define a standard for how to make paint and, in the process, also define which colours there are and how they are allowed to be mixed and applied.
Or (and I used this analogy on the whatwg list) word-processor manufacturers who define a standard to create a uniform native format, and do so by redefining the dictionary and telling you which grammar rules may and may not be used from now on.

First, backwards compatibility is a legitimate concern: there are billions of documents already in the WWW, and browsers simply can't afford stop rendering them.

True, but they shouldn't have to. This is partly what the doctype is for; and even so, just because browsers are able to render HTML5 (or LanguageX for all I care) doesn't mean they must stop rendering HTML4/3.2 etc.

So, it was actually a mistake to define it as SGML, because the implementation on which it was based didn't treat it as such.

This goes without saying ;-)

As far as the replies to my argument on the whatwg list are concerned, I did read Philipp's reply, and it was indeed the only one that at least seriously responded to my suggestions; but after that it went silent.

Like you say:
This is, in the best case, frustrating.

And I was not trying to steer the subject towards human nature, but the fact remains that humans want to structure things; that is what we do. And while doing so, we all think we are "right" and they are "wrong", and vice versa. That is the essence of this problem as well.

So does this make this topic arbitrary? No, I don't think so; the (partial) solution is always to standardise only those things in your own corner, your own line of work. So browser vendors must standardise browsers, and authors must standardise markup. (I am cutting corners here, but you get the general idea.)

There was one reply early in the discussion on the list, which I can't find right now so I have to paraphrase: "you are free to define your own standard, just don't expect browsers to render it".
I think that one reply says a lot on many levels.

[edited by: Bert36 at 10:41 am (utc) on Nov. 8, 2008]

Herenvardo




msg:3782652
 6:10 pm on Nov 8, 2008 (gmt 0)

In the HTML *specifications* perhaps?

Absolutely not. The HTML specification must specify HTML, by definition. If it doesn't, then it isn't the HTML specification, regardless of how it is branded or titled. And by "specify HTML" I mean exactly that: it must be the ultimate normative source for every person or entity dealing with HTML (including browsers, authors, SEs, and the several other sectors I'm getting tired of listing in almost every post in this discussion). It must be usable by all those sectors, and using language that is neutral among them is a critical requirement for this.
If the spec can't cope with such a basic requirement, then it is essentially useless as an HTML specification.
In the general case, it could be reasonable to also include sector-specific appendices, dealing with stuff like error handling, explicit parsing algorithms, and so on; but in the case of HTML5 each such appendix would be longer than the body of the spec itself, so I don't think it's a good idea in this particular case.
A quite sane alternative would be to rename the current spec (I think the former name, "Web Applications 1.0", would be quite a good match, though maybe not the best one), and then title the (relatively small) section dealing with the markup itself "The HTML5 Markup Language". With that, and some improvements in the way things are worded there, that section could claim to be an HTML specification.
Again, I want to make clear that I'm not complaining about the ideas of defining error handling or a decent DOM in themselves; I'm just trying to make you see that this, despite being quite related, is not part of the markup language.

This is why so much effort is going into defining how HTML should be parsed by browsers in the HTML5 spec.

Hmm. I'm not really sure it's worth the effort. Actually, given a conformant document (in any of the current formats), most browsers do quite a decent job of rendering it, and it looks the same across them in most aspects (let's leave CSS2 out of this for a moment, since there is a well-known browser doing a far less than decent job with it).
Instead, if you take a 100% conformant browser, most of the documents on the WWW would break miserably, because they are not just "non-well-formed" but actually "hideously horribly-formed". Maybe they should put some of this effort into defining how documents should be written? No matter how much "error-handling" stuff is put in the spec, diabolical markup that escapes all of these rules will turn up sooner or later. Does the spec really define how something like this should be handled?:

<a href="example.com/somewhere><p>Click here to go <a href="example.com/anywhere">anywhere</p>!

There could be more stuff before and after that snippet on the page, but for simplicity let's assume that was the entire file. BTW, notice the deliberately missing closing quote on the first href :P
Really, that "spec" cares so much about defining error handling that it almost forsakes everything else. And the error handling defined there can't cope with everything; it's simply impossible. I have been programming since I was 8 and, believe me: software can be made fool-proof, but it can't be made noob-proof.

You do not need to wait till HTML5 becomes a recommendation

Come on! A bit of realism, please. We are in the real world, not in happy-dancing-elves country. CSS2 has been a Recommendation for almost 10 years, and we still can't use even a miserable "display: table;". Do you really believe we'll have at least the last (stable) version of each major rendering engine supporting even a single new feature of HTML5 consistently (between them) before 2020? Chances are quite low. Now add to the mix some legacy browsers (don't forget that IE6 is still the 2nd most used browser, close behind IE7 and well ahead of all non-IE browsers together).
Sure, you do not need to wait till HTML5 becomes a Recommendation: you need to wait longer.
The 2022 date is Ian Hickson's guesstimate for two fully-compliant HTML5 browsers [...]

In other words, it's the date from which you can rely on an HTML5 rendering in half of the widely used browsers, with a decent chance of it not breaking miserably. And even that doesn't mean previous versions of those same browsers will immediately disappear from the scene.
The most optimistic forecast for HTML5 REC status is 2010 (as you previously posted)

If you mean the quote from the W3C HTML WG Charter, that's not a "forecast", that's a plain lie. The W3C announced those dates even though the HTML5 editor had given them more reasonable estimates, without any apparent reason. These dates are simply fake, and the W3C knows it well enough. I guess they just wanted to beat their record for how late specs come out compared with the WG's charter, and the record is already high by now (you know, XHTML2 was initially supposed to become a Rec in 2004).
We haven't got two fully-compliant implementations of, or full test suites for, HTML4 or CSS2 yet (and almost certainly never will), but they are both in widespread use.

Well, a subset of those is in widespread use; and it is based on "test suites" done on the fly by each author to see whether something works or not. And we are still stuck, unable to use some basic CSS2 features, just because a single vendor was sitting so comfortably on top of its monopoly that it didn't feel like implementing them. Just imagine, as an example, that we had had full test suites, and that both WebKit and Mozilla had implemented CSS2 perfectly. That would fulfill today's requirements to reach Recommendation status, and we still couldn't use those features.

First of all, why do browsers need to be compatible with each other? As long as they are compatible with HTML/CSS, that should suffice. Unless I am missing something here.

Well, you are just missing the small detail that over 90% of the current web is a mess of senseless tag-soup, and only a small fraction of documents are actually compatible with HTML/CSS. As I've mentioned several times, most of the web would be unreadable in a 100% standards-compliant browser.

It is like having all paint manufacturers define a standard for how to make paint and, in the process, also define which colours there are and how they are allowed to be mixed and applied.

That's a great analogy.
Please let me add another example: imagine I decide to write an HTML5 browser. Since I'd be happy to release it under the GPL, I choose not to reinvent the wheel and take grep's code for all the parsing. Once I figure out the appropriate regexps to feed the parser, I might have quite a lightweight browser that passes all the test suites and still is not "compliant", because it doesn't implement all of those silly hundreds of algorithms with their over 1,600 steps; instead it achieves the same task with, maybe, 50 algorithms and 600 steps. So, although I did a better job than them, I'm not compliant, even though I pass all the test suites.
This is the problem of writing a spec the way the WHATWG is writing it.
Ian Hickson claimed that XHTML2 was revolution and HTML5 would be evolution. Either he lied, or he was wrong (I prefer to think it's the latter): XHTML2 is evolution, only in the wrong direction (it focuses on well-formedness before focusing on removing the legacy stuff in HTML); HTML5 is, from the markup language perspective, a step (actually a sprint rather than a single step) backwards, plus a lock to ensure that the small progress we have achieved throughout the last decade, which it is undoing, will never be recoverable. And just because it's easier for implementors.

Now, quoting the mailing lists: on Tue, Nov 4, 2008 at 8:37 PM, Ian Hickson wrote:
The HTML5 spec exists for a number of groups, primarily Web authors (and
people writing tutorials for Web authors), Web browser implementors, tool
implementors, and validator implementors. Roughly in that order.

That's a blatant lie. The only point of (implicit) truth in that statement is the fact that the spec doesn't care at all for web users, who should be, IMHO, the ultimate highest priority (if for no other reason, simply because they are the overwhelming majority among the affected people and entities).
Is it?
Actually, I recently noticed that Ian has been going through a lot of feedback on the WF2 stuff, from early 2006 to more recent messages, replying to lots of it (I think this was part of the process of merging the WF2 spec into HTML5).
This serves as a reminder of the actual amount of work the editor is dealing with: we are around a thousand people on the lists, and Ian has to review everything and make the final decision about what goes into the spec and what doesn't. It's simply impossible for a single person to deal with so much feedback at this pace. Maybe there is still hope to get some sanity through this process and into the final specification, once Ian gets a moment to review our threads. I guess I'll do some threadomancy there, to avoid them sinking into oblivion, inviting everyone else on the list to provide their technical arguments against the proposals: if such arguments come, then they should definitely be dealt with; and if they don't, it will be clear that there are no good arguments against the suggestions. Then we'll be able to see if it's true that "each proposal is valued only on the weight of its technical arguments, regardless of where it comes from".

Like you say:
This is, in the best case, frustrating.

Definitely, it can be quite frustrating, but don't give up: these people may need our help to fix the mess they are making ;).

mattur




msg:3783005
 3:19 pm on Nov 9, 2008 (gmt 0)

Bert,

First of all, why do browsers need to be compatible with each other, as long as they are compatible with HTML/CSS? Unless I am missing something here.

If browsers are compatible with a standard (and that standard is sufficiently well-defined) then they are compatible with each other. That's the point of standards.

It is like having all paint manufacturers defining a standard as to how to make paint and in the process also define which colours there are, how they are allowed to be mixed and applied.

We don't generally require interoperable wall colours (paint batch numbers are used for consistency), but there are times when standardised colours are useful (e.g. the Pantone Matching System).

word processor manufacturers who define a standard to create a uniform native format, and do so by redefining the dictionary and telling you which grammar rules may and may not be used from now on.

A standard format would of course have to define how to express a heading, a paragraph, a list etc. It wouldn't require changes to the spelling or grammar used in the content expressed in that standard format, and neither does HTML5.

Bert36




msg:3783018
 3:59 pm on Nov 9, 2008 (gmt 0)

If browsers are compatible with a standard (and that standard is sufficiently well-defined) then they are compatible with each other. That's the point of standards.

I'm sorry, I don't understand. What is there to be compatible about? What do they need to exchange? Let us say that BrowserX and BrowserY are both standards-compliant. As I see it, this means they can both render a webpage in such a way that it looks and functions the same in both. But this does not have to mean that they both do it the same way. When the goal of a journey is to arrive in Rome, who cares if you do it by train or bus? The train and bus don't have to be compatible with each other, as long as they both arrive in Rome.

The analogies about paint and word processors are just that; analogies.
...and neither does HTML5.

HTML5 does define which elements/tags we as authors are getting, and how they must be used. But OK, let us say for argument's sake that this is indeed absolutely necessary for the browser vendors, because otherwise their browsers would not be standards-compliant.
What makes them authorities on how language must be marked up in a consistent, flexible, practical and, most of all, realistic way? Someone on the list (I believe it was Ian Hickson) even suggested that semantics are not complex. I beg to differ: semantics, linguistics, etc. are entire university degrees for which people must study for years. And that is usually for just one or two languages, let alone all of the languages on the planet.
But let us pass over that as well: what does HTML5 provide us in terms of the future of the semantic web? Well, more or less everything that was also in HTML4, plus a few extra things that function exactly the same as the old ones. I can see no mechanisms at all that take *advanced* AT into account, nor any visionary constructs that would even try to anticipate AI, let alone something as simple as cross-referencing. Why not? Because they say this can be solved by Javascript and CSS... no, it cannot, because the "solutions" they give us are only interpretable by us humans.
HTML5 started as "Web applications", and it shows. Language has to take a back seat to everything else.
I am sorry, but they have annexed a very important spec and treat it as part of another spec, while the two are only tangentially related.

mattur




msg:3783036
 5:11 pm on Nov 9, 2008 (gmt 0)

Herenvardo,

[HTML should be specified] In the HTML *specifications* perhaps?

Absolutely not. The HTML specification must specify HTML, by definition.

We both agree that the HTML spec is where HTML should be specified. I think... ;)

And, by specify HTML, I mean exactly that: it must be the ultimate normative source for every person or entity dealing with HTML (including browsers, authors, SEs, and several more sectors which I'm starting to get tired of listing in almost every post in this discussion). It must be usable by all those sectors, and using a language that is neutral among them is a critical requirement for this.
A quite sane alternative could be to rename the current spec (I think the former name, "Web Applications 1.0", would be quite a good match, but maybe not the best one), and then title the (relatively small) section dealing with markup itself "The HTML5 Markup Language". With that, and some improvements in the way things are worded there, that section could claim to be an HTML specification.

If by neutral language you mean non-technical language, you seem to be suggesting taking the spec bits out of the spec to make it a user reference, then putting the actual spec somewhere else(?) I don't think this is necessary, when instead we can just create user references based on the spec.

Come on! A bit of realism, please. We are in the real world, not in the happy-dancing-elves country. CSS2 has been a recommendation for almost 10 years, and still we can't use even a miserable "display: table;". Do you really believe we'll have at least the last (stable) version of each major rendering engine supporting at least a single new feature of HTML5 in a consistent (between them) way before [2022]? Chances are quite low.
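For reference, the CSS2 feature lamented in that quote looks like this (a minimal illustrative fragment; the class names are invented for the example):

```html
<!-- Hypothetical illustration of CSS2's "display: table" feature:
     a two-column layout without <table> markup.
     Class names are made up for this example. -->
<style>
  .row  { display: table; width: 100%; }
  .cell { display: table-cell; padding: 0.5em; }
</style>
<div class="row">
  <div class="cell">Navigation</div>
  <div class="cell">Main content</div>
</div>
```

The point of contention: CSS2 specified these `display` values in 1998, yet a decade later authors still could not rely on them in every major browser.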

You're arguing that not *one* single feature of HTML5 will be implemented because not *all* of CSS2 has been implemented.

In the past 14 years browsers have added a lot of new features including tables, frames, formatting tags, SSL, cookies, Javascript, CSS and XMLHttpRequest.

Do I think we'll see widespread browser support for at least one HTML5 feature in the next 14 years? Yes.

(edited for clarity)

[edited by: mattur at 5:22 pm (utc) on Nov. 9, 2008]

mattur




msg:3783037
 5:11 pm on Nov 9, 2008 (gmt 0)

Bert,

what does HTML5 provide us in terms of the future of the semantic web? Well, more or less everything that was also in HTML4, plus a few extra things that function exactly the same as the old ones. I can see no mechanisms at all that take *advanced* AT into account, nor any visionary constructs that would even try to anticipate AI, let alone something as simple as cross-referencing.

Don't get me started on the Semantic Web... ;)

Bert36




msg:3783051
 5:44 pm on Nov 9, 2008 (gmt 0)

Don't get me started on the Semantic Web... ;)

LOL, I would avoid the topic if I could, but it is at the core of what I am talking about I think.
How am I supposed to make websites where the topics, articles, etc. etc. in other words all text (and images I suppose) are marked up in such a way that in -let us take a big margin- 50 years time intelligent webbots can assist me (or rather my descendants) in their searches on the web.
I know, it is dangerous to paint a picture of the future. 50 years from now the internet may look completely different, or may not exist at all any more. But let us consider the possibility that the internet will follow some scenario we can anticipate based on past experience and present knowledge.
Today we limit the HTML spec because it MUST be backwards compatible; project that point of view into the future. So in 50 years' time we have an internet filled with 70+ years of knowledge (and nonsense) stored in a tag soup which nobody and nothing can retrieve. Today you search Google for "frogs" and you get 15,000,000 hits. In 50 years' time you will get a millionfold that. And Google will be no more able to make sense of that soup of tags than anything else will. Why? Because nothing is marked up consistently and in an "it makes sense" kind of way. Only because in 2010 we got an HTML spec that imprisoned human-language evolution on the web.
