Forum Moderators: open
One thing that spiders can't do is parse XHTML when served with the MIME type
application/xhtml+xml or application/xml; however, Internet Explorer can't read such files either, so content served this way is rare. If you are doing it, make sure you serve text/html by default. For absolute maximum compatibility you can use HTML 4.01 Strict. However, there is no real reason why you can't use XHTML 1.0 syntax if you prefer it. As BlobFisk says, if you are using a strict XHTML DTD, the separation of style from content will make your XHTML page much easier for a spider to parse than a "tag soup" HTML page.
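The "serve text/html by default" advice boils down to inspecting the client's Accept header. Here is a minimal sketch of that decision; the helper name and the simplistic header parsing are illustrative, not from any particular framework:

```python
def choose_content_type(accept_header):
    """Serve application/xhtml+xml only to clients that explicitly
    accept it; fall back to text/html for everything else (e.g. IE).

    Illustrative sketch: a real implementation should also honour
    q-values and wildcard media ranges.
    """
    accepted = [part.split(";")[0].strip() for part in accept_header.split(",")]
    if "application/xhtml+xml" in accepted:
        return "application/xhtml+xml"
    return "text/html"
```

For example, a browser sending `Accept: text/html,application/xhtml+xml,*/*;q=0.8` would get the XML MIME type, while one sending only `Accept: text/html,*/*` would get plain text/html.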
If your pages might need to be parsed by something else, however, you should go with XHTML all the way; XML is far easier to parse than SGML.
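To illustrate the point: any stock XML parser can walk a well-formed XHTML fragment with no HTML-specific error recovery. A small Python sketch using the standard library:

```python
import xml.etree.ElementTree as ET

# A well-formed XHTML fragment can be consumed by any generic XML
# parser -- no tag-soup heuristics needed.
xhtml = "<div><p>Well-formed <em>XHTML</em> parses cleanly.</p></div>"
root = ET.fromstring(xhtml)
emphasised = root.find("p").find("em").text
print(emphasised)  # -> XHTML
```

An SGML-based HTML parser, by contrast, must know the DTD to decide which end tags were legitimately omitted before it can build the same tree.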
A quick example: you might be catering for a customized browser (perhaps put out on your company intranet) that offers different views of the same data using built-in XSLT transformations.
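A hypothetical stylesheet for such a setup might pull one view out of the XHTML source; the class name and structure below are invented purely for illustration:

```xml
<?xml version="1.0"?>
<!-- Hypothetical XSLT 1.0 stylesheet: extracts every heading marked
     class="product-title" from an XHTML document into a plain list. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:x="http://www.w3.org/1999/xhtml">
  <xsl:template match="/">
    <ul>
      <xsl:for-each select="//x:h2[@class='product-title']">
        <li><xsl:value-of select="."/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
```

This only works because XHTML is well-formed XML; you can't point an XSLT processor at tag-soup HTML.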
There are only small differences between the DTDs, such as the id attribute on the html element or (one thing I came across today) allowing a percentage value for the cols attribute on a textarea. XHTML syntax, which enforces the closing of all elements, also has a distinct advantage with complex data (or layout!) tables, where missing end tags (optional in HTML 4 and so not picked up by the validator) can cause browser layout bugs. Mostly there is very little practical difference, so the choice between HTML and XHTML syntax comes down to personal preference. I tend to use HTML 4.01, but my largest site is based on XHTML 1.0 Transitional.
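That end-tag advantage is easy to demonstrate: an XML parser flags a missing </td> instantly, whereas an HTML 4 validator is obliged to accept it. A quick Python sketch:

```python
import xml.etree.ElementTree as ET

# A table row with a missing </td> after "cell one". HTML 4 treats the
# end tag as optional, so a validator lets this through -- but an XML
# parser rejects it immediately as not well-formed.
bad_row = "<tr><td>cell one<td>cell two</td></tr>"
try:
    ET.fromstring(bad_row)
    well_formed = True
except ET.ParseError:
    well_formed = False
print(well_formed)  # -> False
```

In XHTML that row simply won't validate until every cell is closed, so the layout bug never reaches the browser.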