HTML 4.01 vs XHTML

   
8:01 am on Aug 19, 2005 (gmt 0)

5+ Year Member



In some rather old "make your business website" books, the writers advised readers to stay with HTML 4, since search engine robots used to have some problems crawling through XHTML code.

Is this problem completely solved now?
Do the spiders like XHTML code?

thanks.

9:56 am on Aug 19, 2005 (gmt 0)

WebmasterWorld Senior Member blobfisk is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I imagine that they would, as it goes for the pure separation of style from content.

Fixed speeling!

[edited by: BlobFisk at 1:23 pm (utc) on Aug. 19, 2005]

12:46 pm on Aug 19, 2005 (gmt 0)

10+ Year Member



At one time, I was all over XHTML because it was the "latest." But looking back on it now, I ask just what the point of XHTML was to begin with. Unless there is a particular reason to use XHTML, it is better to stick with HTML 4.01 Strict and save a few random bytes.

12:58 pm on Aug 19, 2005 (gmt 0)

WebmasterWorld Senior Member encyclo is a WebmasterWorld Top Contributor of All Time 10+ Year Member



There is a strictly theoretical possibility of a problem with trailing slashes, in particular on meta elements in the head section. However, no such problems exist in practice: all of the major search engine spiders parse XHTML without trouble.
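
To illustrate the trailing-slash point, here is a minimal sketch using Python's standard html.parser module as a stand-in for a lenient HTML parser. It is not any search engine's actual code, and the MetaCollector class, the collect_metas helper, and the sample markup are all made up for this example. It reads the same meta element written the HTML 4.01 way and the XHTML way and compares the results.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collects the attributes of every <meta> element it sees."""

    def __init__(self):
        super().__init__()
        self.metas = []

    def handle_starttag(self, tag, attrs):
        # Called for <meta ...> and, through the default
        # handle_startendtag, for the XHTML-style <meta ... /> too.
        if tag == "meta":
            self.metas.append(dict(attrs))


def collect_metas(markup):
    """Return a list of attribute dicts, one per <meta> element."""
    parser = MetaCollector()
    parser.feed(markup)
    parser.close()
    return parser.metas


# Hypothetical sample markup: the same element in both syntaxes.
html4_head = '<head><meta name="description" content="Example page"></head>'
xhtml_head = '<head><meta name="description" content="Example page" /></head>'

print(collect_metas(html4_head) == collect_metas(xhtml_head))
```

A tolerant parser like this one treats the trailing slash as noise, which is why the problem stays theoretical.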

One thing that the spiders can't do is parse XHTML when served with the MIME type application/xhtml+xml or application/xml. However, Internet Explorer can't read such files either, so content served this way is rare. If you are doing it, make sure you are serving text/html by default.
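
The advice about serving text/html by default could be sketched roughly as follows. This is only an illustration of the idea, assuming your server-side code can see the request's Accept header; the choose_content_type function is a made-up name, not part of any framework.

```python
def choose_content_type(accept_header):
    """Pick the MIME type for an XHTML page from the Accept header.

    application/xhtml+xml is returned only when the client lists it
    explicitly with a non-zero q-value. Everything else, including a
    missing header or a bare */*, falls back to text/html.
    """
    for item in (accept_header or "").split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != "application/xhtml+xml":
            continue
        q = 1.0  # q defaults to 1 when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not accepted"
        if q > 0:
            return "application/xhtml+xml"
    return "text/html"


print(choose_content_type("text/html,application/xhtml+xml,*/*;q=0.8"))
print(choose_content_type("*/*"))
```

The design choice here is deliberately conservative: a client has to ask for application/xhtml+xml by name, so anything that doesn't mention it gets text/html.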

For absolute maximum compatibility you can use HTML 4.01 Strict. However, there is no real reason why you can't use XHTML 1.0 syntax if you prefer it. As BlobFisk says, if you are using a strict XHTML DTD, the separation of style from content will make your XHTML page much easier to parse for a spider than a "tag soup" HTML page.

1:38 pm on Aug 19, 2005 (gmt 0)

10+ Year Member



For absolute maximum compatibility you can use HTML 4.01 Strict. However, there is no real reason why you can't use XHTML 1.0 syntax if you prefer it. As BlobFisk says, if you are using a strict XHTML DTD, the separation of style from content will make your XHTML page much easier to parse for a spider than a "tag soup" HTML page.

I do not see how XHTML 1.0 is better, as it can suffer the same "tag soup" problems as HTML 4.01. Also, you can just as easily separate style from content in HTML 4.01 Strict. So what real and practical advantages does XHTML 1.0 have that do not exist in HTML 4.01?

5:49 pm on Aug 19, 2005 (gmt 0)

10+ Year Member



In practical terms, if you're serving only to a web browser, HTML 4.01 Strict is just fine. I gather that XHTML has some theoretical advantages for embedding advanced content, but that is typically very poorly supported.

If your pages might need to be parsed by something else, however, you should go with XHTML all the way; XML is far easier to parse than SGML.

A quick example: You might be catering for a customized browser (perhaps put out on your company intranet) that offers different views of the same data using built-in XSLT transformations.
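
As a rough illustration of the "easier to parse" point, a well-formed XHTML page can be read with any off-the-shelf XML parser. The sketch below uses Python's standard xml.etree.ElementTree rather than XSLT; the sample page and the extract_links helper are made up for this example, and the page leaves out the DOCTYPE to keep things short.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample page, invented for this sketch.
page = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Widgets</title></head>
  <body>
    <h1>Widget catalogue</h1>
    <p><a href="/red.html">Red widget</a><br />
       <a href="/blue.html">Blue widget</a></p>
  </body>
</html>"""


def extract_links(xhtml_text):
    """Return (href, link text) pairs from a well-formed XHTML document."""
    root = ET.fromstring(xhtml_text)
    # XHTML elements live in the XHTML namespace, so the tag name
    # has to be qualified with it when searching.
    return [
        (a.get("href"), "".join(a.itertext()).strip())
        for a in root.iter("{%s}a" % XHTML_NS)
    ]


print(extract_links(page))
```

No HTML-specific knowledge is needed: the parser doesn't have to know which end tags are optional or which elements are empty, because XML syntax spells all of that out.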

6:36 am on Aug 20, 2005 (gmt 0)

10+ Year Member



greetings, all!
XHTML is the cornerstone of cross-platform scalability. combined with css and other markup languages, it will drive the 'write once, use everywhere' objective.
kat

11:37 am on Aug 20, 2005 (gmt 0)

10+ Year Member



XHTML is the cornerstone of cross-platform scalability. combined with css and other markup languages, it will drive the 'write once, use everywhere' objective.

How so? This seems to be more hype than practical reality. And remember, we are talking about web pages here being served to web browsers.

12:51 am on Aug 21, 2005 (gmt 0)

WebmasterWorld Senior Member encyclo is a WebmasterWorld Top Contributor of All Time 10+ Year Member



There are some edge cases where XHTML 1.0 is easier to use - it has been through a revision (XHTML 1.0 Second Edition) which corrected a few anomalies, for example adding an id attribute on the html element or (one thing I came across today) allowing a percentage value for the cols attribute on a textarea. XHTML syntax, which enforces the closing of all elements, also has a distinct advantage with complex data (or layout!) tables where missing end tags (optional in HTML4 so not picked up by the validator) could cause browser layout bugs.
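
The missing-end-tag point can be sketched with a plain XML well-formedness check. This is not a DTD validator, only Python's standard xml.etree.ElementTree rejecting an unclosed cell; the first_wellformedness_error helper and both table fragments are invented for this example.

```python
import xml.etree.ElementTree as ET


def first_wellformedness_error(fragment):
    """Return (line, column) of the first XML error, or None if well formed."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError as err:
        return err.position
    return None


# Hypothetical table fragments: the second one omits a </td>,
# which HTML 4.01 allows but XHTML syntax does not.
closed = "<table>\n<tr><td>a</td><td>b</td></tr>\n</table>"
unclosed = "<table>\n<tr><td>a</td><td>b</tr>\n</table>"

print(first_wellformedness_error(closed))
print(first_wellformedness_error(unclosed))
```

The first call reports no error, while the second returns a position for the mismatched tag, which is the kind of slip that passes silently under HTML 4.01's optional end tags.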

Mostly there is very little practical difference, so the choice between HTML and XHTML syntax is down to personal preference. I tend to use HTML 4.01, but my largest site is based on XHTML 1.0 Transitional.

 
