XHTML won't validate as strict

         

sygad

8:14 am on Jun 23, 2005 (gmt 0)

10+ Year Member



Hi All,

Lurker finally makes a post ;->

I have been trying to convert my site to XHTML 1.0 Strict and have problems validating it as true XHTML.

I have been using the following code:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="application/xhtml+xml; charset=utf-8" />

<meta http-equiv="content-language" content="en" />

Now, I was kind of imagining that this wouldn't display in IE due to the MIME type declaration, but it does....

I was also imagining that when I validated it at the W3C with the verbose option, it would report my MIME type as above. It doesn't; it reports it as text/html....

Does anyone have any clues where I'm going wrong, or have I expected the wrong result?

Cheers for any help.

Hester

8:29 am on Jun 23, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I think you need to set the mime type on the server, else (as you've found) it falls back to text/html (hence it works in IE).

encyclo

11:38 am on Jun 23, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Welcome to WebmasterWorld [webmasterworld.com] sygad!

Meta tags for declaring the MIME type are meaningless when you are dealing with "true" XHTML - you have to send the MIME type as an HTTP header. There are various ways of doing this. The first is to define the MIME type according to a file extension: for example if you are running Apache you can add this to your root-level .htaccess:

AddType application/xhtml+xml .xhtml

And use the .xhtml extension for your pages. The second (and I find much better) solution is to use a server-side scripting language such as PHP and send the header that way:

<?php header("Content-type: application/xhtml+xml");?>
<?php echo '<?xml version="1.0" encoding="utf-8"?>';?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>test</title>
</head>
<body>
<h1>test MIME type</h1>
<p>This page sent as application/xhtml+xml.</p>
</body>
</html>

Neither way will display in IE, of course - but you can check the HTTP_ACCEPT header and serve text/html to IE if required.

[edited by: encyclo at 1:17 am (utc) on July 30, 2005]

sygad

12:02 pm on Jun 23, 2005 (gmt 0)

10+ Year Member



Thanks for the reply.

I am using IIS on Win2k.

I have created a MIME type under the HTTP headers tab in IIS using the following details:

Associated extension: .asp
Content type (MIME): application/xhtml+xml

The W3C validator is still saying the content type is text/html......

Is there something I'm missing?

It is my eventual aim to use content negotiation for serving correct XHTML content to those who can understand and text/html to IE.

Every guide I read seems to refer to Apache and PHP, is there a way of doing this in IIS and ASP?

Cheers

encyclo

12:55 pm on Jun 23, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm not in any way an expert in ASP or Windows servers in general, so I can't answer about adding the appropriate MIME type in IIS. However, if we're talking about ASP then you are using scripting on your pages - and as I said, this is usually a much better way of doing this, as you would eventually want to check the HTTP_ACCEPT header.

I searched around for some ASP code for you and I found this:

Response.ContentType = "application/xhtml+xml"

If you add the above to your page's header then it should work. The proposed MIME-type switching mechanism was given as this:

If InStr(Request.ServerVariables("HTTP_ACCEPT"), "application/xhtml+xml") > 0 Then
Response.ContentType = "application/xhtml+xml"
Else
Response.ContentType = "text/html"
End If

However, before switching your site to sending application/xhtml+xml to compliant browsers, I would point you to the Mozilla Web Author FAQ [mozilla.org] which states:

(...) if you are using the usual HTML features (no MathML) and are serving your content as text/html to other browsers, there is no need to serve application/xhtml+xml to Mozilla. In fact, doing so would deprive the Mozilla users of incremental display, because incremental loading of XML documents has not been implemented yet. Serving valid HTML 4.01 as text/html ensures the widest browser and search engine support.

There is a fad of serving text/html to IE but serving the same markup with no added value as application/xhtml+xml to Mozilla. This is usually done without a mechanism that would ensure the well-formedness of the served documents. (...) When XHTML output has been retrofitted to a content management system that was not designed for XML from the ground up, the system usually ends up discriminating Mozilla users by serving tag soup labeled as XML to Mozilla (leading to a parse error) and serving the same soup labeled as tag soup to IE (not leading to a parse error).

sygad

3:33 pm on Jun 23, 2005 (gmt 0)

10+ Year Member



Aaaah...

I think the lightbulb has lit.

To summarise

1. Started off with all the XHTML code in my .asp web page - didn't work

2. Created a server MIME type for .asp extension and tried making it output as XHTML, using the same page as above - didn't work

3. Edited the server MIME type for file extension .xhtml and changed my page extension, using the same page - Worked in FF, dropped dead in IE and validated with correct content-type - Nearly perfect, except that it no longer processed my ASP elements......

4. Deleted server MIME type, renamed extension back to .asp, deleted "meta" line of code in my page that declared content-type and replaced with Response.ContentType = "application/xhtml+xml" - worked in FF, dropped dead in IE, processed my ASP - Exactly what I was after.

5. Not done yet, but it will be the negotiation as above.

Cheers for the help so far

Hester

10:46 am on Jun 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It is my eventual aim to use content negotiation for serving correct XHTML content to those who can understand and text/html to IE.

What for? Why not just serve text/html to everyone? There is absolutely no real benefit right now to be gained from xhtml/xml - it is a myth.

mattur

12:12 pm on Jun 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've got to agree with Hester here - your time would be better spent on doing something that actually benefits your visitors.

At the moment, the main effect of serving xhtml with the xhtml mime type is to turn off incremental rendering in Firefox/Moz.

sygad

10:32 am on Jun 27, 2005 (gmt 0)

10+ Year Member



Hi All,

Sorry for the late reply, I lost web browsing all weekend, bit of a long post.

I agree that XHTML is currently quite useless, BUT we all agree that HTML has a limited lifespan. The fact that so many people like ourselves are investigating XHTML shows that we have one eye on the future, which traditionally has come around a lot quicker than we would like or have had time to prepare for. Personally, I like to keep my skills a little ahead of the curve.

I have a little correction from my earlier post,

1. Need the server MIME type "application/xhtml+xml" declared for asp pages, doesn't do anything without it.
2. Need the META line in my asp page, as it only validates tentatively without it.

This is now the top of my .asp page

----------------------------------------------------------------

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>

<%
If InStr(Request.ServerVariables("HTTP_ACCEPT"), "application/xhtml+xml") > 0 Then
Response.ContentType = "application/xhtml+xml"
Else
Response.ContentType = "text/html"
End If
%>

<meta http-equiv="content-type" content="application/xhtml+xml; charset=utf-8" />
<meta http-equiv="content-language" content="en" />
<title>Online Investment</title>
<style type="text/css" media="screen">@import url("/library/test.css");</style>
</head>
<body>

----------------------------------------------------------------

I also have a line at the bottom;

<%= request.servervariables("HTTP_ACCEPT") %>

This outputs the string for HTTP_ACCEPT.

When I view the page in IE, it renders fine and the very bottom says, */*
When I view the page in FF, it also renders fine and I get at the bottom, text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5

Now this tells me that the line at the bottom is working, but nothing about the MIME type switching at the top. I then went to the W3C validator in each browser and tried to validate the page. I assumed that when I validated it in FF it would tell me that the content-type was application/xhtml+xml, and that when I tried in IE it would report text/html, just as the ASP code above is set out to do.

Annoyingly this is not the case, both validations report content-type as text/html.

Does anyone have any ideas why this might be the case? My first attempt at just setting <%= Response.ContentType = "application/xhtml+xml" %> worked fine (to an extent), but with the negotiation it doesn't work as expected.

Mattur, can you explain further "turn off incremental rendering in FF/Moz" - I will research it as well but it's always nice to have an interpretation.

Thanks all once again.

mattur

11:14 am on Jun 27, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



sygad, see the Mozilla Web Author Faq:


"However, if you are using the usual HTML features (no MathML) and are serving your content as text/html to other browsers, there is no need to serve application/xhtml+xml to Mozilla. In fact, doing so would deprive the Mozilla users of incremental display, because incremental loading of XML documents has not been implemented yet."

[mozilla.org...]

encyclo

1:35 pm on Jun 27, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



sygad, you are getting the MIME type text/html because you are trying to output HTTP headers after the page has started: you need to set all the headers before anything else is sent.
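To show the headers-before-content rule outside ASP, here is a minimal Python sketch. It is only an illustration under my own assumptions, not something to drop into IIS: the names app and call_app are made up for the example, and it uses the standard-library WSGI helpers rather than anything from this thread. The point is simply that the MIME type is decided and handed over before a single byte of the page is produced.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Tiny WSGI application: pick the MIME type first, then emit the page."""
    accept = environ.get("HTTP_ACCEPT", "")
    # Same naive substring check as the ASP snippet in this thread.
    if "application/xhtml+xml" in accept:
        mime = "application/xhtml+xml"
    else:
        mime = "text/html"
    # Headers are handed over here, before any body content exists.
    start_response("200 OK", [("Content-Type", mime + "; charset=utf-8")])
    body = "<html><head><title>test</title></head><body><p>test</p></body></html>"
    return [body.encode("utf-8")]


def call_app(accept):
    """Call the app with a fake request and return the Content-Type it sent."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["HTTP_ACCEPT"] = accept
    captured = {}

    def start_response(status, headers):
        captured["headers"] = dict(headers)

    b"".join(app(environ, start_response))
    return captured["headers"]["Content-Type"]


if __name__ == "__main__":
    print(call_app("text/xml,application/xhtml+xml,text/html;q=0.9"))
    print(call_app("*/*"))
```

The ASP equivalent is to move the whole <% ... %> block above the DOCTYPE, so that nothing has been written to the response when the content type is set.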

Secondly, you don't need the meta tag: you can use ASP to set the charset too. A quick search indicates that you can use this:

<%Response.Charset="UTF-8"%>

Again this must go before any content. Same goes for the content language, although I don't know the appropriate VBScript for that (I only do PHP). You are already defining the content language three times (twice on the HTML element and once in the meta tag) so the meta tag can go. Try this:

<%
If InStr(Request.ServerVariables("HTTP_ACCEPT"), "application/xhtml+xml") > 0 Then
Response.ContentType = "application/xhtml+xml"
Else
Response.ContentType = "text/html"
End If
Response.Charset="UTF-8"
%>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>Online Investment</title>
<style type="text/css" media="screen">@import url("/library/test.css");</style>
</head>
<body>

One weakness with the above code is that (as you have seen from the Firefox output) the HTTP_ACCEPT string carries a quality value (q) expressing the preference for each MIME type: Mozilla lists application/xhtml+xml and other XML MIME types first, followed by text/html with a q of 0.9, text/plain with a q of 0.8, etc. Your current code checks for the presence of application/xhtml+xml anywhere within the string, but doesn't check whether it is preferred over other MIME types. For example, a user agent which listed text/html first and application/xhtml+xml with a q of 0.9 should be served text/html.
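As a rough sketch of that preference check (in Python rather than VBScript, purely to show the logic), something like the following would do. The function names parse_accept and choose_mime are my own invention, and I am assuming that a missing q means 1.0 and that a tie should fall back to text/html as the safer choice.

```python
def parse_accept(header):
    """Return a dict mapping each media range in an Accept header to its q-value."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media = parts[0].lower()
        if not media:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        prefs[media] = q
    return prefs


def choose_mime(header):
    """Serve XHTML only when it is explicitly listed and strictly preferred."""
    prefs = parse_accept(header)
    xhtml_q = prefs.get("application/xhtml+xml", 0.0)
    # Fall back to wildcards when text/html is not named explicitly.
    html_q = prefs.get("text/html", prefs.get("text/*", prefs.get("*/*", 0.0)))
    if xhtml_q > html_q:
        return "application/xhtml+xml"
    return "text/html"


if __name__ == "__main__":
    firefox = ("text/xml,application/xml,application/xhtml+xml,"
               "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5")
    print(choose_mime(firefox))
    print(choose_mime("*/*"))
    print(choose_mime("text/html,application/xhtml+xml;q=0.9"))
```

With the two HTTP_ACCEPT strings posted earlier in the thread, Firefox gets application/xhtml+xml and IE's bare */* gets text/html, while a browser that ranks text/html higher is no longer caught by the substring match.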

we all agree that HTML has a limited lifespan

Maybe, but it's still going strong and there is a lot of resistance towards the current direction with XHTML and the accompanying draconian client-side error-handling. If it ever came about it would be a huge backwards step for the web: as so many sites accept third-party input (comments, forum posts, etc.), a user who succeeds in introducing invalid markup (or even an unrecognized character entity) could produce a denial-of-service. The only risk with text/html is that the page would stop validating.
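You can get a feel for the difference without a browser. The sketch below is only an analogy: I am using Python's standard XML and HTML parsers as stand-ins for a browser's XML and tag-soup modes, and the sample markup and helper names (BROKEN, parses_as_xml, html_start_tags) are made up for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# A page where a visitor's comment left a <b> element unclosed.
BROKEN = "<html><body><p>Visitor comment: <b>unclosed bold</p></body></html>"


def parses_as_xml(markup):
    """True if a strict XML parser accepts the markup, False on a fatal error."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Forgiving HTML parser that just records the start tags it sees."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def html_start_tags(markup):
    """Feed the markup to the tolerant parser and return the tags it found."""
    parser = TagCollector()
    parser.feed(markup)
    parser.close()
    return parser.start_tags


if __name__ == "__main__":
    print(parses_as_xml(BROKEN))            # strict parser gives up
    print(html_start_tags(BROKEN))          # tolerant parser carries on
    print(parses_as_xml("<p>&nbsp;</p>"))   # entity unknown without a DTD
```

The strict parser rejects both the unclosed element and the undeclared entity outright, while the tolerant one reads straight through the same input, which is roughly the gap between a parse error page and a page that merely fails validation.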

The browser companies Mozilla, Opera and Apple (Safari) are developing a new HTML specification [whatwg.org] which will become HTML5. Microsoft are not part of that group, but they have clearly indicated that they are not interested in supporting application/xhtml+xml in the upcoming IE7, or even beyond.

Having said all that, I don't think it is a bad thing to experiment with MIME type switching as at the very least it helps get a much better understanding of how browsers handle markup and how things such as HTTP headers, HTTP_ACCEPT and such work. Just be careful about where you use it: I would not risk using MIME type switching in a client site or on a money-making site. Personal sites or blogs are another matter, and are the perfect place to experiment.

sygad

2:09 pm on Jun 27, 2005 (gmt 0)

10+ Year Member



Had a read of that article and regret to say that not a lot of it made sense. Some bits did; other parts I have read on different websites (always nice to trace something back to its source ;->).

I am trying to write true XHTML with content negotiation to serve the correct MIME type to capable browsers, sorry for the re-iteration of the obvious but sometimes these things can get lost in the length and meanderings of posts.

Any chance you can break it down a little into what it is trying to tell me, real world examples would be very much appreciated.

Cheers

sygad

2:30 pm on Jun 27, 2005 (gmt 0)

10+ Year Member



encyclo, very interesting, was not aware of this "backlash", will have to investigate further.

I was under the impression everyone wants to use it, but because of the lack of IE support it was waiting in the wings....hmmmm

My self-mutterings are taking me in the direction of XML data, CSS and using XSLT to output XHTML, although I will confess that this is something I have only given a few hours' thought to. Many, many hours of thinking from now, I would like all data to be held independently of structure, and both independently of presentation.

Give the programmers the data side of things to take care of and leave the structure and presentation to the designers (i.e. Me).

Ultimately a standards compliant, device & UA independent website, that would be nice!

"God damn it I will get my website working on my PDA and only have 1 set of code to do it" - Me

flashfan

1:23 pm on Jun 28, 2005 (gmt 0)

10+ Year Member



I experienced similar pain. I chose XHTML 1.0 Strict as the doctype. The same pages could not be displayed completely, even though the HTML source was complete. In my case the HTML was validated by TIDY, and I didn't even get a warning. It really drove me crazy for a few days. After applying the "application/xhtml+xml..", thank God, the pages are correct.

Hester

3:43 pm on Jul 20, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have a problem using ASP and XSL. (I'm new to both so forgive me for asking about this.) My code has a form that updates an XML file. But the form gets displayed by ASP with the addition of a META tag! It is not in my original XSL file! This is changing the charset used to UTF-16, but I want it to be UTF-8.

I have tried adding this line mentioned in an earlier post:

<%Response.Charset="UTF-8"%>

But it causes an error (as does adding a DOCTYPE!) which says:

The stylesheet does not contain a document element. The stylesheet may be empty, or it may not be a well-formed XML document.

I'm getting used to seeing this a lot, but what does it actually mean? What "document element"?

Now I think I was wrong to add the code to the XSL template, but then where else can it go? The ASP file just processes the XML and produces the form - it has no HTML in it.

I even tried adding a META tag manually to the template, but ASP still added one of its own! Is there a way to stop this happening?

I am also worried that ASP generates old-school HTML (not XHTML). A search here reveals this is the case with ASP .NET, but not "classic ASP". I'm not sure which I'm using to be honest, but I don't want it to spit out tags with upper case lettering and no end slashes as in the META tag I'm getting. Any advice welcome!