General questions about doctype and validating

My pages are working fine without either, but some say it's important . . .

         

MatthewHSE

11:15 pm on Jul 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm not exactly a newcomer to HTML or web design, but I must admit that my knowledge is basically limited to a solid understanding of HTML and a good working knowledge of CSS. Up until now, I haven't paid much attention to talk about doctypes and validating HTML. But some say that both of those things are important, so I have a few questions. (I tried reading some related posts on this site, but they all seemed fragmentary to me, inasmuch as I don't know the first thing about any of this! :) )

First, what is validating HTML, how do you do it, and what are the benefits to validating? And, what are the hazards of using un-validated HTML?

Second, can somebody please provide a basic, beginner-level description of doctypes, how they're used, and the advantages to using them?

I have a feeling that I'm not losing much right now by not using doctypes or by not validating, but I want to keep up with the times . . .

Thanks,

Matthew

drbrain

11:47 pm on Jul 7, 2003 (gmt 0)

First, a little history.

Back in the olden days this guy thought up a great way of writing structured documents, SGML. The problem with SGML is that it is difficult to write by hand. First you had to write this DTD thingy, then you had to write your document, then you had to validate the document against the DTD. Not to mention the myriad shortcut ways of writing SGML.

Then this other guy (I really should go look up names) decided to create HTML, which is an application of SGML. (SGML is to HTML as XML is to XHTML.) When they wrote a "web browser", it had to be very forgiving, because they wanted people to use it, rather than fight it, like they did with SGML.

(This is a broad simplification of everything.) Fast forward several years and all this forgiveness by browsers has led to the standards mess we have today, despite having DTDs for HTML1, 2, 3.2, 4.01.

Validating your document is like having a second pair of eyes, or the spell-checker of HTML. It lets you spot possible errors in your markup.

Moving into the future, if we ever start writing XML (or XHTML served with an XML MIME type), you'll have to have a well-formed document or the browser won't render it, and validating is the easiest way to catch those mistakes.

You validate your document by going to [validator.w3.org...] and filling in the URI field and hitting submit.

The best reason to validate is to find potential errors in your code, and because it's a good habit. Kind of like washing your hands after using the bathroom, even if nobody else is in there to see you not do it.

A doctype and a DTD are related beasts. The doctype comes from the SGML world, and tells you what kind of document this is supposed to be. The DTD says what elements of the document are allowed where and in what order.

For HTML4, there are three doctypes: frameset, transitional, and strict.

Strict is used when you comply completely with HTML4, and don't use any deprecated attributes or elements.

Transitional is when you use deprecated attributes or elements (like <center> or align=).

Frameset is for when you have a frameset page.

This is the HTML4.01 Strict doctype:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

When you provide the URL, you are saying "I guarantee that my document won't give an error if compared against the DTD at the URL provided." Browsers don't typically check this; that's what the validator is for.
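
To show where the doctype actually sits, here is a minimal sketch of a page that should pass as HTML 4.01 Strict. The title and text are placeholders I made up, and the validator will still want a character encoding from the server or a META element before it accepts the page.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<!-- The doctype is the first thing in the file; nothing goes before it. -->
<html lang="en">
<head>
  <!-- Placeholder title; a title element is required. -->
  <title>Doctype example</title>
</head>
<body>
  <!-- Strict has no <center> or align=; layout is left to CSS. -->
  <h1>Hello</h1>
  <!-- Under Strict, body text sits inside a block-level element such as <p>. -->
  <p>Some text.</p>
</body>
</html>
```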

What you do gain by providing a doctype with a URL in modern browsers is called 'doctype switching'. When a doctype with a URL is provided, IE6, Gecko-based browsers (Moz, NS, etc.), and a few others (Opera and Safari, I think) switch to 'standards-mode' rendering. IE5.5 used an incompatible box model in which CSS width and height included padding and borders. In IE6, you get the standards-mode box model with a doctype and URL. CSS properties also inherit correctly with a doctype; for example, the font-size doesn't reset for a <table>.
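
The box model difference is easiest to see with a test page. This is only a rough sketch: the class name and the pixel values are made up for the demo, and the arithmetic in the comments is what the two box models predict rather than something measured.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
  <title>Box model test</title>
  <style type="text/css">
    /* Hypothetical class name, used only for this demo. */
    .box {
      width: 200px;
      padding: 20px;
      border: 5px solid black;
    }
  </style>
</head>
<body>
  <!-- Standards-mode box model: 200 + 2*20 + 2*5 = 250px wide on screen. -->
  <!-- IE5.x box model: 200px wide on screen, leaving 150px for the content. -->
  <div class="box">Measure me</div>
</body>
</html>
```

Deleting the doctype line and reloading in IE6 should drop the page back to the older box model, which makes the difference easy to compare.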

PS: I've glossed over a bunch of stuff, and there are exceptions to what I've written, but I didn't want to get bogged down in too many details.

MatthewHSE

12:16 am on Jul 8, 2003 (gmt 0)

Thanks drbrain. Your post helped out a lot. So I went off to the validator, and got this error message:

====================
I was not able to extract a character encoding labeling from any of the valid sources for such information. Without encoding information it is impossible to validate the document. The sources I tried are:

The HTTP Content-Type field.
The XML Declaration.
The HTML "META" element.
And I even tried to autodetect it using the algorithm defined in Appendix F of the XML 1.0 Recommendation.

Since none of these sources yielded any usable information, I will not be able to validate this document. Sorry. Please make sure you specify the character encoding in use.

IANA maintains the list of official names for character sets.
====================

What in the world is it talking about!? [grin]

(Thanks for more help, anyone . . . How badly does it show that I'm in over my head? Never mind, don't answer that one! :) )

Mohamed_E

12:36 am on Jul 8, 2003 (gmt 0)

Welcome to the wonderful world of validation, MatthewHSE!

There are a huge number of character sets out there; think of English vs Japanese. Browsers try to guess which character set you are using; validators expect you to tell them explicitly.

There are two ways of telling the world (web browsers, validators, everyone else) what character encoding you are using:

1. If you can use an .htaccess file put the following line in it:

AddType 'text/html; charset=ISO-8859-1' html

2. Otherwise put the following line in the <head> section of every file:

<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">

assuming you are using one very common encoding.
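
For option 2, placement matters a little. A rough sketch, assuming ISO-8859-1 really is the encoding your editor saves in (the title is a placeholder):

```html
<head>
  <!-- Declare the encoding before any text the browser has to decode. -->
  <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
  <!-- Placeholder title. -->
  <title>My page</title>
</head>
```

If the server already sends a charset in the HTTP Content-Type header (option 1), keep the two in agreement.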

> How badly does it show that I'm in over my head?

Almost all of us are, or were at some stage, way over our heads :)

g1smd

10:46 pm on Jul 8, 2003 (gmt 0)

I visit the validator using [validator.w3.org...] as it allows you to specify the DOCTYPE and Character Set using drop down menu options if you forgot to include them in the actual HTML document.

For most people "HTML 4.01 Transitional" is probably the version of HTML to aim for.

I also tick the boxes for "Show Source", "Show Outline", "Show Parse Tree" and "Verbose Output".

On the design side, make your code tidier by exporting all CSS and JS to external files as well.

MatthewHSE

9:13 pm on Jul 28, 2003 (gmt 0)

Hi folks,

I'm trying to validate using HTML 4.01 Strict. My first page had 35 errors. Most of them I can fix myself, but I think there are two that I need help with.

First, I got this error before it even began to show errors with my page:

DOCTYPE Override in effect! Any DOCTYPE Declaration in the document has been suppressed and the DOCTYPE for «HTML 4.01 Strict» inserted instead. The document will not be Valid until you alter the source file to reflect this new DOCTYPE.

What does this mean, and what should I do about it?

Second, the validator is complaining about the nowrap in some of my td tags. I need that attribute, though. What CSS can I use to replace the nowrap attribute?

Thanks,

Matthew

Michaeldd

9:33 pm on Jul 28, 2003 (gmt 0)

I initially had trouble getting started with validators for the same reason. There's a validator at [htmlhelp.com...] that "presumes" the declaration, then...follow the steps to investigate your particular needs. Hope this helps a little.

lorax

12:27 pm on Jul 29, 2003 (gmt 0)

>> SGML is to HTML as XML is to XHTML

I don't think that's quite right. SGML is the mother of all browser MLs. As I understand it, XML, XHTML, and HTML are all simplified versions of SGML and rely on SGML-DTDs for their existence.

Back to the question. Your pages may work fine now but as the browsers become more aligned with standards it will become more imperative that you declare your DOCTYPE so they can render it accurately. Search engines will also rely more heavily on the DOCTYPE (my own theory).

grahamstewart

1:48 pm on Jul 30, 2003 (gmt 0)

MatthewHSE:
> I got this error before it even began to show errors with my page

That's a warning because you are specifying a doctype for the validator. To get rid of it, just include a doctype in your HTML instead.

> the validator is complaining about the nowrap in some of my td tags. I need that attribute, though. What CSS can I use to replace the nowrap attribute?

Something like:


td.nowrap {
  white-space: nowrap;
}
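
To use that rule, swap the nowrap attribute for a class on each cell that needs it. A small sketch with made-up cell text; in a real page the style block belongs in the <head> or in an external stylesheet:

```html
<style type="text/css">
  td.nowrap {
    white-space: nowrap;
  }
</style>

<table>
  <tr>
    <!-- class="nowrap" takes the place of the old nowrap attribute. -->
    <td class="nowrap">This cell stays on one line</td>
    <!-- Cells without the class wrap as usual. -->
    <td>This cell wraps as usual</td>
  </tr>
</table>
```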

stever

2:22 pm on Jul 30, 2003 (gmt 0)

If you are getting interested in validation now, it may be worth your while trying to validate this and future projects as XHTML.

There are really only a few differences between XHTML and correct HTML (closing all tags, nesting tags in the right order, etc.) and it may make sense to learn this newer version while you are studying this.

Personally I try to get all new projects to validate as XHTML 1.0 Transitional (and I have to say DW is doing a great job with this: the two latest projects have just validated without error).
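
To give an idea of what those differences look like in practice, here is a minimal sketch of an XHTML 1.0 Transitional page. The title, text, and ISO-8859-1 charset are placeholders, not taken from any real project.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
  <title>XHTML example</title>
</head>
<body>
  <!-- Element and attribute names are lowercase; attribute values are quoted. -->
  <p>Every element is closed, including paragraphs.</p>
  <!-- Empty elements close themselves with " />". -->
  <p>Line one<br />line two</p>
  <!-- Tags close in the reverse of the order they opened. -->
  <p><b><i>nested properly</i></b></p>
</body>
</html>
```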