Forum Moderators: open
I didn't realise search engines cared about the HTML code as well
SEs don't index the HTML tags, but they do need to download your whole page and parse it, separating content from markup. If there are errors in your markup, this separation process might not work.
They need to parse the HTML code to know what's in h[1-6] elements, p elements, etc.
If a SE stops indexing after 100k, you'd better have lots of content in those 100k rather than lots of HTML code.
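A quick way to see the content-versus-markup point is a toy content-to-markup ratio. This is purely my own illustration, not any metric a search engine actually publishes:

```python
import re

def content_ratio(html):
    """Fraction of the page that is text rather than markup (very crude)."""
    text = re.sub(r'<[^>]+>', '', html)  # naively strip anything tag-like
    return len(text) / len(html)

lean = '<p>Hello world</p>'
bloated = ('<p style="margin:0" align="left">'
           '<font size="2">Hello world</font></p>')

print(content_ratio(lean))     # higher: more of the 100k budget is content
print(content_ratio(bloated))  # lower: markup eats into the budget
```

The same eleven characters of content cost far more of the hypothetical 100k budget when wrapped in bloated markup.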
Andreas
They do not index the code, but need clean code so that they can find your content within it. They also need to be able to find <Hx> tags, as well as title and alt attributes within tags, and so on.
So this <H1 My Really Cool Title</H1> might be rendered correctly in IE, but a search engine would probably miss the very words that you wanted it to find [spot the missing ">" on the first tag]. Only a code validator would catch that type of error.
Likewise <a href=acoolpage.htm> might be missed, as attribute values should be quoted. It should be:
<a href="acoolpage.htm"> for example.
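To see how a missed ">" can swallow the very title you wanted found, here is a toy extractor. It is a deliberately naive sketch in Python; real crawlers are more forgiving, but broken markup still risks being misparsed:

```python
import re

def extract_h1(html):
    """Naive title grab: only matches well-formed <h1 ...>text</h1> pairs."""
    return re.findall(r'<h1[^>]*>(.*?)</h1>', html, re.IGNORECASE | re.DOTALL)

good = '<h1>My Really Cool Title</h1>'
bad = '<h1 My Really Cool Title</h1>'  # missing ">" on the opening tag

print(extract_h1(good))  # ['My Really Cool Title']
print(extract_h1(bad))   # [] -- the title text is lost inside the broken tag
```

With the quote-less `<a href=acoolpage.htm>` the story is similar: some tools cope, some don't, and quoting the value costs you nothing.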
These are some of the reasons that you should validate your code [validator.w3.org]. I don't care if your code doesn't validate completely. You should use the validator to remove typos, nesting errors, and such like, but if you want to use a few IE or NN specific tags then I don't generally have a problem with that. Use a validator to ensure that your code is well formed, has logical structure, is nested correctly, and has no tags with spelling errors, and so on. However careful you are, you'll be surprised at how many silly errors a validator will find in your web site.
While you are checking your HTML code in a validator, don't forget to validate your CSS as well [jigsaw.w3.org].
There are five levels of HTML validation:
Items 1, 2, and 3 are vital to fix.
Item 4 is at your discretion, and will probably need fixing when a browser that no longer supports the non-standard tags you have used becomes widely adopted.
Item 5 is only important if you want to code to the W3C standards and stick their logo on your page.
Anyone who routinely ignores items 1, 2 or 3 (and maybe item 4 as well) isn't entitled to call themselves a "web professional".
[edited by: g1smd at 1:40 am (utc) on Nov. 13, 2002]
Just want to give some feedback about the subject of proper parsing.
Firstly, I've been dabbling in HTML for five years now and really do think it is a nice field to be involved in. No, I am not a pro; I have learnt all I know out of curiosity about HTML.
At least three years ago I stumbled upon a site where I could check my pages for proper coding. When I first looked at the parsing results and saw the errors I was very downhearted. In the beginning I couldn't grasp the meaning of the errors, but I said to myself, "this I will learn".
I kept making changes here and there and began to understand how HTML worked as a whole. I felt good about this because it was quite a challenge, and one which I mastered.
I never took a course on HTML, but learnt from looking at other pages and trying to figure out how things worked. I did use a WYSIWYG program for a short time but found it was no good, so I decided to stick with hand coding.
Later I found out about the W3C and have been a faithful believer in the system ever since. Why shouldn't I? They are the leaders, at least I think so; not IE or NN. I also learnt CSS from the internet, and now when I validate both HTML and CSS I am usually right on. I have moved on to XHTML now and like how everything just flows.
There is one subject this thread doesn't seem to address, and that is Bobby (the accessibility checker). That I will have to get into, because it too will one day become standard.
I don't mean to be preachy about this, but it is very important for all of us to adhere to the standards set forth by the W3C. In the end this will make the internet much more free-flowing.
Lastly, if you don't have one, download a validator and check your pages before uploading them; then have the W3C validator check them again after uploading. You'll feel good that you are doing the right thing.
jaybee
the purist
aims for a standard, and writes code that adheres 100%. A really puritan purist goes for a Strict DOCTYPE too.
This is probably overkill in most commercial settings
the pragmatist
wants clean code but will knowingly use the occasional deprecated element. Provided they've tested it on all versions and platforms of the mainstream browsers and it works, there is no need to do more.
Sometimes life is too short to do everything. But the pragmatist has a clear business case for the deviations from 100% validated code.
the praying
hacks together some code, tries it out on their own machine, and prays that it will work in all environments, and that it is future-proof against new mainstream browsers.
Basically they are praying that their own understanding of how badly-formed HTML should be parsed is an immutable industry standard.
That's too much faith in dodgy software companies for me. That's why I prefer to support the efforts of W3C.
Regards,
R.
Please no comments on my English, I am not a native speaker.
Romeo, we would never do that. This is a global community and English just happens to be the primary language spoken here. Most of us can read between the lines and understand what you are saying. If not, I'm sure someone will ask a question to verify if they did not understand.
P.S. Only a small number of large commercial sites would pass validation. Since valid HTML is becoming more public knowledge, maybe those companies will rethink their strategies and work towards validation at some point in the future.
I know I wouldn't want to be the one sitting there with a $100,000 website that cannot be indexed properly because the browsers and spiders are changing the rules and wanting valid code. ;)
> This is embarrassing and non-professional.
Very!
[edited by: pageoneresults at 8:32 pm (utc) on Nov. 13, 2002]
No. Not in the least.
Is HTML validation desirable?
Yes, it is.
Do visitors care?
99.9% of visitors will never look and never care if your site validates or not as long as it looks good in THEIR browser.
Do search engines care?
Except for actual errors, no. Illustrated by the fact that .txt pages index just as well as .HTML in many instances (excluding effects of H? tags and such).
Should they care?
No, they should not. Search engines exist to help searchers find content. Valid code has nothing to do with content. In fact, .txt files SHOULD index just as well as perfectly validated HTML.
Well, then, who does care?
Computer geeks (like me). ;-)
Richard Lowe
Shoot, I was probably two years into my web design career before I really understood what was meant by a "markup language".
So when I see a site with a multi-million dollar budget that's nesting block level tags inside inline tags - or leaving divs and td's unclosed, well, I can't help but feel that's just WRONG --- whether Explorer renders the page or not.
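A tag-nesting check like the one a validator performs can be sketched with Python's standard html.parser. This is a simplified illustration of the idea; real validators check a great deal more:

```python
from html.parser import HTMLParser

class NestingChecker(HTMLParser):
    """Toy well-formedness check: flags end tags that don't match the stack."""
    VOID = {"br", "hr", "img", "meta", "link", "input"}  # take no closing tag

    def __init__(self):
        super().__init__()
        self.stack = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(f"mismatched </{tag}>")

def check(html):
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return checker.errors + [f"unclosed <{t}>" for t in checker.stack]

print(check('<div><p>ok</p></div>'))  # []
print(check('<b><i>oops</b></i>'))    # nesting error reported
print(check('<td>never closed'))      # unclosed <td> reported
```

Even this twenty-line toy catches the unclosed td's and bad nesting that a big-budget site ships every day; there is no excuse not to run a real validator.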
Well, then, who does care?
Computer geeks (like me). ;-)
Me too... no, seriously. I just thought I'd stress the point that, as well as hackers (or geeks) disliking sites with bad code, validation is the only way to make sure your site works properly and can be understood by the whole variety of browsers, widely used and less widely used, past, present and future, and by the plethora of other tools that may read your page (search engines' crawlers, for example).
How can you expect any tool to parse a document that uses things that are not in the rules? How are the client authors supposed to know what rules you made up for yourself in relation to HTML?
Using valid HTML makes pages easier (or possible) for disabled visitors to view, makes them easier (or possible) to update or edit, and helps in loads of other ways (including SEO).
Most importantly, there is the slippery-slope argument, and the related argument that many people do not know what HTML will or won't work in common browsers (or even the one browser they use), let alone in the plethora of past, present and future clients and tools that may need to parse their code. Is it not quicker and easier to learn the rules (or use a standards-compliant editor) and know that your page will display correctly? As has been said elsewhere, you are otherwise just putting it together more or less randomly and "praying" that your HTML might work.
Joe
Just because you can get a site to validate does not mean that it will display properly across all browsers. We've all been talking about HTML and working with CSS. Now the second part of the equation comes into play, validating your CSS!
I remember when I got my first site to validate. I was totally jazzed. Then I went on a mission to view it in as many different browsers as I could. I was able to cover almost all currently used versions on both PC and Mac. The differences were somewhat shocking.
Now it was time to hunker down and learn CSS. If you have a simple site, the basics work fine. Once you get into more complex layouts, you need to know about CSS. Without it, you will not be able to validate 100% if you want certain things to happen, like zero margins, background properties, and all sorts of other stuff.
HTML Validation is just one part of the entire equation. You should take great pride in a site if you've been able to pass W3C muster on both your html and css. You are now in a small group of developers who are working towards standards.
I've also found that this has become a prime selling tool in development. Offering the client valid html and the use of css puts you a step ahead of the rest of the pack. During the process you are able to help educate the consumer of the benefits that are achieved with your valid html and css. When they see that you can change the color of their links across the entire site in less than a couple of seconds; when they see that you can apply a background image to all of their pages in less than a minute; you'll have won them over!
Here is another thing. Using CSS relies on having well-formed and valid HTML. [And I will say it again: I'm not bothered about the sort of HTML that includes a few proprietary tags and attributes like MARGINHEIGHT, BLINK, MARQUEE, or BGPROPERTIES; I'm talking about correcting code that has tag spelling typos, unquoted attributes, tag nesting errors, unclosed tag pairs, and so on.]
Well, I wasn't really defending people who use HTML incorrectly. I agree, that kind of validation is essential, but I'm talking about older code being declared invalid even when it's been used correctly.
In a nutshell:
Is it really necessary to say that "<center>Centred Text</center>" is invalid? Why? What harm does it do to anyone? Search engines don't care, visitors don't, and it renders fine in all browsers as far as I'm aware.
I know they're declared "deprecated" rather than obsolete, but this does mean they will become obsolete in the future and, in theory, will not be supported by future browsers. Why on earth should this happen?
Why should you reduce the number of sites a browser can view? It can't be about bandwidth (text takes up a minute amount of space compared to pictures, sound and video) and it can't put too much strain on browsers either.
I guess all this discussion is irrelevant as what'll happen is that loads of people will continue using any old tags that work with IE and Netscape.
I have to admit though, "Bobby" accessibility is a very serious point in favour of validation, and perhaps that alone would make compliance desirable even if a site doesn't need to be compliant to work on any standard browser. Okay, on that point alone I'm sold 100%.
I guess all this discussion is irrelevant as what'll happen is that loads of people will continue using any old tags that work with IE and Netscape.
I hear you. People will keep on using old tags, but like anything else that's obsolete, fewer and fewer will as time goes on. In the end, it gets to the point where people think you're weird for still using them.
If you look at the early Model T, or the Kitty Hawk flyer, I'm sure that if they were in excellent mechanical condition, they would both still be functional. Right?
Well, ask yourself where they are now. Perhaps in a museum somewhere in the world. Even if there are duplicates of these machines, I certainly don't see them in use. Not solely because of their monetary value as antiques, but because they would not fit in properly with new standards and codes and all the legal requirements.
No, we shouldn't look down on them as old; rather, we should be thankful for where they have brought us, and for what the future models hold in store, because they were firsts in their respective industries.
The same will happen to HTML, CSS, XHTML and all the new developments now on the drawing board for future recommendation. All things will have their moment of glory, then finally fade into history, only to remind us of where we were years before.
If you wish to remain behind, that's a choice each of us will have to make. Undoubtedly one will feel left behind if they miss the boat.
Sure, changes to the internet will not happen overnight, but the old tags will eventually fade out at some point in time. Don't think for one moment the major players aren't aware of this. The thing is, if browsers can't parse a page properly, what good is it to the user? You think the big boys won't have to change?
The almighty dollar will force all of them to comply with the standards. Oh, they do have input as far as new ideas go, but in the end it's still up to the W3C (the governing body of the WWW) to accept those ideas that are agreed on and implement them into the overall scheme of how the internet evolves. In the meantime, IE and NN are putting the cart before the horse with the new gimmicks they develop and build into their browsers. Why? Because of the competition for those little green one$. :(
Keep on keeping on, fellow websters.
jaybee