Just for interest and consistency, Brett: what service/application did you use to do the validations?
I checked a few with CSE HTML Validator, and got results consistent with yours, though not identical.
The HTML quality control on some of these sites is mind-numbingly stupid. And they can't blame all of the errors on paring down for speed: there are things like stray close tags with no preceding open. D'oh!
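For anyone curious how that class of error gets caught, here's a minimal sketch of a checker for just that one problem - close tags with no preceding open. This is not a real validator; the class name and the VOID tag list are my own assumptions, and it leans on Python's standard html.parser:

```python
from html.parser import HTMLParser

# Tags that never take a close tag, so they shouldn't go on the stack.
# (Partial list - an assumption for this sketch.)
VOID = {"br", "hr", "img", "input", "meta", "link"}

class StrayCloseChecker(HTMLParser):
    """Flags close tags that have no matching open tag."""
    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags
        self.stray = []   # close tags with no preceding open

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop back to the matching open tag (tolerant, like a browser).
            while self.stack and self.stack.pop() != tag:
                pass
        else:
            self.stray.append(tag)

checker = StrayCloseChecker()
checker.feed("<p>hello</b></p>")
print(checker.stray)  # ['b'] - a close tag with no open
```

A real validator checks far more than this, of course, but it shows how mechanical the check is - which makes it all the more puzzling that these sites don't run one.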
I'm not even going to comment on the number of errors returned by M$N - that would be just too easy!
On the whole, regarding your results, and the question of the possibility of search engines rewarding pages with validating code higher rankings [webmasterworld.com] over similar content pages fraught with errors: I wonder if it's kinda like havin' a 350lb., deep-fried-food eatin', three-packs-of-cigarettes-a-day cardiologist scolding patients for not taking better care of themselves?
I wouldn't accept pages like these from any team member I work with.
There has always been a segment of Web developers who insist on writing valid code; the numbers appear to be increasing. I suspect with the growing awareness of Web Standards, and accessibility issues finally coming into the forefront, that at some point, even the "big boys" will heed the call. That WOULD set a great example...
I'm a DMOZ editor, and it is not part of my brief.
If a site doesn't display in my usual working environment (Windows/Opera, all plug-ins and Java disabled), I will go back and take another look (I have Mozilla, IE, Amaya and Lynx to try).
But that other look may not be for a week or two. So at the very least, the submission gets delayed.
I've never excluded a site for rendering too poorly to be of use. But I wouldn't rule it out. If it can't be read, it can hardly be useful content.
No, ODP editors do not take code validation into consideration unless it hinders navigation to the point where it is impossible. I've run into sites so poorly coded that I couldn't find the navigational links, so I couldn't review the site, and so couldn't accept it. :(
Also, I rarely edit with Flash and Java turned on, simply because of my only internet connection choice, 24k dialup. In that case, I leave the site unreviewed for another editor to look at, which may delay the inclusion, but it isn't rejected.
The MSN team must have been all "designers" and no "coders". :)
(Actually, now that I take a closer look, maybe it was all management and no designers or coders.)
It's worth noting that their chance of making an error is between 0.05% and 1%, and that DMOZ's code is around 20 TIMES more compliant.
Also, for the SERPs table to the right, we might want to take into account the number of SERPs displayed, since some errors may get repeated in each SERP.
As Tedster said, there is one in the first group that doesn't even work with some browsers. That's just sad.
The one big improvement and surprise was Altavista. A couple years ago, they had close to 1000 errors on their serps.
I know, many of them will argue that they don't go for strict w3c validation because of bandwidth concerns. However, if you take time to look at some of the serp code, you'll find a great deal of gratuitous code. They seem to be able to afford bandwidth for js rollovers, exits, and other bells - they can certainly afford the bandwidth for valid code.
So what is it? Why can't they take the time to validate?
If you present valid code, you get assured browser compatibility regardless of the browser specifics. All browser developers test their browsers with valid code from the w3c core tests. If you can produce validated code, then no matter what browser connects, you have a higher chance of putting something on their screen they can read.
If you don't validate the code, you could be writing off the 1-5% of your visitors who use nonstandard browsers. Those 1-5% can pay for a great deal of bandwidth.
I recently listened to a long (75 min) presentation from a search engine engineer in charge of large data centers for a major search engine. During that discussion, he stressed over and over KISS: Keep It Simple Stupid, and redundancy redundancy redundancy. They want to keep the system as simple as possible and always have at least two paths for data for when something fails.
Furthermore, that same search engine operates on open source tools, has contributed open source code, and has made many remarks about the number of PhDs per square foot of office space.
I feel that presenting a page that wouldn't pass a 7th grade HTML class pop quiz severely damages their tech credentials.
No errors found!
This document would validate as the document type specified if you updated it to match the Options used.
well i just checked the html 4.01 validation on google homepage, and the errors are minor. although there are pieces of code that do not validate according to w3c specs, will these actually cause any problems to users?
i don't have time to go through all the errors displayed, so i'll pick just one - quotes around attribute values. for example, google uses <body bgcolor=#ffffff text=#000000 link=#0000cc vlink=#551a8b alink=#ff0000 onLoad=sf()>, yet w3c specs say they must use quotes around the attribute values, ie bgcolor="#ffffff".
how many browsers will reject this non-validated code just because of the lack of quotes? can anyone name just one browser that will choke on this code?
even if a browser cannot cope with something like bgcolor=#ffffff because of the lack of quotes, it will simply default to its normal background color. because google has been consistent in its failure to use quotes, the same browsers will also use default colors for text, links and so on. therefore, users of poor browsers will still be able to view the page, just not in google's desired colors.
so, in order to provide validated content on just this one small piece of code, google would need to use quotes around not one, but 5 attributes. that makes 10 extra quotes (characters) per page view.
according to the google zeitgeist for 2001 (http://www.google.com/press/zeitgeist2001.html ), there were more than 150 million queries per day. you don't need a PhD to work out that for google to validate just that one small piece of code, they would need to deliver 1500 million extra characters of information per day. how many gigabytes is that? i've kinda run out of fingers ...
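the arithmetic above is easy to check with a back-of-the-envelope sketch. the 150 million queries/day figure is from the Zeitgeist page cited above; the 5-attributes-per-page count is the same assumption made in the post:

```python
# Back-of-the-envelope: extra bytes per day if Google quoted the
# 5 attribute values in its <body> tag (2 quote characters each).
extra_chars_per_page = 5 * 2      # 10 extra characters per page view
queries_per_day = 150_000_000     # Google Zeitgeist 2001 figure

extra_bytes_per_day = extra_chars_per_page * queries_per_day
extra_gb_per_day = extra_bytes_per_day / 1_000_000_000  # decimal gigabytes

print(extra_bytes_per_day)  # 1500000000
print(extra_gb_per_day)     # 1.5 - about 1.5 GB per day
```

so the answer to "how many gigabytes is that?" is roughly 1.5 GB per day, just for those ten characters on the homepage.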
bear in mind that this is just for the one piece of code. try a search that produces 10 results. how many missing quotes around attributes in that page? multiply that by 150 million, work out how many gigabytes of information that becomes and add it to the above.
if you can find out how much bandwidth costs in bulk, you can get a pretty good idea how much they save by stripping out a few irrelevant characters. don't forget that delivering less content also speeds up delivery of content - sure, a few quotes won't make much on one page, but on 150 million page views it's got to make one hell of a difference.
brett, i reckon some people deserve far more credit than you give them.
>non-validated code just because of the lack of quotes
No one knows. A problem could be caused by combined or accumulated minor errors (see: MSN.com).
Either way, that's not the point. The core of it is: if some of the top tech centers on the internet can't follow the agreed-upon standards, then they deserve no quarter and no respect for their technology. Their tech credentials are tainted and their credibility stretched.
> how many gigabytes of information that becomes and
> add it to the above.
They have several hundred bytes of trivial, deletable code on their serps. If it were a bandwidth issue, they wouldn't be using that code in the first place. Gratuitous css and js for what appears in the browser to be html 3.2? Not to mention the "cached" page distribution. Obviously, bandwidth isn't the issue. The only thing I can conclude is that it's either flat-out laziness or disrespect for the agreed-upon standards.
As Brett notes, the presence of gratuitous code negates the "bandwidth excuse." Not offering valid code makes these sites look bad and definitely casts doubt on their tech credentials.
I wonder who among the group will be the first to offer valid code? That would carry a lot of PR value - as in PUBLIC RELATIONS - if their marketing people were savvy enough to capitalize on the implied credibility.
validation for what purpose? validation for the purpose of validating? validation for the purpose of gaining credibility for creating validated code?
>>gratuitous css and js
>>who among the group will be the first to offer
>>valid code at some point? That would carry a
>>lot of PR value - as in PUBLIC RELATIONS
i doubt it. how many people actually know that google code doesn't validate to some w3c standard? how many people know what w3c is? how many people care?
the answer is that, outside the world of web experts, very few people know or understand things like this, and even fewer care. people go to google to search for something, to get relevant results quickly and efficiently. they get what they want, then go away. that's it. that's all that matters.
(It's quoted in several RFCs.) This ethos is part of why it is so easy for widely differing hosts and servers and agents and browsers and everything else scrabbling for a living on the Internet to intercommunicate.
But it's taking unfair advantage to interpret it as:
One analogy is grammar and spelling. We all communicate better if we attempt to write our best, while not fussing too much over others' lapses.
bBUT. THaht dontmean u shld rite anywitchwatys juss cos im not cared much usually n.e.ways. :)
The effort to understand does not belong in the browser, even though browsers do sterling work here. The effort to communicate clearly according to agreed standards belongs squarely with the sites' producers.
The core of it is: if some of the top tech centers on the internet can't follow the agreed-upon standards, then they deserve no quarter and no respect for their technology. Their tech credentials are tainted and their credibility stretched.
Brett, please correct me if I got it wrong. I suppose you are talking about Google - without doubt the finest search engine, as confirmed many times in these forums - and you believe that they deserve no respect for their technology just because "they can't follow the agreed upon standards"?!
They apparently don't use any quotes around color codes and other HTML attributes. So what? They seem to do this knowingly, to reduce the page size.
If you don't validate the code, you could be writing off the 1-5% of your visitors who use nonstandard browsers. Those 1-5% can pay for a great deal of bandwidth.
Well, I am sure any modern browser can render the Google web site right. And there is a very good chance that even archaic browsers like Netscape 1.x can do it right. Because Google's code is simple and elegant. Just take a closer look.
As for other web sites, it all depends on the type of your audience. If your core audience believes that code validation is the most important thing in a web site, then by all means make sure that your code validates. Otherwise I don't see any good reason to check every single quote. If it looks good in IE, Netscape and Opera (99% of all users), then why bother? If the remaining 1% refuse to use a real browser, then it's their problem.
The web has been struggling for years to develop a set of standards that everyone should follow so that we can clean up the mess that has developed over the years - and there is a mess!
It would only be appropriate for the web's largest properties to validate their websites against the W3C standards and help educate the public on the benefits of validation.
What if Google decides to make validation part of the algo? I'll bet everyone's tune changes then. I figure I'd stay one step ahead of everyone else and validate now, before it's too late!
It's all about writing clean html, xhtml, css, js, etc... If it weren't for IE, we'd all be writing valid code. I've said this before and I'll say it again: NN4.x is one of the best website validators out there. If you can get it to work in NN4.x, it will work everywhere else!
And you know what, it feels really good to be able to advertise that you've validated. So what if many don't know about it. They will after you validate and advertise it! Let the voice be heard. Man, I sound like some sort of W3C groupie, huh?
The Internet's major properties could change the web overnight if they started promoting validation! Think about it: if Google posted a W3C icon on their home page tomorrow, this community would go over the edge. There would be topics all over the place discussing validation. There would be a lot of sleepless nights for many while they worked round the clock to clean up their acts!
Either work towards validation now, or do it later when it may be too late. Just think, you'll be in the top 10% of developers/designers who are leading the way into the future. Someone's gotta lead and it might as well be us. I hate following, don't you?
P.S. We know you are following the thread...
leading the way into the future, or desperately clinging to a utopian fantasy?
w3c standards are developed way too late - technology developers and browser authors have moved on a long long way by the time w3c create a standard for any new technology.
standards are set not by w3c, but by the technology and browser authors. w3c then come along and set their own "official" standards. new browser versions are then modified to incorporate the new "official" standards, but support for the original "unofficial" standards remains. browsers must continue to support old, non-compliant code, since rejecting it would render the vast majority of the web useless.
until w3c get ahead of the game and work with developers and browser authors to create the standards together, w3c standards are worthless.
If I'm not mistaken, I thought the W3C was the authority on web standards, not the browser authors.
> w3c then come along and set their own "official" standards.
I thought the W3C set the standards first and then the browser authors and web developers decided that it was easier to code this way because it was quicker for them and less time consuming to try and validate against the standards.
> new browser versions are then modified to incorporate the new "official" standards, but support for the original "unofficial" standards remains.
Aren't they usually modified after the fact, when the authors find out that what they produced is not functioning the way it should across multiple platforms?
> browsers must continue to support old, non-w3c standards compliant code as to reject it would render the vast majority of the web useless.
That's because the W3C could never gain a solid footing with the standards. Non-compliant code is rampant on the web, and trying to clean up now may be too late. But if the major players in the industry decide that it is now time to follow standards, what do we do?
It's unfortunate, but I think the web is too far gone to establish a solid set of standards that are followed by all. It's a hodgepodge of invalid code out there, and people will just continue to justify their invalid code by saying they are producing what works. If that keeps up, we will just keep seeing browsers that display web pages differently based on the standards they are following. Then all of us here discussing this will continue to produce hacks that make up for the non-compliant code being spewed forth by those who insist their non-compliant code works - 95% of the time! ;)
Crazy_Fool, no personal attack intended. I became a W3C groupie a couple of years ago and have since seen the light. Compliance is going to become a major issue in the very near future. I just think we need to prepare ourselves for that shift, and now is the best time!
Should Linus' homepage be bumped down serps or should he drop all the other *unimportant* stuff he does and learn correct html?
Weinberger goes on to relate the following joke:
A man goes to a doctor. "Doc, it hurts when I go like this," he says, poking himself gently in the foot with his index finger. "It hurts when I go like this," he says, poking his knee. "It hurts when I go like this," he says as he pokes his thigh. He proceeds the same way up to the top of his head.
"I see," says the doctor. "You've got a broken finger." :)
I thought the W3C set the standards first
Very funny. In html 3.2, the w3c added "widely deployed features such as tables, applets, text flow around images, superscripts and subscripts"
What is so difficult about writing valid html?
For those of us who understand valid html - who know there is such a thing as valid html - there is nothing difficult or clever about it.
For the other 99.9 percent of the world population... well, that's another matter :)
I have no plans for moving to xhtml until there is a valid reason to do so. The only reason at the moment is to allow other folks to rip off your pages real easy... how often do you need to programmatically access the content of Google's homepage?