Video: Why Doesn't Google Validate?

Forum Moderators: Robert Charlton & goodroi

pageoneresults

11:48 pm on Sep 16, 2009 (gmt 0)

Why doesn't google.com validate?
[YouTube.com...]

We don't give any sort of boost to web pages that validate. The vast majority of pages on the web don't validate.

I'm so happy that video came out. :)

tedster

4:15 am on Sep 17, 2009 (gmt 0)

Interesting comments - the primary goal for Google's code is compatibility across many devices. But they are open to any staff member who wants to take on a validation project - in their 20% time, though, not as their official job.

pageoneresults

3:01 pm on Sep 17, 2009 (gmt 0)

Everyone should keep in mind that this video is about Why Google Doesn't Validate and not about Why You Shouldn't Validate!

I woke up this morning to 15+ emails pertaining to this video, thanks Google. I think you're going to assist me in proving some points in the very near future.

I wrote an extensive article on 2009 September 08 on why I thought Google didn't validate. I read my article, listened to Matt's video and it's almost as if it were the other way around. I mean, my article mimics what Matt says almost word for word. ;)

I've watched the video multiple times and have transcribed it word for word so those who use it as a crutch can't get away with what they are about to do. That video is being shared amongst developers like candy right now. It's going to be the one thing they present before being handed their pink slip for failure to follow protocol and breaking the site. Again, don't even think about using this video as an excuse not to validate your own documents.

Typically we've been a little more willing to say things like "oh, we don't need that double quote or something like that" or we'll specify a color in a way that doesn't validate.

I've been coding now for over 10 years. For the life of me, I cannot find any reason to specify a color in a way that doesn't validate for the web. I'm very familiar with the old school HTML syntax required for email marketing and such. One of these days they'll catch up.

The article I wrote on the 8th takes Google's home page and dissects it byte for byte, well, almost. Since I don't know the intricacies of the dynamics, I can only guess at the solutions to existing malformed and invalid syntax. I looked at every single error and warning. I documented those and provided the fix for each one that was visible to me.

The very first thing that Matt states in the video is this...

Google looks at the number of bytes that we actually return to users and we want that to be as small as possible because every byte matters when you are serving up hundreds of millions of search requests to users.

I'm on board with that. I can just imagine the volume of bytes being served by Google. Here's the part that confuses me. If you are saving bytes at this level, what exactly happens when the UA has to process that invalid code?

Idiosyncratic browsers. Worried about compatibility vs validation.

I'm also on board with this. We all work hard to make sure our websites display properly across a variety of platforms and devices. I still don't see how that justifies some of the 1990s code practices at play here. You'd have to peel apart Google's home page to fully understand the extent of the coding syntax and what they are doing.

For example, the first 7 errors reported for Google are from the <body> Element...

<body bgcolor=#ffffff text=#000000 link=#0000cc vlink=#551a8b alink=#ff0000 topmargin=3 marginheight=3>

I remember having to include those attributes way back in the IE vs Netscape days. IE had their margin attributes and Netscape had theirs. That was during the old school browser wars.

If you look at the CSS Google is using and then look at the invalid markup they are using, it makes you wonder WHY they can't clean up some of this stuff. I fully understand the bytes issue. But the above can easily be handled via CSS with fewer bytes.
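
To put rough numbers on the bytes question, here is a minimal Python sketch that measures the <body> tag quoted above against a CSS equivalent. The CSS rules and the names LEGACY_BODY, CSS_RULES, CLEAN_BODY, byte_len and compare are illustrative assumptions, not Google's actual stylesheet, and the margin mapping is only approximate.

```python
# Original <body> tag, copied from the post above.
LEGACY_BODY = (
    "<body bgcolor=#ffffff text=#000000 link=#0000cc "
    "vlink=#551a8b alink=#ff0000 topmargin=3 marginheight=3>"
)

# Hypothetical CSS replacement (an assumption, not Google's stylesheet).
# "margin:3px 8px" only approximates topmargin/marginheight.
CSS_RULES = (
    "body{background:#fff;color:#000;margin:3px 8px}"
    "a:link{color:#00c}a:visited{color:#551a8b}a:active{color:red}"
)
CLEAN_BODY = "<body>"


def byte_len(text: str) -> int:
    """Return the number of bytes the text occupies when sent as UTF-8."""
    return len(text.encode("utf-8"))


def compare(legacy: str, css: str, clean: str) -> dict:
    """Byte counts for the attribute-based tag and the CSS-based version."""
    return {
        "legacy": byte_len(legacy),
        "css": byte_len(css) + byte_len(clean),
    }


if __name__ == "__main__":
    print(compare(LEGACY_BODY, CSS_RULES, CLEAN_BODY))
```

Whether the CSS version wins on a single response depends on which browser defaults can be dropped and on whether the rules sit in a cached external file, so the sketch reports both counts rather than assuming the outcome.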

The vast majority of pages on the web don't validate.

That statement might have been true 5 years ago. That is no longer the case in many industries. Things are starting to shape up. Out of the 520 website home pages we are now monitoring, 68 of them (13%) are valid. If I had performed this exercise 5 years ago, that number might have been more like 3%. So, the above statement may still be somewhat valid, but I don't think it is a viable excuse for not writing valid markup and following protocol.

We have to crawl, index and return results on the web. Even if pages don't validate.

That's the bottom line. And, we all know that the bottom line is usually the overriding factor. After building a crawler over the past few years, I fully understand the procedures involved with crawling and indexing. The set of error handling routines we have in place is pretty involved. And you know what? During our crawling and comparisons, sites with invalid code require a little more processing time on our end due to the error handling routines.
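
As a rough illustration of why malformed markup means extra work for a parser, here is a toy Python sketch that counts how often a tolerant parser has to recover from stray or unclosed tags. It is a stand-in built on assumptions, not the crawler described above and not a validator: HTML permits omitting some end tags (for example </li>), and this sketch would still count those as recoveries. The names RecoveryCounter and count_recoveries are made up for the example.

```python
from html.parser import HTMLParser

# Elements that never take an end tag, so they are never pushed on the stack.
VOID_TAGS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class RecoveryCounter(HTMLParser):
    """Tracks open elements and counts every point where recovery is needed."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.recoveries = 0

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return  # end tags for void elements are ignored in this sketch
        if tag not in self.open_tags:
            self.recoveries += 1  # stray end tag with no matching start tag
            return
        # Pop until the matching start tag; each skipped element was unclosed.
        while self.open_tags:
            if self.open_tags.pop() == tag:
                break
            self.recoveries += 1


def count_recoveries(markup: str) -> int:
    """Return how many recovery steps the toy parser needed for the markup."""
    parser = RecoveryCounter()
    parser.feed(markup)
    parser.close()
    # Anything still open at end of input was never closed.
    return parser.recoveries + len(parser.open_tags)
```

Well-nested input passes through without touching the recovery branches, while each stray or unclosed tag adds at least one extra step, which is the kind of overhead the post is describing.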

So, is Google stating that they don't care about valid code? That Webmasters can code however they wish and that the almighty Google/Googlebot will figure it out while crawling? Here's a question and one that will probably never be answered.

If I have two identical sites (twins) and one is valid, the other is invalid, which site performs better overall? That question will never be answered because in real life, it doesn't happen. There is no way to perform testing at this level. I've thought about it for years and it just isn't going to happen.

That's okay. I have many individuals right now who are interested in cleaning up their markup errors. In fact, I've been providing a few free consultations to various folks and assisting them in getting things in order. We have one person who has documented everything and will be monitoring through the rest of the year. Not to mention many others who have cleaned up and learned quite a bit in the process. ;)

I'm happy that Matt released this video as it explains why Google doesn't validate. Nowhere, and I do mean nowhere in that video does Matt explain why your site doesn't or shouldn't validate!

buckworks

3:31 pm on Sep 17, 2009 (gmt 0)

One of the comments on that video said this:

"real world tests show the things that go into clean, valid and accessible code DO help websites do better in SEO."

I am convinced of that too. Validation might not be assessed directly but a developer who strives to write valid code will likely do other things better than average which do make a difference to one's SEO results. It's all part of the mindset.

For me personally, some of the most important thoughts I ever read about the mind that is committed to valid code are Claus's comments in this thread:

[webmasterworld.com...]

That said, "perfect" validation is not always realistic, but "better" validation usually is.

FranticFish

4:10 pm on Sep 17, 2009 (gmt 0)

I'm no expert on browser speed / performance, but how many browsers DON'T support CSS? If the CSS file makes things a little slower the first time, how many people are return visitors? Surely after the first visit (even assuming that is slower) it's a win-win situation. Google change their look/layout next to never so how often would the CSS file change?

And as far as compatibility goes, my experience is that valid code is MORE, not less cross-browser compatible.