Forum Moderators: Robert Charlton & goodroi
Slow down!
There is something unclear here: what does "W3C valid" actually mean?
There is CSS validation, there are five to seven different (XHTML and HTML) doctype options, and then there are things that are not checked at all, either because they happen server-side or because they are simply outside the validator's scope.
Further validation might include various levels of user access.
The fact that one should NOT use the same keywords in two ALT texts, or that one should NOT place the word "image" in an image's alt text --- all these W3C validation checks count highly with Google!
[edited by: tedster at 11:37 pm (utc) on Oct. 30, 2006]
The turn towards a "partnership" between those in SEO and the W3C is achieved when we see that H1 through H6 content is actually part of the check that the validator offers --- and that a page being 99% valid, with no major tag errors, is certainly a sure way to guarantee that cross-browser reading of the page will still work IN THE FUTURE, when (50 or 150 years from now) markup has changed and whatever was not done right will be lost.
You may say, "What do I care for my keyword placement when I am dead?" The answer is that if your product, your firm, your essay, your photography, or whatever it is that you have made and placed in the rankings is SAVED over time, it will be saved owing to its markup.
Value for long-term placement comes, by pure logic, from the W3C.
I'm not saying that you "shouldn't" produce valid code, but rather that Google does not care right now, as long as the errors do not make the mark-up unspiderable. And googlebot does have some decent error-recovery routines. It has to!
This kind of discussion is one of the reasons I started the thread -- to help us sort out the wheat from the chaff.
Is W3C validation a Best Practice for web development? Absolutely. Is validation an important element for Google rankings today? Not one bit.
What you're asserting does not affect rankings, and it's incumbent on you to state whether your assertions are
-True
-Probable
-Opinion
-Myth
with regard to Google rankings. So which is it to be?
Question: does anyone think that there are certain concepts or words that, if used in your title and description, will be "liked" by Google? Crazy?
Side point, I suppose, but Google does definitely penalize some invalid code, for instance malformed title or description tags. I've seen instances of malformed head section code (no ending tags), and what happened is that the snippets got messed up - no penalty, though. In some cases it may make it impossible to properly parse the page, which isn't the same thing as affecting scoring.
I came to this by chance as I run several sites for senior citizens - and made the markup for them - and the keyword ranking for these pages tops all my others.
Sequence is *not* an indication of consequence - it's most often a coincidence. It's faulty logic like that which gives birth to unfounded myths.
Incidentally, Matt Cutts clearly stated in one of his videos that Google has no signal for validated code.
Pages without descriptions (that can be read) get deindexed and/or dumped into click for omitted results hell. A penalty is having a handicap or disadvantage imposed like this. It doesn't have to be a death sentence.
Google penalises malformed tags - that is, if you stuff the page up so it cannot make sense of it, or it thinks you are trying to scam it.
Yes, it is a breach of standards to leave out your alt attributes, and I think Google is not keen on this, simply because it's bad practice not to describe what the images are.
But no, my testing shows no penalising for not being W3C compliant. The top site in my main area of competition is nothing like compliant - PageRank 7.
There is plenty of compliant competition - no way they could be no. 1 if compliance were an issue. 172 failures on their home page alone, and they just aren't that much better than the other sites that they could survive a penalty for this.
However.....
- That's not to say they won't in the future.
- Complying with XHTML standards may have its benefits - e.g. alt attributes in img tags.
- In regard to the whole XHTML + CSS approach: if you do a site redesign, it should only require a CSS update, meaning your users see a shiny new site but Google sees no change.
If someone simply launched a dozen pages with various degrees of HTML errors at the same time, at the same backlink-graph and folder level of a website, we might see after a few days or weeks which of these pages has been indexed. All you need is some absolutely zero-competition keywords in each of the pages. However, this would not tell us anything about the impact of HTML conformity on ranking. But it would indeed be quite interesting to explore the limits of googlebot's error tolerance.
The problem is that the universe of all possible HTML errors is actually quite large. Any ideas on how to narrow down this issue?
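One way to narrow it down: test by error *class* rather than by individual error. Here's a minimal Python sketch of that idea - the error classes and the nonsense keyword below are illustrative assumptions, not a real taxonomy - generating one page per class, each carrying a unique throwaway keyword so indexation can later be checked per class:

```python
# Sketch: one test page per error class, each with a unique nonsense
# keyword, so indexation can later be checked per class of error.
# The error classes below are illustrative assumptions only.
ERROR_PAGES = {
    "valid":          "<html><head><title>{kw}</title></head><body><p>{kw}</p></body></html>",
    "unclosed_title": "<html><head><title>{kw}</head><body><p>{kw}</p></body></html>",
    "title_in_body":  "<html><body><title>{kw}</title><p>{kw}</p></body></html>",
    "unclosed_a":     "<html><head><title>{kw}</title></head><body><a href='/x'>{kw}</body></html>",
}

def build_pages(base_kw="zqxwv"):
    """Return {error_class: page_source}, one distinct keyword per page."""
    return {name: tpl.format(kw=f"{base_kw}-{name}")
            for name, tpl in ERROR_PAGES.items()}

pages = build_pages()
for name, src in pages.items():
    print(name, "->", src[:40])
```

Upload them at the same folder depth with the same internal linking, wait, and then check each keyword - that isolates googlebot's error tolerance from ranking effects, which is exactly the distinction the post above makes.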
Although I agree that W3C-valid code does not outweigh other ranking signals today, I agree with Tedster that it certainly is good web design practice, and there's no evidence IT HURTS.
[validator.w3.org...]
And googlebot does have some decent error-recovery routines. It has to!
From a programmer's perspective, there is a much better solution than recovering from bad markup - simply ignore most markup by throwing out every tag that they don't care about.
When they only look at the tags they care about, those tags can have an impact. They just need to be formed well enough for Google to understand them. It isn't a validation issue, it is an issue about being able to recognize what that specific tag means.
<title>super green widgets</title>
is recognizable even if it doesn't validate. Put it in the body instead of the head. Put <center> tags around it. Add some unknown attribute that doesn't validate. That is how you test this claim: does an invalid version of a well-formed title tag still have the desired effect?
The same goes for alt attributes. Yes they help, but can you come up with an invalid one that has the same effect? I bet you can.
There is no need to recover from invalid markup that you are ignoring.
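A minimal sketch of that tag-extraction approach, using Python's lenient `html.parser` (an assumption about how a crawler *might* work, not Google's actual code): pull the <title> text while ignoring every tag you don't care about, even from markup that would never validate.

```python
from html.parser import HTMLParser

class TitleExtractor(HTMLParser):
    """Pull the text inside <title> while ignoring all other tags,
    even when the surrounding markup would never validate."""
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data

# Invalid on three counts: title in the body, wrapped in deprecated
# <center>, carrying a made-up attribute. The extractor doesn't care.
broken = '<body><center><title foo="bar">super green widgets</title></center>'
p = TitleExtractor()
p.feed(broken)
print(p.title)  # super green widgets
```

The validator would flag all three problems, but a parser that only looks for the tags it cares about still recovers the text cleanly - which is the point being made above.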
Could someone please show me a page that validates correctly? I have yet to see it...
I have several pages that validate as HTML 4.01 Strict (my whole website is built on valid HTML).
But it's only a couple of months old and has been valid since before I published it. So I haven't been able to measure or test whether valid markup helps a website in the SERPs.
If Matt Cutts has said in a very recent video that Google does not recognize validity, then can someone explain how it's used toward ranking? Or is Matt mistaken? Who wants to say that they are right and Matt Cutts is wrong?
Too many times I see sites with unclosed title tags - something a validator would catch. How do you think Google penalizes these sites? Would you like to gamble your SERPs?
Understand how HTML works and concentrate on proper title tags, description and meta tags, and, most important, href tags and your content.
If your Flash doesn't validate, that really doesn't matter, since Google struggles to index it anyway....
Validating forces you to learn the academic roots of HTML -- and that helps, because Google and other search engines by their very nature MUST come at their job from a strict academic and theoretical angle. So, as I see it, the more you understand about HTML, the more you will know about how to send the clearest possible signal to the search engines.
So it's a spill-over effect that improves your rankings, and not a direct algo component. That same kind of spill-over happens when you take on accessibility disciplines as well.
The error in thinking that needs to be undone is considering html as a kind of layout language. It isn't that at all, it's a "mark-up" language that assumes a starting document and then adds "mark-up" to clarify the various portions of the document in a semantic sense.
The rendering of a document, whether visual, aural or whatever, is not involved in the core discipline of html. And search engines are looking for those semantically clear signals for the relevance of your document.
100% agree.
In the course of the past two years, I am sure I have looked at the source code of over 1000 web sites, mostly in top ten rankings. It's amazing how many would not validate. It's amazing how many break in Mozilla and Safari, yet are in the top ten, and some are #1. It's incredible how many don't even have a doctype. The navigation on many is appalling. If it wasn't for the back button... well, you understand what I am saying.
No - Google couldn't care less if it validates. Sometimes I wonder if it doesn't help. Another facet of the dumbing down?
The original question was if "non W3C compliant code will harm the site's ranking" - I've said it was a myth.
On the other side: of course, having compliant code will only help.
alfawolf7, the W3C validation that I meant is the one they have on this page [validator.w3.org...]
(Having outbound related links helps - True ;c)
Another point, as Marcia pointed out, was validation vs. HTML errors. In the original post I assumed we were discussing the differences between old HTML and new HTML - and not HTML errors (the validator catches those too) - so:
HTML errors will hurt site's ranking - True
Example: one webmaster opens an <a href> and never closes it - how does the spider know where to close it?
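A small sketch of that unclosed-anchor problem, again using Python's lenient `html.parser` (an illustration, not how any search engine actually resolves it): with no closing </a>, a naive extractor absorbs everything that follows into the link text.

```python
from html.parser import HTMLParser

class AnchorText(HTMLParser):
    """Collect the text each <a> wraps. With an unclosed <a>, every
    word that follows gets absorbed into the link text."""
    def __init__(self):
        super().__init__()
        self.in_link = False
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_link = True
            self.links.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link:
            self.links[-1] += data

good = AnchorText(); good.feed('<a href="/w">widgets</a> and more text')
bad = AnchorText();  bad.feed('<a href="/w">widgets and more text')
print(good.links)  # ['widgets']
print(bad.links)   # ['widgets and more text']
```

So the spider either guesses where the link was meant to end or swallows unrelated words as anchor text - either way, the signal you intended to send gets muddied.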
So I say that validation helping SEO is a myth. Personally, I find good old-fashioned HTML 4.0 ranks better than any XHTML site I ever did with the same techniques.
I believe table-based HTML can rank better than DIV-based code, or at least the way I write it makes a difference.
I think it's now a matter of preference whether or not you want your code to validate. Old JavaScript also has a tendency not to validate.