I'm glad it's nearly done, for several reasons. For one thing, I cleaned up several of my own markup messes.
Here's my question:
Might all this work have a positive impact on my rankings in the Google and/or Yahoo SERPs? Maybe some small advantage I would have missed by not validating?
Or, do G and Y frankly not care one way or the other?
Has anyone run any experiments along these lines?
Your comments are eagerly awaited. While I'm waiting, I will try to "validate" some junk pages that rank higher for my keyword(s). That should tell me something.
Best wishes - Larry
> Yahoo does not validate.
> Google does not validate.
Neither site needs to worry about being spidered at all.
> Microsoft does not validate.
> Ebay does not validate.
> Amazon does not come close to validating.
Whilst these sites won't mind being spidered, none of them need to worry too much - their backlinks are overwhelming, and their business models don't depend on the bulk of their pages being well-ranked in the SERPs.
> Wired validates
> Webmasterworld does not validate.
> Blogger does not validate.
They are all well-formed HTML nonetheless - which, as the messages above mention, is the most important part of the equation - and that matters far more than a few missing trailing slashes on Blogger or the odd nesting problem on this site (which consists mostly of content inserted by outsiders).
It depends what game you're in - and if you want to help out the bots on your site and make sure it gets spidered properly, validation (for well-formedness) is a no-brainer.
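If "well-formed" sounds abstract, here's a made-up two-line example of what it means in practice:

Not well-formed (the tags overlap):
<b><i>blue widgets</b></i>

Well-formed (tags closed in reverse order of opening):
<b><i>blue widgets</i></b>

A page can be well-formed like this and still fail full validation on smaller points, but it's the overlapping kind of mess that really confuses parsers.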
Google itself does not validate, yet its site renders properly in perhaps more browsers than 99.9999% of all websites.
So, you've never seen the flaky Adsense pages (which are so badly-written they sometimes don't display at all), and you don't use Gmail (which works only on a very select group of browsers).
e.g. a mis-hit key - a ? typed instead of a >
On one of my pages there was a stray </center> lying around in one of my data cells, left over from a cut-and-paste miss.
The W3C validator interpreted this as a </table>.
To get around nesting errors, earlier browsers just assumed that whatever closing tag they ran into was closing the most recently opened element, no matter what the closing tag actually was. This kept the browser's internal routines from crashing - but that assumption could really subvert what the page author THOUGHT they were putting out.
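To picture it, here's roughly the sort of fragment that causes the trouble (invented markup, the same shape as the page described above):

<table>
<tr>
<td>Some data</center></td>
</tr>
</table>

A modern browser will usually just ignore the stray </center>, but a parser using that old "close whatever was opened last" recovery can decide the cell - or the whole table - just ended there, and everything after it gets mangled.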
One thing I noticed is that this thread mainly focuses on how validation/standards conformance can help inclusion, but no-one seems to have touched on relevancy, which, after all, is most of what SEO is about.
The W3C, as explained before, is looking towards the 'semantic web', accessible by all. The idea is that the standards will eventually be adopted by web sites and browsers alike, meaning that all sites will work as expected in all browsers - making the web more accessible to the world!
Google shares the same vision - its aim is to make the web more accessible through its search. So it makes sense that Google should favour sites that validate, because a validating page has a better chance of being accessible than a non-validating one.
Obviously, this is one of many, many factors to take into consideration, so you may not see a great deal of benefit, but if everyone works towards the semantic web (the online dream!) the web will be reachable by more people, more easily.
Oh, if I should add one thing, it would be exactly this:
Working with data structure is essential - as there's no one centralized body able to convert all the websites out there, it's necessary to suggest standards and urge people to follow rules for what types of content go where, and when.
That, in turn, is the role of the W3C. So, here's the news: the database that we're all creating is not perceived as having only one user interface, i.e. a browser (or even a PC). The user interface can also be, e.g., a spreadsheet, an RSS reader, an email client, a cell phone, a TV set, a media player, a cash register, an ATM, a fridge, or even a watch, a jacket, or a pair of shoes.
Whoa, that's sci-fi, isn't it? Um... not really. Regarding alternative user interfaces, there are actually a lot of them already. E.g. I've got a standard spreadsheet of financial data that updates itself with the latest information from selected web pages as I open it - that's a good deal more useful than looking through all those pages and entering the figures manually.
Only, by and large, the data is still "corrupt" - that is, it's very hard to turn web pages into information without having to actually read and comprehend them.
Consider the new thing that Google introduced way-back-when: it looked at pages, not web sites. The "search engine" in this scenario will go one step further - it will look at specific information, not pages. It will not care about your widgets web site, nor will it care about your blue widgets page; instead it will go directly for the price, size and quantity of "widget X". You would not ask "give me the most relevant pages on widget X" - instead you would ask "give me all the makes and models of widget X, sorted by price and location of dealer".
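To make it concrete, compare two invented ways of publishing the same widget data - the names and prices here are made up:

Hard for a machine to extract:
<p>The amazing Widget X-100 is only $9.95 - get yours in Springfield today!</p>

Much easier, because the structure says what each value is:
<table summary="Widget X models and dealers">
<tr><th>Model</th><th>Price</th><th>Dealer location</th></tr>
<tr><td>Widget X-100</td><td>$9.95</td><td>Springfield</td></tr>
</table>

The second version is the kind of thing a "specific information" engine could sort by price and location without having to read and comprehend prose.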
Yes, Froogle is one step in that direction, and so is local search, the "define:" tool, the calculator and so on, but... boy, we have a loooong way to go with data quality on those billions of pages out there.
Yes, the W3C validator might as well be talking in Klingon or Vogon when you are new to it, but in helping to fix several hundred sites over the last 4 or 5 years I have realised that 90% of the HTML errors are the same dozen errors every time, and the next 9% are another dozen or so. So by learning about two dozen fixes you can correct about 99% of all the problems out there... yeah, really!
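For the curious, here are a few of the repeat offenders I mean, with simplified, invented examples of the fixes:

Unencoded ampersands in URLs:
Wrong: <a href="page.php?a=1&b=2">
Right: <a href="page.php?a=1&amp;b=2">

Missing alt attributes on images:
Wrong: <img src="logo.gif">
Right: <img src="logo.gif" alt="Acme Widgets logo">

Block-level elements inside inline ones:
Wrong: <b><p>Some text</p></b>
Right: <p><b>Some text</b></p>

Learn a couple of dozen patterns like these and the validator stops speaking Klingon.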
My experience is that when I make sure all pages consist of headings, paragraphs, lists, tables and forms, move all CSS and JS out to external files, and see that the <title> and meta description are good, the headings are used sensibly, and the code validates, the pages rocket up the SERPs a few days or weeks later. Valid code cannot harm you, and might be helping. Non-valid code might be harming you big time!
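By way of illustration, the sort of skeleton I aim for looks something like this (HTML 4.01 Strict; the filenames and text are invented):

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>Blue Widgets - Acme Widget Co.</title>
<meta name="description" content="Hand-made blue widgets, shipped worldwide.">
<link rel="stylesheet" type="text/css" href="site.css">
<script type="text/javascript" src="site.js"></script>
</head>
<body>
<h1>Blue Widgets</h1>
<p>Body copy in real paragraphs, lists, tables and forms...</p>
</body>
</html>

All the presentation lives in site.css, all the behaviour in site.js, and the markup itself is nothing but structure.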
This Google Search [google.co.uk] always amazes me.
I especially like the ones with a /title> tag.
Why would anyone set out to write non-valid code anyway, when it is so easy to fix it?
Ahh my friend, here is the problem....
In my experience web design agencies are exactly that - DESIGN agencies. Some of them haven't even heard of the W3C! I once asked an agency if they could do a print stylesheet for a site I wanted designed. Their answer? "A what now?"
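For anyone else wondering "a what now?": a print stylesheet is just one extra line in the head plus a small CSS file. The selectors here are invented for the example:

<link rel="stylesheet" type="text/css" media="print" href="print.css">

And in print.css, something like:

#nav, #ads { display: none; } /* don't waste paper on navigation and adverts */
body { font: 12pt serif; color: #000; background: #fff; }

That's all it takes to get pages printing cleanly.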
Thing is, most companies don't know anything about site design, so they think they're getting a great-looking website, but what they actually get is a nice-looking site that's badly put together and will cost them a bomb to make search-engine friendly (the reason I'm with an SEO agency ;) )