I'm glad it's nearly done, for several reasons. For one thing, I cleaned up several of my own markup messes.
Here's my question:
Might all this work have a positive impact on my rankings in the Google and/or Yahoo SERPs? Maybe some small advantage I would have missed by not validating?
Or, do G and Y frankly not care one way or the other?
Has anyone run any experiments along these lines?
Your comments are eagerly awaited. While I'm waiting, I will try to "validate" some junk pages that rank higher for my keyword(s). That should tell me something.
Best wishes - Larry
> Yahoo does not validate.
> Google does not validate.
Neither site needs to worry about being spidered at all.
> Microsoft does not validate.
> Ebay does not validate.
> Amazon does not come close to validating.
Whilst these sites won't mind being spidered, none of them need to worry too much - their backlinks are overwhelming, and their business models don't depend on the bulk of their pages being well-ranked in the SERPs.
> Wired validates
> Webmasterworld does not validate.
> Blogger does not validate.
They are all well-formed HTML nonetheless - which, as the messages above mention, is the most important part of the equation - and that matters far more than a few missing trailing slashes on Blogger or the odd nesting problem on this site (which consists mostly of content inserted by outsiders).
It depends what game you're in - and if you want to help out the bots on your site and make sure it gets spidered properly, validation (for well-formedness) is a no-brainer.
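If "well-formed" sounds abstract, here's a made-up two-line example of what it means in practice:

Not well-formed (the tags overlap):
<b><i>blue widgets</b></i>

Well-formed (tags closed in reverse order of opening):
<b><i>blue widgets</i></b>

A page can be well-formed like this and still fail full validation on smaller points, but it's the overlapping kind of mess that really confuses parsers.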
Google itself does not validate, yet its site renders properly in perhaps more browsers than 99.9999% of all websites.
So, you've never seen the flaky Adsense pages (which are so badly-written they sometimes don't display at all), and you don't use Gmail (which works only on a very select group of browsers).
e.g. a mis-hit key - a ? typed instead of a >
On one of my pages there was a stray </center> lying around in one of my data cells, left over from a cut-and-paste miss.
The W3C validator interpreted this as a </table>.
To get around nesting errors, earlier browsers just assumed that whatever closing tag they ran into was closing the most recently opened element, no matter what the closing tag actually was. This kept the browser's internal routines from crashing - but that assumption could really subvert what the page author THOUGHT they were putting out.
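To picture it, here's roughly the sort of fragment that causes the trouble (invented markup, the same shape as the page described above):

<table>
<tr>
<td>Some data</center></td>
</tr>
</table>

A modern browser will usually just ignore the stray </center>, but a parser using that old "close whatever was opened last" recovery can decide the cell - or the whole table - just ended there, and everything after it gets mangled.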
One thing I noticed is that this thread mainly focuses on how validation/standards conformance can help inclusion, but no-one seems to have touched on relevancy, which, after all, is most of what SEO is about.
The W3C, as explained before, is looking towards the 'semantic web', accessible by all. The idea is that the standards will eventually be adopted by web sites and browsers alike, meaning that all sites will work as expected in all browsers - making the web more accessible to the world!
Google shares the same vision - its aim is to make the web more accessible through its search. So it makes sense that Google should favour sites that validate, because a validating page has a better chance of being accessible than a non-validating one.
Obviously, this is one of many, many factors to take into consideration, so you may not see a great deal of benefit, but if everyone works towards the semantic web (the online dream!) the web will be reachable by more people, more easily.
Oh, if I should add one thing, it would be exactly this:
Working with data structure is essential - as there's no one centralized body able to convert all the websites out there, it's necessary to suggest standards and urge people to follow rules for what types of content go where, and when.
That, in turn, is the role of the W3C. So, here's the news: the database that we're all creating is not perceived as having only one user interface, i.e. a browser (or even a PC). The user interface can also be, e.g., a spreadsheet, an RSS reader, an email client, a cell phone, a TV set, a media player, a cash register, an ATM, a fridge, or even a watch, a jacket, or a pair of shoes.
Whoa, that's sci-fi, isn't it? Um... not really. Regarding alternative user interfaces, there are actually a lot of them already. E.g. I've got a standard spreadsheet of financial data that updates itself with the latest information from selected web pages as I open it - that's a good deal more useful than looking through all those pages and entering the figures manually.
Only, by and large, the data is still "corrupt" - that is, it's very hard to turn web pages into information without having to actually read and comprehend them.
Consider the new thing that Google introduced way-back-when: it looked at pages, not web sites. The "search engine" in this scenario will go one step further - it will look at specific information, not pages. It will not care about your widgets web site, nor will it care about your blue widgets page; instead it will go directly for the price, size and quantity of "widget X". You would not ask "give me the most relevant pages on widget X" - instead you would ask "give me all the makes and models of widget X, sorted by price and location of dealer".
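To make it concrete, compare two invented ways of publishing the same widget data - the names and prices here are made up:

Hard for a machine to extract:
<p>The amazing Widget X-100 is only $9.95 - get yours in Springfield today!</p>

Much easier, because the structure says what each value is:
<table summary="Widget X models and dealers">
<tr><th>Model</th><th>Price</th><th>Dealer location</th></tr>
<tr><td>Widget X-100</td><td>$9.95</td><td>Springfield</td></tr>
</table>

The second version is the kind of thing a "specific information" engine could sort by price and location without having to read and comprehend prose.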
Yes, Froogle is one step in that direction, and so is local search, the "define:" tool, the calculator and so on, but... boy, we have a loooong way to go with data quality on those billions of pages out there.
Yes, the W3C validator might as well be talking in Klingon or Vogon when you are new to it, but in helping to fix several hundred sites over the last 4 or 5 years I have realised that 90% of the HTML errors are the same dozen errors every time, and the next 9% are another dozen or so. So by learning about two dozen fixes you can correct about 99% of all the problems out there... yeah, really!
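For the curious, here are a few of the repeat offenders I mean, with simplified, invented examples of the fixes:

Unencoded ampersands in URLs:
Wrong: <a href="page.php?a=1&b=2">
Right: <a href="page.php?a=1&amp;b=2">

Missing alt attributes on images:
Wrong: <img src="logo.gif">
Right: <img src="logo.gif" alt="Acme Widgets logo">

Block-level elements inside inline ones:
Wrong: <b><p>Some text</p></b>
Right: <p><b>Some text</b></p>

Learn a couple of dozen patterns like these and the validator stops speaking Klingon.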
My experience is that when I make sure all pages consist of headings, paragraphs, lists, tables and forms, move all CSS and JS out to external files, and see that the <title> and meta description are good, the headings are used sensibly, and the code validates, the pages rocket up the SERPs a few days or weeks later. Valid code cannot harm you, and might be helping. Non-valid code might be harming you big time!
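By way of illustration, the sort of skeleton I aim for looks something like this (HTML 4.01 Strict; the filenames and text are invented):

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>Blue Widgets - Acme Widget Co.</title>
<meta name="description" content="Hand-made blue widgets, shipped worldwide.">
<link rel="stylesheet" type="text/css" href="site.css">
<script type="text/javascript" src="site.js"></script>
</head>
<body>
<h1>Blue Widgets</h1>
<p>Body copy in real paragraphs, lists, tables and forms...</p>
</body>
</html>

All the presentation lives in site.css, all the behaviour in site.js, and the markup itself is nothing but structure.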
This Google Search [google.co.uk] always amazes me.
I especially like the ones with a /title> tag.
Why would anyone set out to write non-valid code anyway, when it is so easy to fix it?
Ahh my friend, here is the problem....
In my experience web design agencies are exactly that - DESIGN agencies. Some of them haven't even heard of the W3C! I once asked an agency if they could do a print stylesheet for a site I wanted designed. Their answer? "A what now?"
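For anyone else wondering "a what now?": a print stylesheet is just one extra line in the head plus a small CSS file. The selectors here are invented for the example:

<link rel="stylesheet" type="text/css" media="print" href="print.css">

And in print.css, something like:

#nav, #ads { display: none; } /* don't waste paper on navigation and adverts */
body { font: 12pt serif; color: #000; background: #fff; }

That's all it takes to get pages printing cleanly.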
Thing is, most companies don't know anything about site design, so they think they're getting a great-looking website, but what they actually get is a nice-looking site that's badly put together and will cost them a bomb to make search-engine friendly (the reason I'm with an SEO agency ;) )