| This 34 message thread spans 2 pages: 34 (  2 ) > > || |
|HTML and CSS - is W3C Validation Realistic?|
I am cranking out specs for a fairly complex web 2.0 style site that will have tons of interactivity, ajax, and various bells and whistles.
I was just going through my high-level requirements and started to wonder if I should add: "Site should validate against W3C HTML and CSS."
Is that a realistic requirement? Is it worth the extra effort and cost if possible? I looked at a ton of popular sites and only found one that validated - craigslist. So I am wondering: can a complex site that looks great across browsers and works smooth like 2.0 butter be made with 100% valid code, or would I need to settle for a craigslist look and feel?
Since this is a ground up build, now is the time to decide. Thoughts?
I've never understood this particular debate. All code should be valid - period.
The only reason code doesn't validate is because the author did not know what they were doing. They failed to read the HTML/XHTML/CSS instructions and built the site based on what they see in "their browser".
I've had two clients referred to me this year whom I've sent back to the developer asking them to correct their markup errors. On one site, there were 300+ errors per template, many of them parsing errors, the ones that can be fatal if you ask me.
|I was just going through my high-level requirements and started to wonder if I should add: "Site should validate against W3C HTML and CSS." |
I think you should put...
The year is 2010; if the folks producing markup are still writing broken code, then you probably shouldn't be working with those types.
HTML/XHTML is really too simple to break, yet folks continue to do it day in and day out - I just don't understand why. I can only assume that they just don't know and/or don't care. The site looks fine in the browsers they tested. But then they wonder why their Google cached pages are blank. I wonder how that happens? Don't ask me. :)
I have to agree with pageoneresults 100%. Yes producing valid markup can be frustrating at times but it is such a basic and fundamental concept of development. The easiest way to maintain tight control over code is to write it by hand, that's probably why so few pages validate these days. Too many developers are relying on WYSIWYG editors. The more you write by hand the easier it gets to crank everything out as valid markup according to the doctype that was chosen for that page.
Keep in mind that every page doesn't have to use the same doctype. If a technology applied on a certain page doesn't validate to that doctype then maybe back it up by a version or two until it validates. Modern browsers will render just about anything you can throw at them which is why so many developers take a relaxed approach to ensuring valid markup. But without doubt a valid page will load faster, perform better, and won't break pages in assistive technologies.
This subject is a sensitive one for me because I am a strong advocate of W3C standards and when I look around the web at what's getting pumped out it's frustrating. If someone can't accept it for accessibility issues then look at it from an SEO point of view. Every valid page that links to W3C's validator gets 2 links back (bookmark if you set it up properly such as example.org/check?uri=http%3A%2F%2Fexample.com%2F) from a PR10 (root of site) domain. If someone thinks that search engines don't take that into consideration, think again. Invalid pages do not get a link back to the referring site. That's why I find it so funny when I see all those "Valid W3C" images at the bottom of pages that do not validate -- they are just giving away their PR.
Typically my client sites (very small sites of 30 pages or less) will grab a PR3 within about 3 months of go live. Some of those sites have very few backlinks and in some cases relatively little content. So what pushes them to the top of results for their targeted terms? The only difference I've been able to determine is 100% valid markup on every page and backend server performance.
My site as an example: I only have 37 pages, but almost every one has about 1500-2000 words of practical, original content. Every page is 100% valid XHTML 1.1 and also gets a 100/100 Page Speed score (but I place less emphasis on that). I am in an EXTREMELY competitive market, I only have about 35 backlinks, and I just launched recently -- yet my PR is 3. The company that holds spot #1 for my targeted search term has been in business for 12 years according to their site and has 115,000 backlinks according to Yahoo Site Explorer, yet only a PR4. So where does the huge gap come into play? My guess is valid markup and server performance. Many developers are overlooking an easy way to place higher in SERPs by not producing clean markup. Those are the little things we have control over, rather than trying to gather 100,000 backlinks.
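For reference, the "link back" format described above is just the validator's check endpoint with the page address percent-encoded into a `uri` query parameter (the `example.org/check` host in the post is a placeholder). A minimal Python sketch of building such a bookmark link, assuming the `validator.w3.org/check` endpoint:

```python
from urllib.parse import urlencode

def validator_link(page_url: str) -> str:
    """Build a link to the W3C Markup Validator for a given page.

    The validator's "check" endpoint takes the target page as a
    percent-encoded ``uri`` query parameter, which is the bookmark
    format described in the post above.
    """
    return "https://validator.w3.org/check?" + urlencode({"uri": page_url})

print(validator_link("http://example.com/"))
# -> https://validator.w3.org/check?uri=http%3A%2F%2Fexample.com%2F
```

Dropping that link on a valid page is all the "Valid W3C" badge setups amount to; the encoding just has to be right or the validator sees a different URL.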
Yes. Valid code saves time. It might take longer to initially develop, but it will save you a lot of time otherwise wasted when months down the track, and using non-valid code, a seemingly inconsequential code change totally breaks your site.
One argument against valid code is when the likes of Google, serving gigabytes per minute, drastically simplify code to save bandwidth. There's way less than 100 sites on the planet that that argument can be applied to with any validity.
HTML code in common use consists of fewer than twenty elements and a dozen or so attributes. With this level of simplicity, there's no excuse for getting it wrong.
|Is that a realistic requirement? |
Let's do a level set here. I couldn't write code to save my life, yet I can make all my pages validate. It's just a matter of reading... In fact, it's pretty rewarding for someone like me to clean up code. So why wouldn't an actual coder be able to do the same?
|Is it worth the extra effort and cost if possible? |
If someone is starting from scratch, why in the world would it cost more to have the pages validate? It's probably a good quality screening question. - Do you ensure all your pages validate? Most coders would be afraid of losing the job by saying no.
|I've never understood this particular debate. All code should be valid - period. |
Webmasterworld does not validate, Google does not validate, Bing, Yahoo, Microsoft, Amazon, Youtube, Facebook, Wikipedia, none of them make it. The message I get from this is that if they don't think it's important why should anyone else?
We should all of course strive for 100% valid markup (and I do) but sometimes it is just not possible. Validating HTML is easy but if you are running external scripts, affiliate schemes or even Adsense then this may well cause your website to fail.
(The good old BBC is one of the few popular websites that does manage to validate incidentally.) :)
|The easiest way to maintain tight control over code is to write it by hand, that's probably why so few pages validate these days. Too many developers are relying on WYSIWYG editors. |
This is so wrong.
Perhaps you have not used modern WYSIWYG editors but good ones actually help and provide direction with validation. I predominantly use Dreamweaver and when I complete a site it is an extremely simple task to run the built in validation tool on a per page or whole site basis. It then lists any discrepancies and even tells you where you have gone wrong. All you have to do is click into these and fix them.
This is much quicker and more efficient than relying on manual validation.
Thanks. I think my question has been answered. I'll add this to the requirements.
I disagree! Why should your code validate? As long as it renders well in the top browsers who cares. I don't.
Google, Bing and Yahoo don't care if my pages validate and 99% of my viewers don't even know what validation is about.
WHY should my pages validate if all the main browsers (maybe even 100% of browsers) show my pages as I want.
The search for perfection detracts from the main objective of a website which is to give users the information they want in an attractive format.
Having an old-school looking website was my main concern, but it seems even nice 2010-style sites can validate. None of the big 2.0 sites validating is still a concern, though. There must be a reason why sites that have all the money in the world to do it right choose not to.
Why, easier to spider, less chance of breaking down the road, chance of better rankings - Plus perhaps a link from w3c editor picks section :-)
what google says...
|While your site may appear correctly in some browsers even if your HTML is not valid, there's no guarantee that it will appear correctly in all browsers - or in all future browsers. The best way to make sure that your page looks the same in all browsers is to write your page using valid HTML and CSS, and then test it in as many browsers as possible. Clean, valid HTML is a good insurance policy, and using CSS separates presentation from content, and can help pages render and load faster. Validation tools, such as the free online HTML and CSS validators provided by the W3 Consortium, are useful for checking your site, and tools such as HTML Tidy can help you quickly and easily clean up your code. (Although we do recommend using valid HTML, it's not likely to be a factor in how Google crawls and indexes your site.) |
So basically they recommend it but not for themselves ;)
BTW - thanks for saving me the time of finding this citation. This is exactly why my site validates. Honestly folks, it's not that difficult.
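The workflow in that Google quote - run a checker, read the report, fix what it flags - can even be scripted. As a rough illustration only (a real validator checks the DTD, attributes, and nesting rules, not just tag balance), here is a sketch using Python's standard-library HTMLParser to flag the most common error this thread keeps mentioning, the unclosed tag:

```python
from html.parser import HTMLParser

# Void elements never take a closing tag, so they are excluded
# from the open-tag stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "param", "source", "track", "wbr"}

class TagBalanceChecker(HTMLParser):
    """Crude well-formedness check: report tags opened but never
    closed, and stray closing tags. Nowhere near a real validator,
    just enough to catch the class of error discussed here."""

    def __init__(self):
        super().__init__()
        self.stack = []    # currently open (non-void) tags
        self.errors = []   # human-readable error strings

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Anything we pop past was opened but never closed.
            while self.stack[-1] != tag:
                self.errors.append("unclosed <%s>" % self.stack.pop())
            self.stack.pop()
        else:
            self.errors.append("stray </%s>" % tag)

    def report(self):
        """Flush the parser and return all errors, including any
        tags still open at end of input."""
        self.close()
        return self.errors + ["unclosed <%s>" % t for t in self.stack]

checker = TagBalanceChecker()
checker.feed("<div><p>Hello <b>world</p></div>")
print(checker.report())  # ['unclosed <b>']
```

The real validators do vastly more, of course - this only shows that the check is mechanical, which is exactly why "it's not that difficult" to keep pages clean.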
|easier to spider, less chance of breaking down the road, chance of better rankings |
So you knew it all along, didn't you? ;-)
|Why should your code validate? As long as it renders well in the top browsers who cares. I don't. |
I don't know, but every time I validate code of my new pages the rendering problems in different browsers seem to disappear.
I validate code because it is easy to do and it takes care of some nasty time consuming problems.
|As long as it renders well in the top browsers who cares. I don't. |
So, you'd be quite happy to serve a page that seemingly "looks OK" but which hammers the error-correction routines in the users browser ten thousand times per pageview?
And, you'd not care about the user-experience, or the page rendering speed?
At what number of reported errors per page would you decide that some tidy-up is required? Ten, a hundred, five hundred?
What malformed coding constructs would you allow, and which would you fix? To produce consistent work, you'd have to define those in a list - and by doing that you have just defined your own (albeit broken) non-standard "version" of HTML.
Do you have the same cavalier attitude towards your content: not fixing spelling and grammar errors, not fixing broken links, etc? If you fix the content, why not also the code?
If you don't fix the content, then how many spelling errors per page can you have before you decide to do quality control? Five, fifty, two hundred, more?
All my sites validate HTML and CSS but I've never found it an advantage in terms of ranking and in fact I've sometimes found it a disadvantage in terms of cross browser and platform compatibility. I railed against it for years as a geek badge, but have come round to thinking of it as a sort of kitemark that demonstrates observance of certain standards. As far as I can see there is absolutely no ranking or other discernible benefit from validating your pages other than the kudos for doing so. I do it because I can but if I ever had to sacrifice validation for an important user feature or SEO benefit I would not think twice. I think the most important thing about thinking along these lines is that it also makes you think about page structure, text content etc. If you're going to devote attention to good markup then you should also devote attention to good content.
|Google, Bing and Yahoo don't care if my pages validate. |
Their bots do. Also, Bing have publicly stated that well formed and valid markup is of benefit from an indexing perspective.
|And 99% of my viewers don't even know what validation is about. |
Rightly so. They are seeing the site. On the other hand, the bot is machine reading the site. Presentation vs. Markup. The markup SHOULD be written using well formed machine readable grammar. You can't have well formed without valid markup. If you don't believe me, try running a semantic extraction routine on your documents.
Semantic Data Extractor
|WHY should my pages validate if all the main browsers (maybe even 100% of browsers) show my pages as I want. |
Because out of the box they should be perfectly valid. If you've followed the instructions for the markup you are working with then everything should be valid.
I suggest an HTML 4.01 Strict, XHTML 1.0 Strict, or HTML5 DTD. These days, everything we do is HTML5.
|The search for perfection detracts from the main objective of a website which is to give users the information they want in an attractive format. |
Users = Humans + User Agents.
User Agents find well formed and valid markup attractive.
Perfection? Just following the instructions for the markup I'm working with - that is all. :)
|As long as it renders well in the top browsers who cares. I don't. |
But what happens when you change "browsers" to the more general word, "user-agents"? Now you have included googlebot and bingbot -- and you can't know for certain HOW they see the page, or more precisely, how the final routines at the search engine will deal with the data they retrieve. You don't know how well they will recover from your particular errors.
I don't care much about 100% validation in its own right. If a "strict" page doesn't validate but a switch to "transitional" would mean the source code is valid, that's no biggie.
But I do care about understanding any error that shows up. Most of them I fix, but once in a while I just ignore one because I know why it's there. It just might be the kind that browser error recovery forgives but search engine error-recovery doesn't. And in that case, some section of the page may not be indexed and tagged properly.
One other thing to consider is that you won't necessarily "see" in your browser that the DOM is broken - a situation that rarely arises with validated code.
Having a valid DOM is very important when your site is a "complex web 2.0 style site" and you're doing ajaxian things to it.
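To see why this matters, compare how a strict parser treats markup that a browser would silently repair. A minimal sketch, using Python's strict XML parser as a stand-in for a picky consumer (a search bot, or your own script expecting a particular tree) - apt here, since XHTML is meant to be parseable as XML:

```python
import xml.etree.ElementTree as ET

# A browser's lenient HTML parser silently "repairs" this markup,
# but the repaired DOM may not be the tree your script expects.
broken = "<div><p>first<p>second</div>"           # unclosed <p> tags
valid = "<div><p>first</p><p>second</p></div>"

# Strict (XML-style) parsing rejects the broken version outright...
try:
    ET.fromstring(broken)
    print("broken parsed cleanly")
except ET.ParseError as err:
    print("broken rejected:", err)

# ...while the valid version yields exactly the structure that
# getElementsByTagName / querySelector-style lookups would target.
tree = ET.fromstring(valid)
print([p.text for p in tree.findall("p")])  # ['first', 'second']
```

A browser will guess where those missing `</p>` tags go; your ajax code, and any bot, then has to live with whatever tree the guess produced.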
|Do you have the same cavalier attitude towards your content: not fixing spelling and grammar errors, not fixing broken links, etc? |
That's a bit harsh is it not? We are talking about HTML validation and you are assuming (or speculating) that there is a correlation between this and grammatically correct copy. There is no relationship between the two.
|But I do care about understanding any error that shows up. Most of them I fix, but once in a while I just ignore one because I know why it's there. |
Absolutely! I do too and I think this is the pragmatic approach.
None of you purists have explained why Webmasterworld does not validate, Google does not validate, Bing, Yahoo, Microsoft, Amazon, Youtube, Facebook, Wikipedia and a host of other cutting edge sites don't validate and why it does not seem to do them any harm?
|I do too and I think this is the pragmatic approach. |
And from my experience it is. Sometimes it may be hard to identify opened/closed elements, in which case running the W3C validator helps to quickly fix the HTML.
And I have seen sites scoring high for keywords because they use broken HTML on purpose, to fool spiders.
|having a valid DOM is very important when your site is a "complex web 2.0 style site" and you're doing ajaxian things to it. |
And I have run into cases where my valid HTML/CSS and JavaScript/Ajax were trashed on the client end because some ISPs have the bright idea of stripping certain content via compression. It seems to me broken HTML is becoming more and more common.
I think its good to validate and understand if the w3c errors are relevant or not to what you want to do, not because of a couple of fancy icons with the pages.
|None of you purists have explained why Webmasterworld does not validate, Google does not validate, Bing, Yahoo, Microsoft, Amazon, YouTube, Facebook, Wikipedia and a host of other cutting edge sites don't validate and why it does not seem to do them any harm? |
Before you make a blanket statement like that, you should double check those sites for validation.
Google does validate. Not all pages but I've watched as they've slowly changed to an HTML5 DOCTYPE and are cleaning up a bulk of the markup errors. I've been tracking this since 2009 September.
The Wiki does validate. At the moment, they have 1 error being reported. A couple of weeks ago, they were valid. It's a touch and go situation for them.
Yahoo!? Pffft, they don't know what valid markup is. Amazon? Poster child for broken markup with 600+ errors.
You cannot continue to use this as an argument for writing broken code. In fact, it sort of makes you look really unedumucated when you use that as an excuse.
So tell me, why does your code have to be broken again?
WebmasterWorld? Let's not go there. It pains me to see my favorite hangout not valid. Don't ask me why it isn't either, I couldn't tell you. Well, I could, but I'm surely going to piss a few off in the process. They know who they are. If they don't understand validation, you can call me, I'll do a consult gratis so you can fix your tag soup. ;)
|And I have seen sites scoring high for keywords because they use broken HTML on purpose, to fool spiders. |
Oh come on! That's even a lamer excuse than the other one being used. That is total FUD!
P1R I did check all their home pages. They didn't validate and they still don't validate. Remember you are the one who said, "All code should be valid - period" so getting close to validation is still a failure.
|Before you make a blanket statement like that, you should double check those sites for validation. |
I didn't say that and I don't deliberately write broken code. I did say that we should all aim for 100% valid markup but IMO it is just not as important as some of you are making out.
|You cannot continue to use this as an argument for writing broken code. |
I used the list of big sites to illustrate the fact that if people who are much more knowledgeable than me don't get their knickers in a twist over this, then I am not likely to either. ;)
|WebmasterWorld? Let's not go there. It pains me to see my favorite hangout not valid. |
Why does it pain you? Do you see any performance issues related to the fact that it does not validate, and does it in any way spoil your user experience?
|Google does validate. Not all pages but I've watched as they've slowly changed to an HTML5 DOCTYPE and are cleaning up a bulk of the markup errors. I've been tracking this since 2009 September. |
How many pages did you check?
|Oh come on! That's even a lamer excuse than the other one being used. That is total FUD! |
Unless you know the search engine's internals and have access to the code, how can you be so certain? I saw it happening and that's what I posted. For sure, page content is parsed by spiders and interpreted in many different ways. So it's possible a flaw in the parsing engine of the spider can make it discard or promote certain content.
And FYI, you can find lots of pages on w3.org that don't validate. And of course the W3C validation tool has its own bugs. So W3C-validated HTML doesn't necessarily mean it will work on all browsers and spiders. But as mentioned, it is good to use for checking and fixing errors.
I have been running websites since 1994 and we rarely validate, because 90% of the ads we run contain code that never validates. We also sell quite a few books and other products via Amazon links and those links never validate. I had an error on the main page of one of my sites today so I ran the W3C validator. It found 83 errors. And 82 of them were ad code and Amazon links. After wading through all that I did find the one HTML error munging the page and fixed it. Based on intermittent tests we did over the years, our pages would validate 100% most of the time if we simply dropped our income-producing ads and links. Ha! Fat chance...
Absolutely! One of my main problems with validation is third party HTML and scripting.
Fix the stuff you can fix. Third-party stuff is often in an iframe and therefore has little effect on the code flow of the main page.
|Validating HTML is easy but if you are running external scripts, affiliate schemes or even Adsense then this may well cause your website to fail. |
You are allowed to alter the adsense code for it to validate. They allow this.
|(The good old BBC is one of the few popular websites that does manage to validate incidentally.) |
Try their news site. It fails.
|I disagree! Why should your code validate? As long as it renders well in the top browsers who cares. I don't. |
As an avid fan of IE6, when I browse the net I am coming across an increasing amount of sites that crash my browser. Every page this happens to does not validate. Sure it works in other browsers, but if that happened to my site, it would upset up to 20% of my visitors.
You may not care. My [ex] window cleaner didn't care either. He thought our windows were round because he never did the corners. Sure I could see out of them, but when I pay for a job, I expect it to be done properly.
When search engines hint that you get a boost for valid markup, I'm sure they only mean the obvious stuff.
E.g., we all know that link text is a factor, so if you have <a href="blah.html">blah</a> that's going to help for the word 'blah'. But if you forget to close the tag, it won't. That's what they mean.
I'm sure they don't care a bit when it comes to other stuff, like whether you've put a <div> inside a <span>.
A couple years ago I came here looking for help. Traffic to my sites had dropped radically.
I went through all sorts of drills making change after change recommended by people here. Nothing worked.
I may not have followed instructions properly, but the hundreds of hours spent could probably have been spent elsewhere.
I finally gave up and worked other sites.
Some time later I was reading the Bing guidelines and noted that their first bit of advice was to be sure a site has validated code.
So I made no other changes to the site than validating the code.
Within two weeks of resubmission of the site map for the newly validated site, traffic went from about 600 visitors per day back to near 5,000 per day.
Maybe it was a fluke.
So I started on a second site that had dropped from about 2500 per day to about 400 per day.
With that site, about 200 of 600 pages now validate, and traffic is inching up. It is now about 1200 per day.
With the first site I made no content changes, no navigation changes, no nothing. With the second site I am upgrading content and changed the page layout including the navigation.
I am certainly no SEO expert. All I know is that the W3C validator and the built-in Tidy feature seem to be working for me.