Your second question is an interesting one. I'm not aware of any studies that would answer it, though I'd be curious to see what the demographics are.
statistics showing the percentage of pages being correct HTML
Just a guess here, but taking the number of web pages at 3 billion, the number of VALID pages is going to be far, far less than 1% right now. 1% would be 30 million valid pages, and I doubt that it's anywhere near 1 million.
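For anyone who wants to plug in their own guesses, the arithmetic above can be sketched in a few lines of Python. The 3 billion total is the estimate from this post, not a measured figure, and the function name `share_of_web` is just something made up for the illustration.

```python
# Assumed size of the web, taken from the estimate in this post.
TOTAL_PAGES = 3_000_000_000


def share_of_web(valid_pages, total_pages=TOTAL_PAGES):
    """Return valid_pages as a percentage of total_pages."""
    return 100.0 * valid_pages / total_pages


# 1% of 3 billion pages.
one_percent = TOTAL_PAGES // 100
print(f"1% of the web: {one_percent:,} pages")

# What share would 1 million valid pages represent?
print(f"1 million valid pages: {share_of_web(1_000_000):.3f}% of the web")
```

So even a full million valid pages would be only about a thirtieth of one percent under that assumption.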
A lot of the reason is lack of awareness. Some of it is also practical, related to Content Management Systems and other parts of the development process. Many organizations simply need their content out there, fresh, early and often. If the tools they have on hand can't produce valid code easily, ideally transparently, then they won't bother getting the code to validate, as long as it displays well in most browsers. For now, validation can appear to be an unnecessary drain on resources.
The move to valid code is really the cutting edge. Ultimately it will allow the Internet medium to grow and mature, and it makes a heck of a lot more sense than having browser makers build in whatever they think is the proper amount of forgiveness.
In what other technical medium do we expect the "player" to forgive improper encoding? The Internet cannot mature beyond a certain limit while cowboy code is the rule. So we're at the very beginning of a change. How fast each of us gets on board with the change is a personal choice or a business choice. For now, the game is mainly about raising awareness that validation IS a choice.
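To make the "forgiving player" point concrete, here is a rough sketch using only Python's standard library. It feeds the same sloppy markup to a strict XML parser and to a lenient HTML parser. Note the assumptions: the sample markup and the helper names (`is_well_formed`, `forgiving_tags`, `TagCollector`) are invented for illustration, and a well-formedness check like this is much weaker than real validation against a DTD, which needs a proper validator.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample: <br> and <b> are never closed.
SLOPPY = "<p>Unclosed paragraph<br><b>bold text</p>"
CLEAN = "<p>Closed paragraph<br/><b>bold text</b></p>"


def is_well_formed(markup):
    """Strict check: True only if the markup parses as well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Lenient parser that just records every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def forgiving_tags(markup):
    """Return the start tags a lenient HTML parser reports, errors or not."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


print(is_well_formed(SLOPPY))   # strict parser rejects it
print(is_well_formed(CLEAN))    # strict parser accepts it
print(forgiving_tags(SLOPPY))   # lenient parser carries on regardless
```

The lenient parser never complains about the sloppy input, which is roughly the kind of forgiveness browsers have built in, while the strict parser refuses it outright.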
There is also a carrot involved here, not just a stick. Do you want search engines to gobble up your content? Validate the code. Do you want to have solid instructions that will give good results on all kinds of devices? Validation is the answer.