Forum Moderators: open
That would seem like a great thing to do though... most of the spammy pages I see are full of formatting errors and the like, since they churn these puppies out like there's no tomorrow. Giving higher relevancy to pages with validated code would make sense... considering that spam pages usually have crummy code.
Then again, maybe they could give searchers the option to turn on a "Valid Only" filter... the traffic data from that would certainly give webmasters an idea how many people cared about valid code in real-life search situations.
I think that all SEs look kindly on this strategy. Pages with the content first are giving the spiders what they want - content - instead of a bunch of table tags or navigational JavaScript.
There is a theory that the spiders have limited CPU resources due to the vastness of the web that they are crawling. Because of this, some believe that spiders don't always spider the entire page. So if the first 100 lines of your page are javascript and table tags, the spider thinks your page is about nothing relevant, and hence you don't rank well. Separating the content solves this problem.
It isn't a case of Google awarding you bonus points because your page validates. (I don't think they would start doing that until google.com validates.) :)
The boost you get from clean code is really more of a by-product of helping a spider do its job correctly. Code that validates dramatically reduces the chances that there will be an error when the document is parsed. A couple of mistakes in your code can often cause 50k of JavaScript to get stored in the database instead of your 500 word keyword-rich article.
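To make that concrete, here's a toy sketch (a hypothetical crude extractor, not Googlebot's actual parser) of how a single missing quote can swallow everything after it:

```javascript
// Hypothetical sketch: a crude extractor that scans for the closing quote
// of an attribute value. One missing quote and everything after the
// mistake looks like attribute value, not indexable content.
var broken = '<a href="page.html><p>Your 500 word keyword rich article</p>';
var start = broken.indexOf('"') + 1;   // position just after the opening quote
var end = broken.indexOf('"', start);  // scan for the closing quote
// end === -1: the quote is never closed, so a naive parser never treats
// the article text as content.
```

The article text is still sitting right there in the file, but a simple-minded parser never gets past the broken attribute to see it.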
External CSS is even better, because moving all the presentation code off the page makes it almost impossible for even the dumbest of spiders to make a goof.
Googlebot will grab 100k, so I wouldn't worry about it being just the 1st 100 lines.
WebGuerilla:
> Code that validates dramatically reduces the chances that there will be an error when the document is parsed.
Definitely, but Googlebot can trip even on valid HTML such as > appearing in attribute values (at least it did a few weeks ago when last I checked).
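For illustration, here's a toy tokenizer (an assumption about how a buggy spider might behave, not Googlebot's real code) that cuts a tag at the first ">":

```javascript
// A ">" inside a quoted attribute value is perfectly valid HTML, but a
// tokenizer that stops at the first ">" truncates the tag and drops the
// remaining attributes. (Hypothetical parser, for illustration only.)
var valid = '<img alt="width > height" src="photo.gif">';
var naiveTag = valid.slice(0, valid.indexOf(">") + 1);
// naiveTag === '<img alt="width >' -- the src attribute is lost entirely
```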
I don't think that a valid page will rank any better than a non-valid-but-still-works-OK page. However, as has been said, if the code is so riddled with errors that the spider doesn't get to the content, you've had it.
What's the best way to find out if this happens to your page? VALIDATE IT! The W3C validator is great at exposing faults that will kill a spider's attempts to read a page. Plus, valid code is cool :)
Or maybe not. Either way, it'd be nice to see something to encourage more people, especially businesses, to write cleaner, validating code. Of course, Google itself would have to start validating first, lest they suffer cries of hypocrisy.
I don't want to remove the tracking script, as it is providing very valuable information right now. Everything else on the pages validates, with the exception of the <noscript> tag that is used for the tracking script. I now have three errors to contend with that won't allow me to validate 100%. What should I do?
Line 274, column 50:
button5.asp?tagver=5&si=******&fw=0&js=No...
^Error: unknown entity "si"
Line 274, column 50:
button5.asp?tagver=5&si=******&fw=0&js=No&"></noscr...
^Error: unknown entity "fw"
Line 274, column 50:
button5.asp?tagver=5&si=******&fw=0&js=No&"></noscript>
^Error: unknown entity "js"
I've sent an email to technical support explaining the situation to them along with the results of the validation. Is this something that I can fix myself?
I've been tweaking tracking and advertising scripts for validation purposes for years, and no tracking or advertising company has ever noticed or complained.
P.S. I'd be willing to bet that this is a common error and most would not know what to do. I think this one deserves a place in the validation thread that is floating somewhere around here.
> Unfortunately, JavaScript is not designed to pass HTML compliance. The reason for this is JavaScript is not HTML. Therefore, your HTML code will still pass W3C compliance. The JavaScript on your page should not be tested for W3C compliance since this is only for HTML code.
Our response...
Unfortunately it is not the JavaScript that is failing HTML compliance. It was a <noscript> tag that had unescaped ampersands. We've corrected the issue and will see if it interferes with the tracking code. If it does not, then great, we may decide to promote *************. If it does interfere, then we will find a product that understands the need to write valid HTML and can provide us with error free HTML.
The JavaScript itself does not get checked when validating. I can't believe you would send such an amateur response to the problem. Maybe we'll publish this in our review of the product. I personally would have sought a solution and then responded with a fix.
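For anyone hitting the same three errors, a minimal sketch of the fix (the URL below is a placeholder mirroring the error report, with the account ID masked):

```javascript
// In HTML, a raw "&" starts an entity reference, so "&si=" is read as the
// unknown entity "si" -- hence the validator errors. Escaping each "&" as
// "&amp;" inside the <noscript> URL fixes it; browsers decode "&amp;" back
// to "&" before requesting the image, so tracking should keep working.
function escapeAmps(url) {
  return url.replace(/&/g, "&amp;");
}

var raw = "button5.asp?tagver=5&si=XXXXXX&fw=0&js=No";  // XXXXXX = masked ID
var safe = escapeAmps(raw);
// safe === "button5.asp?tagver=5&amp;si=XXXXXX&amp;fw=0&amp;js=No"
```

The same escaping applies to any URL that appears in HTML attribute values, not just this vendor's tag.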
> Hello,
I have looked into this issue and discussed it with the development team. You are correct, the characters in the <noscript> tag can be escaped and should be. I have filed a bug report on it, so that it can be remedied. I apologize for the inconvenience.
Thank you for choosing ********* ****.
********* Delivery Engineer II
Check if your server is c1. or c2. or c3.thecounter.com before using...
<!-- Start of TheCounter.com Code -->
<SCRIPT type="text/javascript" language="javascript1.2"><!--
s="na";c="na";j="na";f=""+escape(document.referrer)
//--></SCRIPT>
<SCRIPT type="text/javascript" language="javascript1.2"><!--
s=screen.width;v=navigator.appName
if (v != "Netscape") {c=screen.colorDepth}
else {c=screen.pixelDepth}
j=navigator.javaEnabled()
//--></SCRIPT>
<SCRIPT type="text/javascript" language="javascript1.2"><!--
function pr(n) {document.write(n,"\n");}
NS2Ch=0
if (navigator.appName == "Netscape" &&
navigator.appVersion.charAt(0) == "2") {NS2Ch=1}
if (NS2Ch == 0) {
r="&size="+s+"&colors="+c+"&referer="+f+"&java="+j+""
pr("<A HREF=\"http://www.TheCounter.com\" TARGET=\"_top\"><IMG")
pr("ALIGN=\"CENTER\" BORDER=\"0\" ALT=\"TheCounter\"")
pr("SRC=\"http://c1.thecounter.com/id=000000000"+r+"\"></"+"A>")}
//--></SCRIPT>
<NOSCRIPT><A HREF="http://www.TheCounter.com" TARGET="_top"><IMG
SRC="http://c1.thecounter.com/id=000000000" ALIGN="CENTER"
BORDER="0" ALT="TheCounter"></A></NOSCRIPT>
<!-- End of TheCounter.com Code -->
You will need to change the digits 000000000 to be your own ID number.
There are two separate places in the code where this has to be done.
The word CENTER can be changed to LEFT or RIGHT if you need a different
alignment of the image. There are two places that this needs to be done.
This is a list of the changes:
Old: <SCRIPT><!--
New: <SCRIPT type="text/javascript" language="javascript1.2"><!--
Old: <SCRIPT language="javascript1.2"><!--
New: <SCRIPT type="text/javascript" language="javascript1.2"><!--
Old: <SCRIPT><!--
New: <SCRIPT type="text/javascript" language="javascript1.2"><!--
Old: pr("BORDER=0 SRC=\"http://c1.thecounter.com/id=000000000"+r+"\"></A>")}
New: pr("ALIGN=\"CENTER\" BORDER=\"0\" ALT=\"TheCounter\"")
New: pr("SRC=\"http://c1.thecounter.com/id=000000000"+r+"\"></"+"A>")}
Old: SRC="http://c1.thecounter.com/id=000000000" BORDER=0></A>
New: SRC="http://c1.thecounter.com/id=000000000" ALIGN="CENTER"
Old: </NOSCRIPT>
New: BORDER="0" ALT="TheCounter"></A></NOSCRIPT>
This corrects most of the errors that occur if the code is submitted
to the W3C HTML Validator at: <http://validator.w3.org/>.
Yeah, this also threw up a <NOSCRIPT> problem sometimes.