Well obviously my HTML source code is a mess!

After using one of those HTML validators

         

tvldeals

3:48 pm on Oct 14, 2003 (gmt 0)

10+ Year Member



OK, I am not an HTML expert, far from it, but I used an HTML validator site which returned an insane number of coding errors. It suggested that because of this I would rank lower, or that some search engines would not list me at all. Is my coding really that bad? I was wondering if there is a FREE program out there that would automatically correct my HTML errors, removing entries that should not be there? So far all I can find are the ones that point them out. Needless to say, I am feeling overwhelmed. Maybe some of you can look at my source code and tell me what you think. My info is listed under my profile. The problem is that there are so many errors that I am completely lost at this point. I created this in Notepad, using an HTML guide book, but apparently I really messed up big time. HELP! ;0(

mattur

4:13 pm on Oct 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Having markup errors may prevent SEs indexing your pages, but generally SE bots are good at handling non-valid HTML (by necessity!).

To validate your code you could use the W3C's validator, or a free program like "CSE HTML Validator Lite".

To fix code automagically you could use "HTML Tidy", which is very powerful but may take a bit of time to get used to. HTH

Mohamed_E

4:40 pm on Oct 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



tvldeals,

My personal rule of thumb is that if the code is good enough for the common browsers to display it is good enough for the search engines to index.

We do not do site reviews, and in any case the URLs posted in the profiles of new members are not displayed for others. But if you post the first few lines of diagnostic output we may be able to help you get started.

One guess: Do many of the messages say something like "&XYZ invalid attribute"? If you have a lot of URLs with ampersands in them they can really mess up the validator output. The problem is that "&forum", to give an example, is invalid HTML; the correct form is "&amp;forum".

If you have a hundred URLs, each with three ampersands, you will get 300 error messages :)
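
As a rough illustration of the ampersand fix described above, here is a minimal Python sketch that rewrites bare "&" characters in a URL as "&amp;" while leaving anything that already looks like an entity alone. The function name and the sample URL are made up for this example, and the pattern is a simplification rather than a full HTML parser, so treat it as a starting point only.

```python
import re

# Matches an "&" that does not already start an entity such as &amp; or &#38;.
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)


def escape_bare_ampersands(url: str) -> str:
    """Return the URL with every bare "&" written as "&amp;"."""
    return BARE_AMPERSAND.sub("&amp;", url)


# Hypothetical URL, for illustration only.
print(escape_bare_ampersands("page.cgi?forum=21&thread=1850&amp;page=2"))
```

The escaped value is what goes inside the href attribute in the HTML source; the browser turns "&amp;" back into "&" when it follows the link.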

AWildman

4:45 pm on Oct 14, 2003 (gmt 0)

10+ Year Member



Check for basic things, like making sure every element has an opening and a closing tag where one is required, and that nested tags are closed in the right order (inner ones before outer ones). The validator in HomeSite is adequate for the job, and HomeSite is also nice for HTML editing.
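
The open/close and nesting check described above can be sketched in a few lines of Python using the standard library's html.parser module. This is only an illustration, not what HomeSite or any validator actually does: the class and function names are invented here, the list of tags that never take a closing tag is an assumption, and the check is stricter than HTML 4.01 itself, which lets you omit some closing tags (such as </p> or </li>).

```python
from html.parser import HTMLParser

# Elements that never take a closing tag (assumed list for this sketch).
VOID = {"meta", "link", "br", "img", "hr", "input", "base", "area", "col", "param"}


class NestingChecker(HTMLParser):
    """Tracks open tags on a stack and records tags closed out of order."""

    def __init__(self):
        super().__init__()
        self.stack = []     # (tag, line) for every tag still open
        self.problems = []  # human-readable messages

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.stack and self.stack[-1][0] == tag:
            self.stack.pop()
        else:
            self.problems.append(f"line {self.getpos()[0]}: unexpected </{tag}>")


def check_nesting(html):
    """Return a list of nesting problems found in the given HTML text."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    for tag, line in checker.stack:
        checker.problems.append(f"line {line}: <{tag}> never closed")
    return checker.problems


print(check_nesting("<b><i>inner closed after outer</b></i>"))
```

One mismatch early in a page tends to produce several messages, so it usually pays to fix the first reported problem and run the check again.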

Staffa

4:58 pm on Oct 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



TagCheck from Tafweb is very good and includes Tidy and it's free.

dragonlady7

7:50 pm on Oct 14, 2003 (gmt 0)

10+ Year Member



And one thing to remember! Often if you fix one early error, a whole bunch of the following ones will magically disappear. Why? Because they are caused by the first one. For example, say you forgot to close a tag. The entire rest of your page is then wrong, because none of that content belongs inside the accidentally open tag. So start at the beginning and work down, and you'd be amazed how fast the errors disappear.

I'm no great shakes at hardcore coding stuff, but just using a basic editor with line numbers and the W3C's validator (which reports line numbers), I got through a series of 100 errors in almost no time. It was even kinda fun.

So, I wouldn't worry-- just take a deep breath and start from the beginning. It's not as bad as it looks, and you learn a *lot*.

tvldeals

8:30 pm on Oct 14, 2003 (gmt 0)

10+ Year Member



Ok after using Tidy this is what displays at the top (the corrected file)
<html>
<head>
<meta name="generator" content=
"HTML Tidy for Linux/x86 (vers 1st November 2002), see www.w3.org"

Is this supposed to be my valid DOCTYPE? One validator (this was Tidy) said it looked like HTML Proprietary, and another said DOCTYPE HTML 4.01 Transitional. I am so confused about what it is, and do all the pages I create have to show the same generator at the top?
Gosh I wish someone could take a look at my source code. I went from 394 WARNINGS to only 92 now by Tidying it up, but it also reported "1" error. What do the warnings mean? Should I give them as much weight?

martinibuster

8:47 pm on Oct 14, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Try using htmlkit to spot errors and suggest corrections. It's a port of Tidy, but it's not going to insert the generator meta, etc. (yuck).

Also, do not (I repeat: Do Not) let the software autocorrect the HTML for you. It can create a worse problem.

Lastly, create a backup of your website and place it in another folder, somewhere out of harm's reach.

Mohamed_E

8:54 pm on Oct 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Gosh I wish someone could take a look at my source code.

We have rules here for a variety of very valid reasons. And now that your questions are getting more specific we can start answering them!

The generator tag just tells whoever cares (nobody, in most cases) what software generated the page. I am pretty sure that there is a switch in Tidy to disable outputting a generator tag, but since it is harmless ...

The DOCTYPE is something that you must put in. Some validators will take a guess, and 4.01 Transitional is what they often guess. It is probably appropriate in your case.

I would put it in explicitly, before the <html> tag:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

<html lang="en">
<head>
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">

<link rel="StyleSheet" href="main.css" type="text/css">

<title>Your Title</title>

<meta name="keywords" content="your keywords go here">

<meta name="description" content="Description of content of this page">

</head>

<body>

You are doing quite well, down to 100 warnings from 400!

If you show us the first few warnings we will probably be able to help you figure out what is wrong. Did you check my suggestion about the entities and ampersands?

tvldeals

11:22 pm on Oct 14, 2003 (gmt 0)

10+ Year Member



I'm baaaaack.... ;0)

Thanks for everyone's help. But oh my goodness, could this get any more confusing? I swear I do not have <meta> in the body elements or <head> as part of my body element. Here are just a few of the warnings:

1 Warning: HTML DOCTYPE doesn't match content
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
2 Warning: inserting missing 'title' element
<HTML>
4 Warning: <head> isn't allowed in <body> elements
<head>
5 Warning: <meta> isn't allowed in <body> elements; Warning: <link> isn't allowed in <body> elements
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"> <link rel="StyleSheet" href="main.css" type="text/css">
6 Warning: <meta> isn't allowed in <body> elements
<meta name="revisit-after" content="7 days">
7 Warning: <title> isn't allowed in <body> elements
<title>PriceBusterHotels -
9 Warning: <meta> isn't allowed in <body> elements
<meta name="keywords" content=
11 Warning: <meta> isn't allowed in <body> elements
<meta name="description" content=
13 Warning: <meta> isn't allowed in <body> elements
<meta name="Classification" content=
15 Warning: <meta> isn't allowed in <body> elements
<meta name="rating" content="general">
16 Warning: <meta> isn't allowed in <body> elements
<meta name="MSSmartTagsPreventParsing" content="TRUE">
17 Warning: <meta> isn't allowed in <body> elements
<meta name="distribution" content="Global">
18 Warning: <meta> isn't allowed in <body> elements
<meta name="robots" content="all=INDEX,FOLLOW">
19 Warning: <link> isn't allowed in <body> elements
<link rel="stylesheet" type="text/css" href=
21 Warning: <bgsound> is not approved by W3C; Warning: <bgsound> isn't allowed in <body> elements
<bgsound src="Sinatra,Frank_NewYorkNewYork.mid" loop="1">
22 Warning: </head> isn't allowed in <body> elements
</head>
23 Warning: <body> isn't allowed in <body> elements
<body>
-----------------------------------------------------------
OK, here is what I have in my source code, although I have removed the keywords, title and description to save space here:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML>
<html lang="en">
<head>
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"> <link rel="StyleSheet" href="main.css" type="text/css">
<meta name="revisit-after" content="7 days">
<title>

<meta name="keywords" content=" >
<meta description=" >
<meta name="Classification" content= >
<meta name="rating" content="general">
<meta name="MSSmartTagsPreventParsing" content="TRUE">
<meta name="distribution" content="Global">
<meta name="robots" content="all=INDEX,FOLLOW">
<link rel="stylesheet" type="text/css" href=
"http:// ******/include/Chris/Hosted/templateimages/template/travelnowstyle2.html">
<bgsound src="Sinatra,Frank_NewYorkNewYork.mid" loop="1">
</head>
<body>
<table width="100%" border="0" cellpadding="0" cellspacing="5">
<tr>
SO WHAT AM I DOING WRONG HERE! YIKES! ;0{

Mohamed_E

12:24 am on Oct 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



As I suspected, we're getting there, tvldeals!

I put your lines of text into a file and tried to validate with tidy.

First off, you seem to have two <html> tags, one in upper case and one in lower.

Then the fragment has a <title> but no </title>, adding it gets us down to three error messages.

You have two link elements, one copied from my suggestion plus your original one; the second one in your file has the "malformed URI". I removed it.

Some of your meta tags are malformed, I have corrected them.

With all that the only problem is the bgsound, a tag that I know nothing about. I removed it.
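
The first two problems listed above (a doubled <html> tag and a <title> with no </title>) are the kind of thing a short script can flag before you run a real validator. The following Python sketch is an illustration only, not how Tidy works; the names TagCounter and head_problems are invented for this example, and it simply counts opening and closing tags.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts how often each tag is opened and closed (names are lowercased)."""

    def __init__(self):
        super().__init__()
        self.opened = Counter()
        self.closed = Counter()

    def handle_starttag(self, tag, attrs):
        self.opened[tag] += 1

    def handle_endtag(self, tag):
        self.closed[tag] += 1


def head_problems(html):
    """Report once-only elements that appear twice and an unclosed <title>."""
    counter = TagCounter()
    counter.feed(html)
    counter.close()
    problems = []
    for tag in ("html", "head", "title", "body"):
        if counter.opened[tag] > 1:
            problems.append(f"<{tag}> opened {counter.opened[tag]} times")
    if counter.opened["title"] > counter.closed["title"]:
        problems.append("<title> has no matching </title>")
    return problems


# A cut-down version of the fragment posted earlier in the thread.
fragment = '<HTML>\n<html lang="en">\n<head>\n<title>\n</head>\n<body>\n'
print(head_problems(fragment))
```

A count-based check like this cannot replace a validator, but it explains why so many "isn't allowed in <body>" warnings appeared at once: everything after the second <html> and the open <title> was being read in the wrong context.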

The following code validates:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<META http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">

<link rel="StyleSheet" href="main.css" type="text/css">

<meta name="revisit-after" content="7 days">
<title>Some Title</title>

<meta name="keywords" content="junk" >
<meta name="description" content="junk">
<meta name="Classification" content="junk" >
<meta name="rating" content="general">
<meta name="MSSmartTagsPreventParsing" content="TRUE">
<meta name="distribution" content="Global">
<meta name="robots" content="all=INDEX,FOLLOW">
</head>
<body>
<p> some stuff.
</body>
</html>

This lives in a file called (my apologies ;) ) junk.html. Running tidy on it gives:

% tidy -e junk.html
Info: Doctype given is "-//W3C//DTD HTML 4.01 Transitional//EN"
Info: Document content looks like HTML 4.01 Transitional
No warnings or errors were found.

I am a little dubious about the syntax of the robots meta with its two '=' signs, but the validator is not complaining. I suggest that you check the syntax.

This should allow you to get a good start on fixing the bugs.

tvldeals

3:11 am on Oct 15, 2003 (gmt 0)

10+ Year Member



Hi Mohamed,
Thanks again for all your help. I ran my header area through Tidy using the copy you supplied, with my own info inserted, and after I made the changes it came back OK. So I was quite relieved. I then went to LinkScan at elsop.com to run it through there. Although I only ran this on my index page, how in the world could it be this bad, even after using Tidy on my whole site? Can someone also explain whether warnings are more important to correct than errors? I am not sure I understand all this talk about HTML syntax errors, or whether it is all that important for SEs listing and finding you. Seems there are a lot of variable results between the different validators out there, so who do you trust?


Summary
Unknown: 0
Error: 22
Possible Error: 0
Warning: 11
Advisory: 0
No Error: 107
HTML Syntax Error: 240

Document Access: HTTP
Link Status: Real Time
Elapsed Time: 20
CPU Time: 3.98

tvldeals

3:27 am on Oct 15, 2003 (gmt 0)

10+ Year Member



As if this wasn't interesting enough.. I guess I should also paste in here the comments elsop.com's validator had to say. This is only one of a few...keeping in mind after I had "Tidyed" it up no less!

00001 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
00002 <html lang="en">
00002 unknown attribute "LANG" for element <html>.
00003 <head>
00004 <META http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">
00005 <link rel="StyleSheet" href="main.css" type="text/css">
00005 illegal value for TYPE attribute of link (text/css)
00005 Error: 404 Not Found main.css <Link>
00006 <meta name="revisit-after" content="7 days">
00007
00008 <title>

martinibuster

3:47 am on Oct 15, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Seems there are a lot of variable results between the different validators out there, so who do you trust?

Trust the source of the HTML standards, the W3C (World Wide Web Consortium). The W3C is the first and last word on HTML standards.

Forget that other site and go straight to the source where you can either upload a file or point to a url:

[validator.w3.org]

tvldeals

4:33 am on Oct 15, 2003 (gmt 0)

10+ Year Member



Oh boy, thanks Martin. I went there and entered my URL, and there are still a lot of errors; the main point it made in CAPS and in bold was-
"This page is NOT a Valid HTML 4.01 Transitional!"
Now what? What the heck is it then? I am so distressed about this....

reuben101

4:49 am on Oct 15, 2003 (gmt 0)

10+ Year Member



the main point it made in CAPS and in bold was-
"This page is NOT a Valid HTML 4.01 Transitional!"

Don't sweat the error message. It is just as glaring if you forget one alt attribute as it is if you run the phone book through it. Mohamed_E has got you really close; just have a beer, take a deep breath, look for a few unclosed tags and extra punctuation, and you'll have it licked!
W3C should tell you what is wrong so it will be fairly simple to clean up.

Mohamed_E

1:04 pm on Oct 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Valid Validators

In addition to the W3C validator I use three others:

  • HTML-Tidy. I am not sure exactly what its official status is, but it is featured on the front page of W3C.org, and at the bottom of the relevant page it says:

    Dave works on assignment to the World Wide Web Consortium, where he is the W3C lead for Voice and Multimodal.

    It is my main validator, since it can be used from the command line and hence incorporated in scripts.

  • The WDG Validator [htmlhelp.com] which will follow links and validate a whole site (with a limit of 100 pages).

  • The related shareware product A Real Validator which works on Windows machines. I have twice downloaded a free 30 day evaluation copy, but each time have decided that it really adds nothing to my toolkit.
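
To illustrate the point in the list above about using Tidy from the command line inside scripts, here is a minimal Python sketch. It assumes a `tidy` executable is installed and on your PATH, uses the same `-e` (errors only) option shown earlier in this thread, and relies on Tidy's documented exit statuses (0 for clean, 1 for warnings, 2 for errors). The function names are invented for this example, and the `run` parameter exists only so the sketch can be exercised without Tidy installed.

```python
import subprocess

# Exit statuses as documented for HTML Tidy: 0 = clean, 1 = warnings, 2 = errors.
STATUS = {0: "clean", 1: "warnings", 2: "errors"}


def tidy_check(path, run=subprocess.run):
    """Run `tidy -e` on one file; return (status_label, diagnostics_text)."""
    result = run(["tidy", "-e", path], capture_output=True, text=True)
    # Tidy writes its diagnostics to standard error.
    return STATUS.get(result.returncode, "unknown"), result.stderr


def check_site(paths, run=subprocess.run):
    """Map each file to its status label so problem pages stand out."""
    return {path: tidy_check(path, run)[0] for path in paths}
```

With Tidy installed, something like `check_site(["index.html", "junk.html"])` would give a quick per-file overview, and `tidy_check` returns the full diagnostics for any page that needs a closer look.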

There is a brief but very informative note on the WDG site on the difference between real validators and linters [htmlhelp.com], with emphasis on the use of a formal SGML parser for the former.

The output

00005 <link rel="StyleSheet" href="main.css" type="text/css">
00005 illegal value for TYPE attribute of link (text/css)
00005 Error: 404 Not Found main.css <Link>

shows confusion over the role of a validator. The statement

href="main.css"

is perfectly valid HTML even if the file is not (currently) there!

tvldeals

3:06 pm on Oct 15, 2003 (gmt 0)

10+ Year Member



I think I am feeling a bit better about my coding errors and warnings at this point. I decided to type some URLs into both Tidy and one other validator; most were very prominent, successful websites with a lot of traffic, and some were not so big. Ironically, over half of these had twice as many warnings and errors as I came up with for mine, and the validators reported that the pages were not valid HTML 4.01 Transitional (the DOCTYPE most of them declared). So what does that suggest? If the coding problems are so important to fix, why do these sites rank so well?

Mohamed_E

3:59 pm on Oct 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



In my first response to your post I wrote:

My personal rule of thumb is that if the code is good enough for the common browsers to display it is good enough for the search engines to index.

Why, then, do I validate my HTML? I am not really sure, but it has a lot to do with my pride as a craftsman. I understand HTML, I know how to write valid HTML, so why should I produce sloppy stuff?

I never suggested that you should validate your code, but since you seemed to want to do so I tried to give you some pointers.

There is a thread on the HTML forum where there is a suggestion that 99% of the stuff on the web fails to validate. So you are obviously in "good" company (or at least have many companions :) ).