Non-SGML Character

The W3 Validator doesn't like —

         

Russ Dollinger

1:25 am on Mar 14, 2002 (gmt 0)



The W3C validator reports an error when I use
& #151 (actually minus the extra space) for an em-dash (—).

Error: reference to non-SGML character

What should I do? Does it matter? It seems to work on many different
browsers.

rjohara

4:18 am on Mar 14, 2002 (gmt 0)




A numeric character reference is supposed to end with a semicolon. Try & #151; and see if that works. I use lots of em dashes in the form & #8212; which also works.

Russ Dollinger

4:52 am on Mar 14, 2002 (gmt 0)



& # 151 ; did not work.

& # 8212 ; did work.

Apparently the first works in most browsers because they treat 151 as the Windows-1252 em-dash; it just isn't a valid SGML character.

Russ Dollinger

6:04 am on Mar 14, 2002 (gmt 0)



“ ‘ ’ ”

What about these illegal characters? In most fonts they look like curly quote marks (smart quotes).

These work with IE and Netscape, but they don't validate.
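For anyone hitting the same problem, here is a minimal Python sketch of how such codes could be translated into references that validate. It assumes the offending numbers (145 to 148 for the quotes, 151 for the em-dash) are byte values from the Windows-1252 code page; the helper name `to_valid_reference` is hypothetical and used only for illustration.

```python
def to_valid_reference(windows_code: int) -> str:
    """Return a decimal numeric character reference for a Windows-1252 byte value."""
    # Assumption: the number is a byte value in the Windows-1252 code page,
    # not a Unicode code point.
    char = bytes([windows_code]).decode("cp1252")
    # ord() gives the Unicode code point, which is what a numeric reference needs.
    return f"&#{ord(char)};"


for code in (145, 146, 147, 148, 151):
    print(code, "->", to_valid_reference(code))
```

The printed references use the Unicode code points rather than the Windows byte values, which is the form the validator accepts.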

rjohara

6:17 am on Mar 14, 2002 (gmt 0)




Yup, I use a lot of those too, especially because there's a & #8217 ; in my last name. :) (It's actually kind of annoying having to stick a decimal character entity into your name every time you type it.)

The best reference I have found for all the HTML character entities is:

[htmlhelp.com...]

which lists them by name, hex form, and decimal form, and shows how each displays in your browser. In general the decimal forms are the most widely supported, but I'm sure there are exceptions here and there.
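As a rough illustration of the three forms (name, hex, decimal), the following Python sketch builds each one for a given character using the standard library's entity table. The function name `reference_forms` is made up for this example, and the table reflects the HTML 4 named entities, so characters without a name return `None` for that form.

```python
import html
from html.entities import codepoint2name


def reference_forms(char: str) -> dict:
    """Build the named, hex, and decimal reference forms for one character."""
    code_point = ord(char)
    # codepoint2name only covers characters that have an HTML 4 entity name.
    name = codepoint2name.get(code_point)
    return {
        "named": f"&{name};" if name else None,
        "hex": f"&#x{code_point:X};",
        "decimal": f"&#{code_point};",
    }


forms = reference_forms("\u2014")  # the em-dash
for label, reference in forms.items():
    # Each form should decode back to the same character.
    print(label, reference, html.unescape(reference) == "\u2014")
```

All three forms name the same character; which one to use is mostly a question of what the target browsers support.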

leogah

8:49 pm on Mar 14, 2002 (gmt 0)



It's like this: & #151; always refers to code point 151 in the Unicode character set, regardless of the page's encoding. Only raw bytes change meaning depending on the character encoding.

In Unicode, 151 is a control character, so the validator complains. Where did the dash come from, then? Windows has an encoding (Windows-1252) in which byte 151 is an em-dash. If you declared that charset and put a raw 151 byte in the data, it would be valid. But that is not the same as & #151;. As a numeric reference, an em-dash is always & #8212;

Browser support is widespread for & #151; because it's easier just to do what IE does, even if it is bogus.
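The distinction can be checked with a short Python sketch. It assumes the Windows encoding in question is Windows-1252 (`cp1252` in Python), and it uses `html.unescape`, which follows the lenient HTML5 parsing rules and so behaves like a browser rather than like the validator.

```python
import html
import unicodedata

code = 151

# As a Unicode code point, 151 (U+0097) is a control character.
as_code_point = chr(code)
category = unicodedata.category(as_code_point)  # "Cc" means control character

# As a raw byte in Windows-1252, 151 is the em-dash, U+2014 (decimal 8212).
as_cp1252_byte = bytes([code]).decode("cp1252")

# Lenient parsers remap the bogus reference to the em-dash, as browsers do.
lenient = html.unescape("&#151;")
# The correct reference needs no special handling.
correct = html.unescape("&#8212;")

print(category, ord(as_cp1252_byte), lenient == correct)
```

The remapping in the lenient case is a compatibility behaviour; the reference itself still points at a control character, which is why it fails validation.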

Far more than you wanted to know about the issue at [cs.tut.fi...]

(edited by: tedster at 10:39 pm (utc) on Mar. 14, 2002)