UTF-8 vs. HTML Entities

With a UTF-8 encoded web page, are HTML entities useless?

jonte

8:45 pm on Dec 13, 2005 (gmt 0)

10+ Year Member



Greetings all.

While including Korean, Japanese, Russian and Chinese in my web page (and hence changing charset encoding to UTF-8)

<meta http-equiv="content-type" content="text/html; charset=UTF-8" />

I noted that the W3C Markup Validation Service (for XHTML 1.0 Strict) no longer complains about characters (such as é, ü, å, etc.) that have not been converted to HTML entities.

I am guessing the validator does its job; I'm just afraid that some browsers may not show all characters as I intend them to be shown.

What should I do? Stop converting to HTML entities? Convert only "normal" strings to HTML entities? Only use UTF-8 encoding on pages where I truly need it?

Thank you for any thoughts.
Jon

asquithea

9:01 pm on Dec 13, 2005 (gmt 0)

10+ Year Member



Personally, when using UTF-8 encoding, I wouldn't bother to use entities to represent printable characters without a good reason. Obviously they can be used independently of the character encoding, but why clutter your source unnecessarily?

They do have their uses -- I recently used &#8203; to get a zero width space, for example -- but you shouldn't need them much these days.
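
To illustrate the point above, here is a minimal sketch using Python's standard `html` module (my choice for illustration, nothing in this thread depends on it). It assumes nothing beyond the standard library and shows that character references decode to exactly the same code points as the literal characters, that `&#8203;` is U+200B, and that references are what you fall back to when the page encoding cannot represent a character. The variable names are made up for the example.

```python
import html

# Literal characters, written with escapes so the source file's own encoding cannot interfere.
literal = "\u00e9 \u00fc \u00e5"  # é ü å

# The same text written with named character references.
with_entities = "&eacute; &uuml; &aring;"
decoded = html.unescape(with_entities)

# &#8203; is the numeric reference for U+200B ZERO WIDTH SPACE.
zwsp = html.unescape("&#8203;")

# In an encoding that lacks the character (ASCII here), a numeric reference
# is the fallback; in UTF-8 the character is simply encoded as bytes.
ascii_fallback = "\u65e5".encode("ascii", "xmlcharrefreplace")
utf8_bytes = "\u65e5".encode("utf-8")
```

After parsing, a browser sees the same characters either way, so under UTF-8 the choice is mostly about how readable the source is.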

encyclo

2:04 am on Dec 14, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



What should I do? Stop converting to HTML entities?

Absolutely. If your pages are correctly encoded in UTF-8, you should have no need for HTML entities: just use the characters you want directly. This is one of the huge advantages of UTF-8. Browser support is very widespread these days (only IE4/NN4 and below are problematic, and they're ancient).

I use UTF-8 by default in almost all circumstances, even if the pages are in English, and I avoid HTML entities as much as possible.
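
As a rough sketch of "use the characters directly", the snippet below (Python standard library only) escapes just the markup-significant characters and leaves everything else literal before encoding the result as UTF-8. The helper name `escape_for_utf8_page` and the sample string are hypothetical, and this is an illustration under those assumptions, not a complete templating approach. Characters such as `<` and `&` still need escaping whatever the page encoding.

```python
import html


def escape_for_utf8_page(text: str) -> str:
    """Escape only markup-significant characters (&, <, >, quotes).

    Hypothetical helper for illustration; every other character is left as-is.
    """
    return html.escape(text, quote=True)


# "Café & 日本語 <b>", written with escapes to keep this source ASCII-only.
sample = "Caf\u00e9 & \u65e5\u672c\u8a9e <b>"
escaped = escape_for_utf8_page(sample)

# The bytes you would actually serve alongside charset=UTF-8.
page_bytes = escaped.encode("utf-8")
```

The accented and CJK characters pass through untouched; only `&` and the angle brackets become references.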