Character set?

Emperor

5:28 am on Oct 29, 2004 (gmt 0)


Hi guys,

I've been using the following tags for my site, which is in English:

<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />

<meta http-equiv="content-language" content="en-us" />

I don't know much about character sets, except that in C++ Unicode uses two bytes per character as opposed to one byte for ASCII, and that everything new (such as the Microsoft SDKs) seems to be using Unicode.

So what should I use for my web sites? Thanks.

Emperor

tedster

7:52 pm on Oct 29, 2004 (gmt 0)


Which character set you use depends mostly on the language of your document. Your current character set, iso-8859-1, is a "Western European" set that includes us-ascii's 128 characters plus an extended set of accented letters and special-use characters that are not on the standard US keyboard, like the copyright and registered trademark signs.
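
To make that concrete, here is a minimal Python sketch (Python is just my choice for illustration, and the sample strings are made up). It checks that plain ASCII text comes out byte-for-byte the same in iso-8859-1, and that characters from the extended half still take one byte each:

```python
# ASCII characters keep the same single-byte values in iso-8859-1.
ascii_text = "Hello"
same_as_ascii = ascii_text.encode("iso-8859-1") == ascii_text.encode("ascii")

# Characters from the extended half of the set also take one byte each.
extended = "\u00a9\u00ae\u00e9"  # copyright sign, registered sign, e-acute
latin1_bytes = extended.encode("iso-8859-1")

print(same_as_ascii)       # True
print(len(latin1_bytes))   # 3, one byte per character
```

So a page that only contains English text is the same sequence of bytes whether you label it us-ascii or iso-8859-1.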

English language browsers usually default to this charset if you don't explicitly declare the charset in a meta tag - but declaring it is still a good idea. Someone, somewhere may have an easier time viewing your pages because you did.

The iso-8859-2 charset includes the "Central European" characters - I recently learned about it while creating a website in Polish. And there are many more, especially when you get into Asian languages.
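
A rough sketch of why the choice matters, again in Python with a sample word of my own choosing: a Polish word fits in iso-8859-2 at one byte per character, cannot be encoded in iso-8859-1 at all, and reads back wrong if the bytes are interpreted with the wrong charset.

```python
polish = "\u0141\u00f3d\u017a"  # the city name Lodz, written with Polish letters

# Every character fits in iso-8859-2, one byte each.
latin2_bytes = polish.encode("iso-8859-2")

# The same text cannot be represented in iso-8859-1.
try:
    polish.encode("iso-8859-1")
    fits_latin1 = True
except UnicodeEncodeError:
    fits_latin1 = False

# Reading iso-8859-2 bytes as iso-8859-1 does not fail, it silently
# produces different characters, which is what a visitor sees when the
# declared charset is wrong.
misread = latin2_bytes.decode("iso-8859-1")

print(len(latin2_bytes))   # 4
print(fits_latin1)         # False
print(misread == polish)   # False
```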

The big bear, though, is UTF-8, or Unicode Transformation Format-8. It encodes Unicode characters as sequences of octets (8-bit bytes), using from one to four octets per character. Unicode may well be the future of the web - UTF-8 is a default encoding for XML (alongside UTF-16), and it encodes every us-ascii character as the same single octet, so plain ASCII text is already valid UTF-8.
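
As a small illustration (a Python sketch, with sample characters I picked arbitrarily), here is how many octets UTF-8 uses for a few different characters, plus a check that ASCII text is unchanged:

```python
# Sample characters: ASCII letter, e-acute, euro sign, an emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Number of octets each character occupies once encoded as UTF-8.
utf8_lengths = [len(ch.encode("utf-8")) for ch in samples]

# ASCII text encodes to exactly the same bytes in UTF-8.
ascii_unchanged = "Hello".encode("utf-8") == "Hello".encode("ascii")

print(utf8_lengths)      # [1, 2, 3, 4]
print(ascii_unchanged)   # True
```

The practical upshot for an English-language site is that switching the declared charset to UTF-8 leaves the ASCII portion of the page byte-for-byte identical; only characters outside ASCII would be stored differently.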

UTF-8 is a whole country to explore. If you want to learn about it, one place to begin is [utf-8.com...]

However, if you are creating sites in Western or Central European languages, then iso-8859-1 and iso-8859-2 should serve you quite well for quite some time.