Home / Forums Index / Code, Content, and Presentation / HTML
Forum Library, Charter, Moderators: incrediBILL

HTML Forum

Encoding for Thai and Vietnamese
How to set the charset, or go to Unicode?

 10:11 pm on Jan 18, 2004 (gmt 0)

I'm researching a bit for Thai and Vietnamese language sites. Should I stick with a language-specific charset, e.g. for Thai:
<meta http-equiv="Content-Type" content="text/html; charset=windows-874">

or go to Unicode/UTF-8?

Your thoughts?



 9:28 am on Jan 19, 2004 (gmt 0)

I would use UTF-8 for these languages.
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">


 10:43 pm on Jan 19, 2004 (gmt 0)

A lot of Thai websites use:

<meta http-equiv="Content-Type" content="text/html; charset=TIS-620">

Some use:

<meta http-equiv="Content-Type" content="text/html; charset=windows-874">
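For anyone comparing the two Thai charsets with UTF-8, here is a rough sketch in Python 3 (my choice of tool, not something from the posts above). It assumes Python's bundled `tis_620` and `cp874` codecs, where `cp874` is Python's name for windows-874. The sample word and the `can_encode` helper are made up for illustration.

```python
# Sketch: store one Thai word under the three charsets named in this thread.
thai = "สวัสดี"  # arbitrary sample word, 6 code points

tis = thai.encode("tis_620")   # TIS-620
win = thai.encode("cp874")     # windows-874 under Python's codec name
utf8 = thai.encode("utf-8")

# Thai letters sit at the same byte values in TIS-620 and windows-874.
same_thai_bytes = tis == win


def can_encode(text, encoding):
    """Hypothetical helper: True if every character fits in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# windows-874 adds a few extras on top of TIS-620, the euro sign among them.
euro_in_tis = can_encode("\u20ac", "tis_620")
euro_in_win = can_encode("\u20ac", "cp874")

print(len(tis), len(win), len(utf8))
print(same_thai_bytes, euro_in_tis, euro_in_win)
```

So for plain Thai text the two legacy charsets produce the same bytes, one byte per character, while UTF-8 spends three bytes per Thai character but can also hold everything else.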


 2:19 am on Jan 20, 2004 (gmt 0)

I know that a lot of developers in Japan and China still shy away from Unicode for page encoding. I wouldn't be surprised if other Asian language character sets had similar problems with it. Although Unicode may seem like a panacea, there are still a number of perceived problems with it. Do a survey of some of the leading sites in those languages and see what they use.


 8:53 am on Jan 20, 2004 (gmt 0)

On the other hand, if you have pages in several languages, it's much easier to use UTF-8 on ALL pages. I have tested it in more than 60 languages and it works perfectly.


 9:11 am on Jan 20, 2004 (gmt 0)

You've tested 60 languages with local operating systems and browsers? Do those include Thai and Vietnamese? I'm not saying that UTF-8 isn't great for a number of languages...it's just that I've heard reports of certain Asian languages where it hasn't worked. Make sure you test with local webmasters who may know about potential problems.


 12:48 pm on Jan 20, 2004 (gmt 0)

Yes. Thai and Vietnamese are amongst them, and they display correctly.

Note: you must actually save all documents as UTF-8, not just declare it in the meta tag, or it won't work...
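A minimal sketch of why the saved bytes have to match the declared charset, assuming Python 3. The `decode_as_declared` helper is hypothetical, and it decodes strictly; a real browser would usually show replacement characters instead of refusing the page.

```python
from typing import Optional


def decode_as_declared(raw: bytes, declared: str) -> Optional[str]:
    """Strictly decode file bytes using the charset the page declares.

    Returns None when the bytes are not valid in that charset.
    """
    try:
        return raw.decode(declared)
    except UnicodeDecodeError:
        return None


text = "สวัสดี"  # arbitrary Thai sample

saved_as_utf8 = text.encode("utf-8")  # file really saved as UTF-8
saved_as_874 = text.encode("cp874")   # file saved as windows-874

# Declared UTF-8, saved as UTF-8: the text comes back intact.
ok = decode_as_declared(saved_as_utf8, "utf-8")

# Declared UTF-8, but saved as windows-874: not valid UTF-8 at all.
mismatch = decode_as_declared(saved_as_874, "utf-8")

# Declared windows-874, but saved as UTF-8: decodes to the wrong characters.
garbled = saved_as_utf8.decode("cp874", errors="replace")

print(ok == text, mismatch is None, garbled == text)
```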


 4:17 am on Jan 21, 2004 (gmt 0)

Could I ask what browsers and which versions you tested with and in combination with what operating systems? IE has always been quite good with Unicode display. It's Netscape and some others that were problems from what I recall.

You would probably want to find a breakdown of what the most popular browsers (and versions) were for those respective language markets and consider those factors as well. Then of course it may depend on your niche market's users...but you all know that.


 9:58 am on Jan 21, 2004 (gmt 0)

bill, all modern browsers have built-in Unicode support, so that's no problem. All operating systems support Unicode, so that's no problem either.

You'll find more information on this site: [alanwood.net...]

The major reason why we use UTF-8 on all our pages is this: when you have a mix of several languages (for example Vietnamese, Chinese, Korean and English) on the same Web page, all characters display properly.

That's why all (smart) translation bureaus use UTF-8.
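The mixed-language point can be checked with a quick sketch, assuming Python 3 and its bundled codecs. The sample string, the list of legacy charsets, and the `encodable` helper are my own picks for illustration, not anything tested in this thread.

```python
def encodable(text, encoding):
    """Hypothetical helper: True if the whole string fits in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# One "page" mixing English, Vietnamese, Chinese and Korean.
mixed = "English, Tiếng Việt, 中文, 한국어"

# UTF-8 plus a few single-language legacy charsets.
candidates = ("utf-8", "cp874", "cp1258", "gb2312", "euc_kr")
results = {enc: encodable(mixed, enc) for enc in candidates}

# UTF-8 also round-trips the text unchanged.
round_trip_ok = mixed.encode("utf-8").decode("utf-8") == mixed

for enc in candidates:
    print(enc, results[enc])
print(round_trip_ok)
```

Each legacy charset covers its own language at best, so only UTF-8 can hold the whole string.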


 6:24 am on Jan 22, 2004 (gmt 0)

tombola, don't get me wrong...I think Unicode is a great idea. I've been waiting for years for it to work out the kinks on the Asian language side...I'm waiting for someone to prove to me that it's 100% ready. ;)

Take a look at this article: A peek at Unicode's soft underbelly [www-106.ibm.com]
This came up in a recent discussion [webmasterworld.com] (msg#9) we had over in the Asia Pacific Forum. I get wary when people tout UTF-8 as the solution to encoding problems because I've heard a lot to the contrary. I'm really just playing devil's advocate here waiting for some of the old Unicode pros to show themselves.


 9:45 am on Jan 22, 2004 (gmt 0)

ok bill, I rest my case.

... but I'll stick to UTF-8 ;-)


 10:36 am on Jan 22, 2004 (gmt 0)

thai-language.com, which is apparently a resource for learning Thai, suggests using Unicode for Thai.

Vietnamese is ~Latin a-z + all sorts of accents and diacritic marks, which Unicode handles easily. That it isn't a more complicated graphic script suggests Unicode will probably be the best for Viet also.

Of course the best way would be to ask local Thai and Viet users, who might be able to advise on the preferred system in those countries.
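To illustrate the "Latin letters plus diacritics" point about Vietnamese, here is a small sketch using Python 3's standard `unicodedata` module. The sample word is arbitrary, and this only shows how Unicode represents the characters, not how any particular browser renders them.

```python
import unicodedata

word = "Vi\u1ec7t"  # "Việt", with the accented letter as one precomposed code point

# NFD splits the accented letter into a base letter plus combining marks.
nfd = unicodedata.normalize("NFD", word)

# NFC recombines them into the precomposed form.
nfc = unicodedata.normalize("NFC", nfd)

names = [unicodedata.name(ch) for ch in nfd]

print(len(word), len(nfd), nfc == word)
for name in names:
    print(name)
```

Both forms are valid Unicode for the same word. If text is going to be compared or searched, normalizing to one form first (NFC is the usual choice on the web) avoids mismatches.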

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved