
Forum Moderators: incrediBILL

Encoding for Thai and Vietnamese

How to set the charset, or go to Unicode?

   
10:11 pm on Jan 18, 2004 (gmt 0)

10+ Year Member



I'm researching a bit for Thai and Vietnamese language sites. Should I stick with a language-specific charset, e.g. for Thai:
<meta http-equiv="Content-Type" content="text/html; charset=windows-874">

or go to Unicode/UTF-8?

Your thoughts?

9:28 am on Jan 19, 2004 (gmt 0)

10+ Year Member



I would use UTF-8 for these languages.
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">

10:43 pm on Jan 19, 2004 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



A lot of Thai websites use:

<meta http-equiv="Content-Type" content="text/html; charset=TIS-620">

Some use:

<meta http-equiv="Content-Type" content="text/html; charset=windows-874">
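To make the difference between these charsets concrete, here is a minimal Python sketch (not from any poster above). It assumes Python's bundled codecs `tis_620` and `cp874` (Python's name for windows-874), and the sample word is an illustrative choice. It suggests that the two Thai charsets produce the same bytes for Thai letters, that windows-874 adds a few extra characters such as the euro sign, and that UTF-8 spends three bytes per Thai character.

```python
# Sketch: compare how the charsets named in this thread encode the same Thai text.
# The sample string is an illustrative assumption, not taken from the thread.
sample = "สวัสดี"  # a Thai greeting, 6 code points

tis = sample.encode("tis_620")   # TIS-620
win = sample.encode("cp874")     # windows-874 under Python's codec name
utf8 = sample.encode("utf-8")

# Thai letters sit at the same byte positions in both legacy charsets.
same_thai_bytes = tis == win


def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# windows-874 extends TIS-620 with a handful of extras such as the euro sign.
euro_in_cp874 = can_encode("€", "cp874")
euro_in_tis620 = can_encode("€", "tis_620")

print(same_thai_bytes, len(tis), len(utf8), euro_in_cp874, euro_in_tis620)
```

In practice this means a page saved as TIS-620 will usually display fine when labelled windows-874, but not necessarily the other way round.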

2:19 am on Jan 20, 2004 (gmt 0)

WebmasterWorld Administrator bill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



I know that a lot of developers in Japan and China still shy away from Unicode for page encoding. I wouldn't be surprised if other Asian language character sets had similar problems with it. Although Unicode may seem like a panacea, there are still a number of perceived problems with it. Do a survey of some of the leading sites in those languages and see what they use.

8:53 am on Jan 20, 2004 (gmt 0)

10+ Year Member



On the other hand, if you have pages in several languages, it's much easier to use UTF-8 on ALL pages. I have tested it in more than 60 languages and it works perfectly.

9:11 am on Jan 20, 2004 (gmt 0)

WebmasterWorld Administrator bill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



You've tested 60 languages with local operating systems and browsers? Do those include Thai and Vietnamese? I'm not saying that UTF-8 isn't great for a number of languages...it's just that I've heard reports of certain Asian languages where it hasn't worked. Make sure you test with local webmasters who may know about potential problems.

12:48 pm on Jan 20, 2004 (gmt 0)

10+ Year Member



Yes. Thai and Vietnamese are amongst them, and they display correctly.

Note: you must actually save every document in UTF-8 encoding, so the bytes in the file match the declared charset, or it won't work...
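A quick way to see why the saved encoding has to match the declared one is the Python sketch below. The page content and the helper name `decodes_as` are made up for illustration; the idea is simply that bytes written in a legacy Thai charset are not valid UTF-8, so a browser told to expect UTF-8 cannot read them correctly.

```python
# Sketch: a page that declares UTF-8 must also be saved as UTF-8.
# The page text and the helper name are hypothetical, for illustration only.
page = (
    '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">\n'
    "<p>ภาษาไทย</p>"
)


def decodes_as(raw: bytes, declared: str) -> bool:
    """Return True if the raw file bytes are valid in the declared charset."""
    try:
        raw.decode(declared)
        return True
    except UnicodeDecodeError:
        return False


saved_as_utf8 = page.encode("utf-8")    # editor saved the file as UTF-8
saved_as_legacy = page.encode("cp874")  # editor saved it as windows-874 instead

matches = decodes_as(saved_as_utf8, "utf-8")
mismatch = decodes_as(saved_as_legacy, "utf-8")

print(matches, mismatch)
```

The second file still carries the UTF-8 meta tag, but its Thai bytes are not valid UTF-8, which is the "it won't work" case.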

4:17 am on Jan 21, 2004 (gmt 0)

WebmasterWorld Administrator bill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Could I ask what browsers and which versions you tested with and in combination with what operating systems? IE has always been quite good with Unicode display. It's Netscape and some others that were problems from what I recall.

You would probably want to find a breakdown of what the most popular browsers (and versions) were for those respective language markets and consider those factors as well. Then of course it may depend on your niche market's users...but you all know that.

9:58 am on Jan 21, 2004 (gmt 0)

10+ Year Member



bill, all modern browsers have built-in Unicode support, so that's no problem. All current operating systems support Unicode, so that's no problem either.

You'll find more information on this site: [alanwood.net...]

The major reason why we use UTF-8 on all our pages is this: when you have a mix of several languages (for example Vietnamese, Chinese, Korean and English) on the same Web page, all characters display properly.

That's why all (smart) translation bureaus use UTF-8.
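As a rough illustration of the mixed-language point, the Python sketch below tries to encode one line containing Vietnamese, Chinese, Korean and English with several single-language charsets and with UTF-8. The sample text, the list of candidate codecs and the helper name `codecs_that_fit` are assumptions chosen for the example, not something from our actual pages.

```python
# Sketch: which charsets can hold Vietnamese, Chinese, Korean and English at once?
# The sample text and the candidate list are illustrative assumptions.
mixed = "Tiếng Việt, 中文, 한국어, English"
candidates = ["cp1258", "gb2312", "euc_kr", "cp874", "utf-8"]


def codecs_that_fit(text: str, codecs: list[str]) -> list[str]:
    """Return the codecs that can encode every character in text."""
    fits = []
    for name in codecs:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # at least one character is missing from this charset
        fits.append(name)
    return fits


result = codecs_that_fit(mixed, candidates)
print(result)
```

Each legacy charset covers its own language but is missing characters from at least one of the others, so of these candidates only UTF-8 fits the whole line.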

6:24 am on Jan 22, 2004 (gmt 0)

WebmasterWorld Administrator bill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



tombola, don't get me wrong...I think Unicode is a great idea. I've been waiting for years for it to work out the kinks on the Asian language side...I'm waiting for someone to prove to me that it's 100% ready. ;)

Take a look at this article: A peek at Unicode's soft underbelly [www-106.ibm.com]
This came up in a recent discussion [webmasterworld.com] (msg#9) we had over in the Asia Pacific Forum. I get wary when people tout UTF-8 as the solution to encoding problems because I've heard a lot to the contrary. I'm really just playing devil's advocate here waiting for some of the old Unicode pros to show themselves.

9:45 am on Jan 22, 2004 (gmt 0)

10+ Year Member



ok bill, I rest my case.

... but I'll stick to UTF-8 ;-)

10:36 am on Jan 22, 2004 (gmt 0)

10+ Year Member



thai-language.com, which is apparently a resource for learning Thai, suggests using Unicode for Thai.
ht*p://www.thai-language.com/default.asp?tab=5

Vietnamese is roughly Latin a-z plus all sorts of accents and diacritic marks, which Unicode handles easily. That it isn't a more complicated graphic script suggests Unicode will probably be the best choice for Vietnamese too.

Of course, the best way would be to ask local Thai and Vietnamese users, who might be able to advise on the preferred system in those countries.
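For the curious, here is a small Python sketch of how Unicode treats a Vietnamese letter with stacked diacritics. It uses only the standard `unicodedata` module; the sample word is an arbitrary choice. It shows that the letter can be stored either as one precomposed character or as a base letter plus combining marks, and that normalization converts between the two.

```python
import unicodedata

# Sketch: one Vietnamese word in composed (NFC) and decomposed (NFD) form.
# The sample word is an illustrative assumption.
word = "Việt"

composed = unicodedata.normalize("NFC", word)    # precomposed letters
decomposed = unicodedata.normalize("NFD", word)  # base letters + combining marks

# Names of the code points in the decomposed form, e.g. the combining accents.
names = [unicodedata.name(ch) for ch in decomposed]

# Both forms render the same; normalizing brings them back together.
round_trip = unicodedata.normalize("NFC", decomposed) == composed

print(len(composed), len(decomposed), len(composed.encode("utf-8")))
for name in names:
    print(name)
```

If text from different sources is compared or searched, normalizing everything to one form first (NFC is a common choice for web pages) avoids mismatches between the two representations.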

 
