We are having encoding problems on a multilingual Asian-language website.
For example, when an email in Chinese is received, it displays correctly on some computers but is unreadable on others (both computers have Chinese character support installed, and both can browse Chinese websites).
Do you know of any website that explains how character encoding works? I really need to solve this problem. I can't send email to customers if they can't read it.
What is the best encoding we can use for 7 Asian languages (Thai, Chinese (simplified and traditional), Korean, Indonesian, Japanese)?
Thanks a lot
Norcvi
UTF-8 (Unicode) covers most languages and if you create text in Unicode it can be read by a huge range of software...however
some countries hit the Internet in a big way before Unicode was widespread...so they have a lot of people using a different character encoding for their language...relatively simple examples are Shift_JIS in Japan and Windows-1251 in Russia...Chinese is more complicated since the mainland and Taiwan/Hong Kong generally use different forms of the characters anyway (simplified and traditional) and developed two separate encoding systems (GB for simplified, Big5 for traditional)...UTF-8 then completes the set
so...it gets complicated if you want 100% accessibility...you'll need to offer at least two versions of Japanese (though few people now require Shift_JIS) and three versions of Chinese (GB, Big5 and UTF-8)
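To make the point about competing encodings concrete, here is a minimal Python sketch. The sample text and the helper names `encode_variants` and `decode_as` are made up for illustration; the codec names are the ones Python ships with. It shows that the same two characters become different bytes under UTF-8, GB2312 and Big5, and that reading the bytes with the wrong assumption gives unreadable text.

```python
# Sample text; these characters exist in GB2312, Big5 and Unicode alike.
SAMPLE = "中文"


def encode_variants(text, encodings=("utf-8", "gb2312", "big5")):
    """Return {encoding name: raw bytes} for the same text."""
    return {name: text.encode(name) for name in encodings}


def decode_as(raw, encoding):
    """Decode bytes under a given assumption, replacing invalid sequences."""
    return raw.decode(encoding, errors="replace")


variants = encode_variants(SAMPLE)
for name, raw in variants.items():
    # Same text, different byte sequence per encoding.
    print(name, raw.hex())

# Right guess: the text comes back intact.
print(decode_as(variants["gb2312"], "gb2312"))
# Wrong guess: same bytes, but the reader sees different characters.
print(decode_as(variants["gb2312"], "big5"))
```

This is why a page or message has to say which encoding it uses: the bytes alone do not tell the reader's software how to interpret them.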
when it comes to email you need to use the standard system for communicating across the language barrier...send in your own language with instructions on how to find an online automatic translation service, unless you have staff who can communicate fluently in the relevant language
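Separately from the translation question, one possible cause of mail that is readable on one machine and garbled on another is a message that does not declare its character set, so each mail client has to guess. As a hedged sketch (the addresses, subject and body are placeholders, and `build_message` is a hypothetical helper), Python's standard `email` package can label the body as UTF-8 explicitly:

```python
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Build a plain-text email whose body is explicitly labelled UTF-8."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject  # non-ASCII subjects are encoded when serialized
    msg.set_content(body, charset="utf-8")
    return msg


# Placeholder addresses and text, for illustration only.
msg = build_message(
    "shop@example.com",
    "customer@example.com",
    "訂單確認",
    "您的訂單已寄出。",
)

# The Content-Type header now carries the charset the client should use.
print(msg["Content-Type"])
```

Whether this fixes a given customer's problem depends on their mail client, but a declared charset at least removes the guesswork.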
[edited by: tedster at 12:54 am (utc) on May 11, 2004]
[cs.tut.fi ]
[alanwood.net ]
I've read pretty extensively at Jukka Korpela's and Alan Wood's sites, but it can be such a complex issue that more perspectives can't hurt.
One more link that I like
[fileformat.info...]
It has relatively detailed info on every Unicode character and a pretty good search function. It doesn't really have any info that addresses the original poster's question, but I find it a handy resource.
Tom
basically you need to look up content negotiation...this will allow a visitor to be directed to a page according to their browser's language settings...however not everyone will want to read the site in the language the browser is set to (e.g. if they are in an Internet cafe, or visiting a client/supplier abroad and using their desktop)...so ONLY use content negotiation on index.html in each language, and point all internal links to a home page at default.html
use a language switching page that covers all the languages offered...on it you need, for each language, the two-letter language code, the type of encoding, the name of the language in that language and in English, and a short piece of descriptive text as spider food
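The content negotiation step described above usually works from the browser's `Accept-Language` header. A rough Python sketch of the selection logic follows; the `SUPPORTED` list, the default of `"en"` and both function names are assumptions for illustration, and a real site would more likely let the web server handle this.

```python
# Hypothetical list of languages the site offers (two-letter codes).
SUPPORTED = ("en", "th", "zh", "ko", "id", "ja")


def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    ranked = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # a missing q value means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            ranked.append((-quality, position, tag.strip().lower()))
    # Sort by quality (highest first), then by original order.
    return [tag for _, _, tag in sorted(ranked)]


def pick_language(header, supported=SUPPORTED, default="en"):
    """Pick the first supported language the browser asks for."""
    for tag in parse_accept_language(header):
        primary = tag.split("-")[0]
        if primary in supported:
            return primary
    return default


print(pick_language("fr;q=0.9, zh-TW;q=0.8, en;q=0.5"))
```

Note that this sketch collapses `zh-TW` and `zh-CN` into `zh`; a site offering both simplified and traditional Chinese would want to keep the region part of the tag. The language switching page remains the fallback for visitors whose browser settings don't match what they want to read.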