

HTML Forum

Encoding Characters and Multi-Language Websites

 7:54 am on May 10, 2004 (gmt 0)

Hi All,

We have some encoding problems on a multi-language Asian website.

For example, when you receive an email in Chinese, on some computers it appears correctly, but on others it is impossible to read (both computers have Chinese character encoding installed, and both can browse Chinese websites).

Do you know of any website that explains character encoding systems? I really need to solve this problem. I can't send email to customers if they can't read it.

What is the best encoding system we can use (for 7 Asian languages - Thai, Chinese (simplified and traditional), Korean, Indonesian, Japanese)?

Thanks a lot




 2:50 pm on May 10, 2004 (gmt 0)

Which character set are you using when displaying the page?
Are the characters in Unicode?


 3:55 pm on May 10, 2004 (gmt 0)

UTF-8 Unicode should work in your case.

We have no problem with Chinese (all versions), Japanese and Korean. No idea about Thai and Indonesian; however, given the above, they should be no problem either.
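For the email side of the question, here is a minimal sketch of sending with an explicitly declared UTF-8 charset, assuming Python's standard `email` package. The addresses, subject and body are made-up placeholders, and this only builds the message; it does not send it or guarantee that every mail client will honour the declaration.

```python
from email.message import EmailMessage


def build_message(subject: str, body: str, sender: str, recipient: str) -> EmailMessage:
    """Build a plain-text message whose body is declared as UTF-8."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    # The charset is stated explicitly so the receiving client does not
    # have to guess which encoding the bytes are in.
    msg.set_content(body, charset="utf-8")
    return msg


# Hypothetical example values.
msg = build_message(
    subject="你好",
    body="你好，世界",
    sender="shop@example.com",
    recipient="customer@example.com",
)
print(msg.get_content_charset())
```

If one machine shows the mail correctly and another does not, comparing the charset declared in the received headers with what the sender intended is a reasonable first check.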


 4:36 pm on May 10, 2004 (gmt 0)

UTF-8 only works if the characters are actually in Unicode, though.
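A small sketch of what that means in practice, assuming Python and a made-up sample string: the same two Chinese characters stored as Big5 bytes are not valid UTF-8, so labelling them as UTF-8 does not help.

```python
# Illustrative only: the sample text and the choice of Big5 are assumptions.
text = "中文"

big5_bytes = text.encode("big5")    # legacy Traditional Chinese encoding
utf8_bytes = text.encode("utf-8")   # the same characters serialised as UTF-8


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the bytes are well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(decodes_as_utf8(utf8_bytes))
print(decodes_as_utf8(big5_bytes))
# The Big5 bytes are only readable when decoded with the encoding
# they were actually written in.
print(big5_bytes.decode("big5") == text)
```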


 11:43 pm on May 10, 2004 (gmt 0)

OK...this gets complicated

UTF-8 (Unicode) covers most languages and if you create text in Unicode it can be read by a huge range of software...however

some countries hit the Internet in a big way before Unicode was widespread...so they have a lot of people using a different character encoding system for their language...relatively simple examples are Shift_JIS in Japan and Windows-1251 in Russia...Chinese is more complicated since Taiwan and Hong Kong generally use a different form of the characters from the mainland (traditional rather than simplified) and two separate encoding systems developed (Big5 for traditional and GB for simplified)...UTF-8 completes the set

so...it gets complicated if you want 100% accessibility...you'll need to offer at least two versions of Japanese (though few people now require shift_JIS) and three versions of Chinese
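As a rough sketch of offering more than one version of the same page, assuming Python and a hypothetical `render_page` helper: the text is kept once as Unicode and each version is produced by encoding it in the charset that the page itself declares. The sample text is made up, and this only covers the encoding step (it does not convert between simplified and traditional characters).

```python
def render_page(body: str, charset: str) -> bytes:
    """Build a tiny HTML page and encode it in the charset it declares."""
    html = (
        "<html><head>"
        f'<meta http-equiv="Content-Type" content="text/html; charset={charset}">'
        "</head><body>" + body + "</body></html>"
    )
    # The declared charset and the actual byte encoding must match,
    # otherwise the browser shows garbage.
    return html.encode(charset)


japanese = "日本語のページ"
utf8_page = render_page(japanese, "utf-8")
sjis_page = render_page(japanese, "shift_jis")

# Same text, different bytes on the wire.
print(len(utf8_page), len(sjis_page))
```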

when it comes to email you need to use the standard system for communicating across the language barrier...send in your own language with instructions on how to find an online automatic translation service, unless you have staff who can communicate fluently in the relevant language

[edited by: tedster at 12:54 am (utc) on May 11, 2004]


 3:06 pm on May 11, 2004 (gmt 0)

some further useful sites with information on character encoding and multi-lingual web sites

[cs.tut.fi]

[alanwood.net]

[i18ngurus.com]

[webreference.com]


 5:05 pm on May 11, 2004 (gmt 0)

That's a nice set of references, Eric. I was immediately able to fill in some gaps in my knowledge.


 2:28 pm on May 12, 2004 (gmt 0)

I'm currently setting up a site to cover pretty much everything I've learned about i18n...it'll be up in a few weeks' time


 7:04 pm on May 12, 2004 (gmt 0)

I hope the moderators will make an exception to let Eric post the URL to his site once he has it running, even if it might be considered self-promotion.

I've read pretty extensively at Jukka Korpela's and Alan Wood's sites, but it can be such a complex issue that more perspectives can't hurt.

One more link that I like


It has relatively detailed info on every Unicode character and a pretty good search function. It doesn't really have any info that addresses the original poster's question, but I find it a handy resource.



 10:05 pm on May 12, 2004 (gmt 0)

I'll set it as the site in my user profile, that's the simple answer


 2:32 pm on May 13, 2004 (gmt 0)

Thanks. Anything to help avoid getting those funny looking characters to show up.

Actually, I think I can do "international" (read: anglo-euro) pretty well. I still see a lot of sites, though, where they obviously have no clue that things will look wrong if they don't get the encoding right.


 4:31 am on May 14, 2004 (gmt 0)


Thanks, I will have a look at it, but three versions of Chinese, two of Japanese, etc. will be really complicated. We have a lot of content!



 4:53 am on May 14, 2004 (gmt 0)


Now, if I want to use this kind of system to have a 100% compatible website with all the Asian languages, is there any way to automatically select the right encoding system per user? Or will the user have to select the right site version themselves?




 8:34 pm on May 14, 2004 (gmt 0)

I've just posted an explanation of my favoured system in the Asian Search Engines Forum

basically you need to look up content negotiation...this will allow a visitor to be directed to a page according to their browser settings...however not everyone will want to read the site in the language they have set the browser to (e.g. if they are in an Internet cafe, or visiting a client/supplier abroad and using their desktop, etc.)...so ONLY use content negotiation on index.html in each language and direct all internal links to a home page to default.html
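To give an idea of the matching that content negotiation performs, here is a minimal sketch in Python of choosing a language from the browser's `Accept-Language` header. The function name `pick_language`, the list of offered languages and the fallback are all assumptions for illustration; a real web server does this for you and handles more cases than this.

```python
from typing import Sequence


def pick_language(accept_language: str, offered: Sequence[str], default: str = "en") -> str:
    """Pick the best offered language for an Accept-Language header value."""
    candidates = []
    for position, item in enumerate(accept_language.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        tag = tag.strip().lower()
        quality = 1.0  # no q parameter means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # Sort by highest quality first, then by order in the header.
        candidates.append((-quality, position, tag))

    by_lower = {name.lower(): name for name in offered}
    for neg_quality, _, tag in sorted(candidates):
        if neg_quality == 0:
            continue  # q=0 means "not acceptable"
        if tag in by_lower:
            return by_lower[tag]
        primary = tag.split("-")[0]  # e.g. "th-TH" falls back to "th"
        if primary in by_lower:
            return by_lower[primary]
    return default


offered = ["en", "ja", "ko", "th", "zh-TW", "zh-CN"]
print(pick_language("zh-TW,zh;q=0.8,en;q=0.5", offered))
print(pick_language("ja;q=0.9,ko", offered))
```

In line with the advice above, something like this would only run for index.html; links to default.html would skip it so visitors can still choose a language by hand.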

use a language switching page that covers all the languages offered...on that you need the two letter language code, the type of encoding, the name of the language in that language and in English, and a short piece of descriptive text as spider food
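A rough sketch of the data behind such a switching page, assuming Python: each entry carries the two-letter language code, the encoding, the language name in that language and the name in English. The URL layout and the particular encodings listed are hypothetical, and the short descriptive text per language is left out for brevity.

```python
from html import escape

# (code, encoding, name in that language, name in English)
# The entries and encodings are illustrative assumptions.
LANGUAGES = [
    ("ja", "UTF-8", "日本語", "Japanese"),
    ("ko", "UTF-8", "한국어", "Korean"),
    ("th", "UTF-8", "ไทย", "Thai"),
    ("id", "UTF-8", "Bahasa Indonesia", "Indonesian"),
    ("zh", "Big5", "繁體中文", "Chinese (Traditional)"),
    ("zh", "GB2312", "简体中文", "Chinese (Simplified)"),
]


def switch_links(languages) -> str:
    """Return an HTML list with one link per language version."""
    items = []
    for code, encoding, native, english in languages:
        # Hypothetical path scheme; links go to default.html, not index.html.
        href = f"/{code}-{encoding.lower()}/default.html"
        items.append(
            f'<li><a href="{escape(href)}" hreflang="{code}" lang="{code}">'
            f"{escape(native)}</a> ({escape(english)}, {code}, {escape(encoding)})</li>"
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


print(switch_links(LANGUAGES))
```

The switching page itself would need to be served as UTF-8 so that all the native names can appear together on one page.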


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved