Character encoding for a website in Chinese

UTF-8 or GB2312?

         

sabine

12:32 pm on Mar 18, 2006 (gmt 0)

10+ Year Member



I am developing a Chinese (Mandarin) version of a site.
I got it working with the UTF-8 charset. I chose UTF-8 because the client delivered the text in that encoding. I know there is another charset, gb2312, which is also used by a lot of sites, among them Google on IE.
Could anyone who has experience working with them advise me on which is better?

At the moment I am coding the pages with Notepad. It is the only editor at my disposal that lets me import text in a particular encoding and save the pages as UTF-8 or another encoding. Is there a better editor?

Thanks for any help.

encyclo

1:35 am on Mar 20, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm a big fan of UTF-8 for everything, but the expert opinion is that for Chinese it is often better to stick to GB2312 (or Big5 for Traditional Chinese). Here are a few earlier threads on the topic:

  • UTF-8 or GB / Big5 encodings? [webmasterworld.com]
  • What do I need to do to create a Chinese website? [webmasterworld.com]

    As for the editor question, Notepad can have problems when producing UTF-8 encoded content for the web: it has a tendency to add a byte-order mark (BOM) to the page, which can confuse browsers. Most modern web-oriented text editors (Homesite, Dreamweaver...) handle UTF-8 correctly.
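If a file has already been saved with a BOM, it can be checked and cleaned up outside the editor. The following is only a minimal sketch in Python, not something taken from this thread: the function name `strip_utf8_bom` and the sample markup are made up for illustration. It relies on the fact that the UTF-8 BOM is the three bytes EF BB BF at the very start of the file.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte-order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Simulate what an editor that adds a BOM would write:
# the three BOM bytes followed by the UTF-8 encoded page.
notepad_output = codecs.BOM_UTF8 + "<p>中文</p>".encode("utf-8")

# The cleaned bytes start directly with the markup.
cleaned = strip_utf8_bom(notepad_output)
```

Running the function on a file that has no BOM returns the bytes unchanged, so it should be safe to apply to every page before upload.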

    redstorm

    1:09 pm on Mar 20, 2006 (gmt 0)

    10+ Year Member



    I suggest you choose gb2312, which is used by most websites in mainland China.

    sabine

    3:55 pm on Mar 26, 2006 (gmt 0)

    10+ Year Member



    Excuse me for coming back late to the thread I started. First I would like to thank everyone for the help offered.
    I understand that gb2312 or Big5 is a safer way to encode for mainland China. Is that still the case for the 5.0 generation of browsers? (The site will not be developed for older browsers.)

    My final questions, and sorry if they seem stupid: Word encodes the Chinese text as UTF-8. Is there a way to save it in Word 2000 and later as Big5 or gb2312? If not, is producing the Chinese text in TextEditor an alternative?

    thanks Sabine

    leunga

    3:08 pm on Mar 30, 2006 (gmt 0)

    10+ Year Member



    Hi sabine,
    In case Word can't do it, try changing the text encoding in FrontPage's page properties.
    leunga
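For anyone who would rather convert files with a script than through an editor's settings, here is a rough sketch in Python. It is an assumption about one possible workflow, not a method anyone in this thread used: the function name `transcode` and the sample text are hypothetical. It decodes the UTF-8 bytes to text and re-encodes that text as gb2312.

```python
def transcode(data: bytes, source: str = "utf-8", target: str = "gb2312") -> bytes:
    """Decode bytes from the source encoding and re-encode them in the target.

    Raises UnicodeEncodeError if the text contains a character
    that the target encoding cannot represent.
    """
    text = data.decode(source)
    return text.encode(target)


# Four Simplified Chinese characters: 3 bytes each in UTF-8, 2 bytes each in gb2312.
utf8_bytes = "中文网站".encode("utf-8")
gb_bytes = transcode(utf8_bytes)

# Decoding the result with the target encoding gives back the same text.
round_trip = gb_bytes.decode("gb2312")
```

After converting a page this way, the charset named in the page's meta tag and in the server's header has to be changed to match, otherwise browsers will decode the bytes with the wrong table.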

    sabine

    10:15 pm on Apr 2, 2006 (gmt 0)

    10+ Year Member



    Again, thanks to everyone who helped me out. The site is online now. Although there are arguments against it, I have chosen the UTF-8 encoding. It is a business site which does not need to be that backwards compatible. The future will show whether I have to change that.
    TextEdit was a great editor for a site like this, because of the option not to add the BOM (byte-order mark) for UTF-8. Many thanks for pointing me to that editor.
    So in the end I could get the pages validated against the W3 validator.
    There is one question left: what is the equivalent to Verdana in the Chinese letter-fonts? Is it SimSun?

    59ideas

    7:04 pm on Apr 8, 2006 (gmt 0)

    10+ Year Member



    Most modern browsers have no problem with pages encoded in either utf-8 or gb2312. Reading the text should not be a problem, as a Chinese user can view both encodings.

    However, you might want to take note if you are developing on a server that is set up for English.

    Check whether your Apache (or other web server) is sending out the right charset header.

    Also, if you are receiving input, check the charset of your database storage.

    If you are sending out email, check the charset header of your email.
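The header check above can be scripted. What follows is a minimal sketch in Python, offered as an illustration only: the function names, the sample header and the sample page are all hypothetical. It pulls the charset out of a Content-Type header value and out of the page's meta tag, and reports a conflict when both are declared and they differ, which is the typical symptom of an English-configured server sending a Western charset for a Chinese page.

```python
import re
from email.message import Message
from typing import Optional

# Matches charset=... inside a <meta> tag, with or without quotes around the value.
META_CHARSET = re.compile(rb'<meta[^>]+charset=["\']?([\w-]+)', re.IGNORECASE)


def charset_from_header(content_type: str) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def charset_from_meta(html: bytes) -> Optional[str]:
    """Return the lowercased charset declared in a <meta> tag, or None."""
    match = META_CHARSET.search(html)
    return match.group(1).decode("ascii").lower() if match else None


def charsets_conflict(content_type: str, html: bytes) -> bool:
    """True only when header and meta tag both declare a charset and they differ."""
    header = charset_from_header(content_type)
    meta = charset_from_meta(html)
    return header is not None and meta is not None and header != meta


# Hypothetical case: the page declares gb2312 but the server sends a Western charset.
page = b'<html><head><meta http-equiv="Content-Type" content="text/html; charset=gb2312"></head></html>'
server_header = "text/html; charset=ISO-8859-1"
conflict = charsets_conflict(server_header, page)
```

The header value itself would come from whatever tool you use to inspect the server's response. When the two declarations disagree, browsers generally follow the HTTP header, so the server configuration is the place to fix it.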

    bill

    4:52 am on Apr 10, 2006 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    There is one question left: what is the equivalent to Verdana in the Chinese letter-fonts. Is it SimSun?

    On my Chinese and Japanese sites I have stayed away from specifying font names altogether. I may enhance text with font size, italics or bold, but not fonts themselves. I use fonts in Chinese graphics, but not on the web page's text for the most part.

    Do any of you bother declaring fonts on Asian language sites? (Maybe I'm just the lazy one.)

    leunga

    10:46 am on Apr 10, 2006 (gmt 0)

    10+ Year Member



    Hi Bill, I don't bother with fonts on Chinese websites either. It appears to me that the declaration is very often useless because the client browser often lacks support for the font. So I think most people don't set fonts and let the browser use its defaults.
    leunga