
Forum Library, Charter, Moderators: incrediBILL

HTML Forum

UTF8 with Chinese and Russian

5+ Year Member

Msg#: 4225976 posted 2:14 pm on Nov 3, 2010 (gmt 0)

I'm developing a multi-lingual website at the moment that has Russian and Chinese versions of the copy. However, I can't get my browser to render their fonts.

I can see Chinese and Russian on other websites but not the one I've created. I simply get a series of question marks.

I have changed my charset to UTF-8, but to no avail. Am I doing something wrong here? Here's a snippet of my code:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<title>Lektronix - CHINESE TEXT HERE</title>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
<meta http-equiv="Content-Language" content="CN">
<meta name="author" content="Tom Cash">
<link rel="shortcut Icon" href="../icon.ico" type="image/x-icon">
<link href="../-shared/css/stylesheet.css" rel="stylesheet" type="text/css">

Thanks in advance,



WebmasterWorld Senior Member lammert, a WebmasterWorld Top Contributor of All Time, 5+ Year Member

Msg#: 4225976 posted 8:27 am on Nov 4, 2010 (gmt 0)

The UTF-8 meta tag you have in your HTML source may be overridden by a Content-Type header sent by the web server software. HTTP headers sent before the HTML source take precedence over meta tags. You can check this with a server header checker. There are tool websites which offer that functionality (WebmasterWorld has such a header checker in the subscription area, for example), or you can use the Live HTTP Headers add-on for Firefox. The way to change this header differs depending on the server software (Apache, IIS, Nginx) you use.
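As a rough illustration of the precedence described above, here is a minimal Python sketch. The function names (`charset_from_content_type`, `effective_charset`, `fetch_content_type`) are made up for this example, and the precedence logic is a simplified model of what a browser does, not a full implementation of its encoding detection.

```python
from email.message import Message
from typing import Optional
from urllib.request import Request, urlopen


def charset_from_content_type(header_value: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()  # lower-cased charset, or None


def effective_charset(http_header: Optional[str],
                      meta_charset: Optional[str]) -> Optional[str]:
    """Simplified model: a charset in the HTTP header wins over the meta tag."""
    header_charset = charset_from_content_type(http_header) if http_header else None
    return header_charset or meta_charset


def fetch_content_type(url: str) -> Optional[str]:
    """Ask the server for its Content-Type header using a HEAD request."""
    request = Request(url, method="HEAD")
    with urlopen(request) as response:
        return response.headers.get("Content-Type")
```

If the server sends `text/html; charset=ISO-8859-1`, then `effective_charset` reports `iso-8859-1` even when the page's meta tag says `utf-8`, which is the situation the post warns about.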


5+ Year Member

Msg#: 4225976 posted 11:21 pm on Nov 5, 2010 (gmt 0)

Also - make sure that you save the actual HTML text file as UTF-8. By default, many text editors save as Latin-1 or Mac OS Roman. You can change it in 'preferences' (it may be called something different in Windows) of your text editor. If you don't save the file as UTF-8, then even with the declaration in your file header, it won't work.
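To see why a wrongly saved file produces a row of question marks, here is a small Python sketch. The sample string and the helper `is_valid_utf8` are assumptions chosen for illustration; the real page content will differ.

```python
# Sample text standing in for the Chinese and Russian copy (an assumption).
text = "中文 Русский"

# Latin-1 has no code points for these characters, so a lossy save
# replaces each one with "?" and the original text cannot be recovered.
lossy_bytes = text.encode("latin-1", errors="replace")

# UTF-8 can represent every character and round-trips without loss.
utf8_bytes = text.encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(lossy_bytes)                    # b'?? ???????'
print(utf8_bytes.decode("utf-8"))     # 中文 Русский
```

Once the characters have been replaced during a lossy save, declaring UTF-8 afterwards does not bring them back; the text has to be re-entered and saved as UTF-8.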


5+ Year Member

Msg#: 4225976 posted 8:33 am on Nov 10, 2010 (gmt 0)

Thanks for your help, guys. I'll implement these ideas when I'm next on this project and post my results.

Much appreciated.


5+ Year Member

Msg#: 4225976 posted 3:32 pm on Nov 12, 2010 (gmt 0)

No... Actually, Commander - it worked! It had just converted the foreign characters to funny little things. Having re-inserted them, it works a treat!

Thanks for the help!

One more question though, please.

What's the difference between saving with and without a BOM (byte order mark)? Which one should I use?


WebmasterWorld Senior Member lammert, a WebmasterWorld Top Contributor of All Time, 5+ Year Member

Msg#: 4225976 posted 4:45 pm on Nov 12, 2010 (gmt 0)

The BOM (for UTF-8, the three bytes EF BB BF at the beginning of a file, which identify the encoding) causes text editors to automatically switch to the right encoding setting when a file is loaded from disk. But I have seen some strange browser behaviour when serving pages with a BOM in them. For web serving purposes, it is therefore better to leave it out.
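A short Python sketch of checking for and removing a UTF-8 BOM follows. The helper names `has_utf8_bom` and `strip_utf8_bom` are hypothetical; `codecs.BOM_UTF8` and the `utf-8-sig` codec are part of the standard library.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM, if one is present."""
    return data[len(codecs.BOM_UTF8):] if has_utf8_bom(data) else data


# "utf-8-sig" writes a BOM when encoding and skips one when decoding.
with_bom = "<html>".encode("utf-8-sig")
without_bom = strip_utf8_bom(with_bom)
```

Running `strip_utf8_bom` over a file's bytes before uploading is one way to get the "UTF-8 without BOM" form when an editor does not offer that option directly.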


5+ Year Member

Msg#: 4225976 posted 5:07 am on Nov 13, 2010 (gmt 0)

Glad I could be of assistance -

UTF-8 does not require a BOM, and in some browsers a BOM will give you a blank line or funny little things at the beginning of a document. So it's best not to use it. Save your files as UTF-8 with no BOM.

Read this - [w3.org ]

or this - [w3.org ]

and since you are working heavily with international alphabets, try to find the time to read all of this - [w3.org ]

If you are using Windows, you may need to know that:

"A particular protocol (e.g. Microsoft conventions for .txt files) may require use of the BOM on certain Unicode data streams, such as files. When you need to conform to such a protocol, use a BOM."

from here [unicode.org ]


5+ Year Member

Msg#: 4225976 posted 9:49 pm on Nov 14, 2010 (gmt 0)

I would like to add that when you set your text editor to save as UTF-8, you also have to set the encoding it uses to open files. If you choose UTF-8 for opening as well, then your editor may not open Latin-1 or Mac OS Roman encoded files. In this case you get a 'cannot open file' warning.

Text Editor on OS X has an 'Automatic' option for opening files. This opens any file.
But then remember that your text editor will still save all files, and convert all files that you open, to UTF-8.

So if you suddenly have trouble opening files after changing both the 'open as' and 'save as' settings in your text editor to UTF-8, just go in and change those settings back to default or play around with them. Usually you will just have to reset the text editor to open files as Latin-1 if you run across a file that won't open, since that is the more common default encoding. Then switch back to UTF-8 when you need to open the UTF-8 files again.

But I recommend using the setting automatic for open, and UTF-8 for save.
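The "automatic for open, UTF-8 for save" approach can be sketched in Python roughly as below. This is an assumption about how such a setting might behave, not how any particular editor implements it: it tries UTF-8 first and falls back to Latin-1, and it cannot tell Latin-1 from Mac OS Roman. The names `decode_auto` and `to_utf8_no_bom` are invented for the example.

```python
from typing import Tuple


def decode_auto(data: bytes, fallback: str = "latin-1") -> Tuple[str, str]:
    """Decode as UTF-8 if the bytes are valid UTF-8, else use the fallback.

    Returns the decoded text and the name of the encoding that was used.
    """
    try:
        # "utf-8-sig" also accepts files that start with a BOM and drops it.
        return data.decode("utf-8-sig"), "utf-8"
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this never fails,
        # but the result is only right if the file really was Latin-1.
        return data.decode(fallback), fallback


def to_utf8_no_bom(data: bytes) -> bytes:
    """Open 'automatically', then save as UTF-8 without a BOM."""
    text, _ = decode_auto(data)
    return text.encode("utf-8")
```

For a file known to be Mac OS Roman, passing `fallback="mac_roman"` to `decode_auto` would be the equivalent of temporarily changing the editor's open setting.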


5+ Year Member

Msg#: 4225976 posted 9:58 am on Nov 15, 2010 (gmt 0)

Wow, thanks for such comprehensive replies peeps. Really appreciate it.

I have already begun reading through those pages, Commander. Thanks!

Really can't express my gratitude enough. It really had me stumped!


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved