| 8:27 am on Nov 4, 2010 (gmt 0)|
The UTF-8 meta tag in your HTML source may be overruled by a Content-Type header sent by the web server software. HTTP headers, which are sent before the HTML source, take precedence over meta tags. You can check this with a server header checker. There are tool websites which offer that functionality (WebmasterWorld has such a header checker in the subscription area, for example), or you can use the Live HTTP Headers add-on for Firefox. The way to change this header differs depending on the server software (Apache, IIS, Nginx) you use.
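To illustrate the precedence rule, here is a rough Python sketch. The function names `charset_from_content_type` and `effective_charset` are made up for this example, and the header values are sample strings rather than output from a real server. It only models the "header beats meta tag" rule described above, not everything a browser does.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(header_value: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the charset in lower case, or None if absent.
    return msg.get_content_charset()


def effective_charset(header_value: Optional[str],
                      meta_charset: Optional[str]) -> Optional[str]:
    """Hypothetical helper: the HTTP header wins over the meta tag."""
    header_charset = charset_from_content_type(header_value) if header_value else None
    if header_charset:
        return header_charset
    # The meta tag only matters when the header names no charset.
    return meta_charset.lower() if meta_charset else None


# The server says ISO-8859-1, so the UTF-8 meta tag is overruled.
print(effective_charset("text/html; charset=ISO-8859-1", "utf-8"))  # iso-8859-1
# The header has no charset parameter, so the meta tag is used.
print(effective_charset("text/html", "utf-8"))  # utf-8
```

If the first case matches what your header checker shows, the fix belongs in the server configuration, not in the HTML.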
| 11:21 pm on Nov 5, 2010 (gmt 0)|
Also, make sure that you save the actual HTML text file as UTF-8. By default, most text editors save as Latin-1 or Mac OS Roman. You can change it in the 'preferences' of your text editor (it may be called something different in Windows). If you don't save the file as UTF-8, then even with the declaration in your file header, it won't work.
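A small Python sketch of why the declaration alone is not enough: the same text produces different bytes in Latin-1 and in UTF-8, and Latin-1 bytes for accented characters are usually not valid UTF-8. The helper name `is_valid_utf8` and the sample word are just for illustration.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


text = "caf\u00e9"  # "café", written with an escape to keep this source pure ASCII

latin1_bytes = text.encode("latin-1")  # e-acute is the single byte E9
utf8_bytes = text.encode("utf-8")      # e-acute is the two bytes C3 A9

print(is_valid_utf8(latin1_bytes))  # False
print(is_valid_utf8(utf8_bytes))    # True

# What a browser effectively does with Latin-1 bytes declared as UTF-8:
# the undecodable byte turns into the replacement character U+FFFD.
print(latin1_bytes.decode("utf-8", errors="replace") == "caf\ufffd")  # True
```

You can run the same check on a real file by reading it in binary mode and passing the bytes to the function.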
| 8:33 am on Nov 10, 2010 (gmt 0)|
Thanks for your help, guys. I'll implement these ideas when I'm next on this project and post my results.
| 3:32 pm on Nov 12, 2010 (gmt 0)|
No... Actually, Commander - it worked! It had just converted the foreign characters to funny little things. Having re-inserted them it works a treat!
Thanks for the help!
One more question though, please.
What's the difference between saving with and without a BOM (byte order mark)? Which one should I use?
| 4:45 pm on Nov 12, 2010 (gmt 0)|
The BOM (three bytes at the beginning of a file, EF BB BF for UTF-8, which identify the encoding) causes text editors to automatically switch to the right encoding setting when a file is loaded from disk. But I have seen some strange browser behaviour when serving pages with a BOM in them. For web serving purposes, it is therefore better to leave it out.
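If you want to check whether a file already carries a BOM, here is a minimal Python sketch. The UTF-8 BOM is the byte sequence EF BB BF, which Python exposes as `codecs.BOM_UTF8`. The helper names `has_utf8_bom` and `strip_utf8_bom` are made up for this example.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM, if there was one."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


# The "utf-8-sig" codec writes a BOM when encoding; plain "utf-8" does not.
with_bom = "<html>".encode("utf-8-sig")
without_bom = "<html>".encode("utf-8")

print(has_utf8_bom(with_bom))                    # True
print(has_utf8_bom(without_bom))                 # False
print(strip_utf8_bom(with_bom) == without_bom)   # True
```

For a real page, read the file in binary mode, pass the bytes through `strip_utf8_bom`, and write the result back.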
| 5:07 am on Nov 13, 2010 (gmt 0)|
Glad I could be of assistance -
UTF-8 does not require a BOM, and in some browsers it will give you a blank line or funny little things at the beginning of a document. So it's best not to use it. Save your files as UTF-8 with no BOM.
Read this - [w3.org ]
or this - [w3.org ]
and since you are working heavily with international alphabets, try to find the time to read all of this - [w3.org ]
If you are using Windows you may need to know that
|A particular protocol (e.g. Microsoft conventions for .txt files) may require use of the BOM on certain Unicode data streams, such as files. When you need to conform to such a protocol, use a BOM. |
from here [unicode.org ]
| 9:49 pm on Nov 14, 2010 (gmt 0)|
I would like to add that when you set your text editor to save as UTF-8, you also have to set the encoding it uses to open files. If you choose UTF-8 for that as well, then your editor may not open Latin-1 or Mac OS Roman encoded files. In this case you get a 'cannot open file' warning.
Text Editor on OS X has the open option Automatic. This opens any file.
But then remember that your text editor will still save all files, and convert all files that you open, to UTF-8.
So if you suddenly have trouble opening files after changing both the 'open as' and 'save as' settings in your text editor to UTF-8, just go in and change those settings back to default or play around with them. Usually you will just have to reset the text editor to open files as Latin-1 if you run across a file that won't open, since that is the more common default encoding. Then reset back to UTF-8 when you need to open the UTF-8 files again.
But I recommend using the setting automatic for open, and UTF-8 for save.
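The "automatic for open, UTF-8 for save" idea can be sketched in Python like this. It is a simplification of what an editor does: it tries UTF-8 first and falls back to Latin-1, and it cannot tell Latin-1 from Mac OS Roman. The names `decode_auto` and `convert_to_utf8` are made up for this example.

```python
def decode_auto(data: bytes) -> str:
    """Open 'automatically': try UTF-8, then fall back to Latin-1.

    Latin-1 accepts every byte sequence, so the fallback never fails,
    but it is a guess (assumption: the file is not Mac OS Roman).
    """
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data.decode("latin-1")


def convert_to_utf8(data: bytes) -> bytes:
    """Whatever encoding was read, always save as UTF-8 (without a BOM)."""
    return decode_auto(data).encode("utf-8")


old_file = "caf\u00e9".encode("latin-1")   # a legacy Latin-1 file
new_file = "caf\u00e9".encode("utf-8")     # an already converted file

print(convert_to_utf8(old_file) == new_file)  # True
print(convert_to_utf8(new_file) == new_file)  # True
```

Files that are already UTF-8 pass through unchanged, so running the conversion twice does no harm.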
| 9:58 am on Nov 15, 2010 (gmt 0)|
Wow, thanks for such comprehensive replies peeps. Really appreciate it.
I have already begun reading through those pages, Commander. Thanks!
Really can't express my gratitude enough. It really had me stumped!