I should add that this happens on every page I've tried, no matter the content. I know the HTML is correct.
Is it something to do with my host? My FTP software (although I've never seen this before)? My domain registrar?
None of them are Chinese!
First, the TOS prohibit personal URLs. However, the source code of the page home.html looked like this (abbreviated)...
The site url returns a directory list. To avoid this, you need a file with a name such as index.html, but other names are acceptable depending on server configuration.
Your server is returning this in its response header:
Content-Type:text/html (BOM UTF-16, little-endian)
You might want to be using UTF-8 as a character encoding, but I've no idea what BOM UTF-16 is.
Seeing as your server is based in the United States (according to IP address geolocation), I've no idea why it would be sending such an odd Content-Type header.
Check what character encoding you're using in your HTML editor. If it doesn't look like that's to blame, then ask your server admin why the web server is specifying such an odd encoding.
I've contacted my host but they are pretty useless. Is it my host or my registrar that needs to deal with this?
The only editor I use is Notepad.
Would it help if I added this UTF-8 code somewhere? Sorry, I'm quite ignorant on this!
Here's the line I use.
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
Put that just after the <head> tag and before the <title> tag.
Note the semicolon you are missing. Just cut and paste from this posting.
I don't know where you got that other content type. I would remove it as a test. -Larry
Thanks for all the advice
I tried putting in that code, but after I asked my host why it wasn't showing up, they replaced it with:
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
His response was:
The problem is in the headers of your page, you are encoding the page as UTF-8, most of the UTF-8 characters are Chinese.
I have changed the encoding of the page to iso-8859-1 it now functions correctly.
That works. However, I've tried it on other pages and they are still showing Chinese symbols. Is this simply a problem with my host? I'm on a 14 day trial and am not impressed so far. Would you consider trying someone else?
BOM UTF-16 suggests you might have a Byte Order Mark on the front of your file.
You need to save the files UTF-8 without BOM and to do this you need an editor that gives that option.
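As a rough sketch of what "UTF-8 without BOM" means at the byte level, the Python below checks the start of a file's bytes for a byte order mark and re-encodes the text as plain UTF-8. The function names detect_bom and to_utf8_without_bom are made up for this example, and the fallback to iso-8859-1 for files with no BOM is an assumption, not something your editor or host does.

```python
import codecs

# BOM signatures paired with the Python codec for the bytes that follow.
# UTF-32 is deliberately left out of this sketch.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def detect_bom(data: bytes):
    """Return (codec_name, bom_length), or (None, 0) if no BOM is present."""
    for bom, codec in BOMS:
        if data.startswith(bom):
            return codec, len(bom)
    return None, 0

def to_utf8_without_bom(data: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Decode using the BOM if there is one, then re-encode as BOM-less UTF-8."""
    codec, bom_len = detect_bom(data)
    # The fallback codec is an assumption for files that carry no BOM.
    text = data[bom_len:].decode(codec or fallback)
    return text.encode("utf-8")
```

To try it on a real page, read the file with open(path, "rb").read(), pass the bytes through to_utf8_without_bom, and write the result back with mode "wb". Keep a copy of the original first.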
Several editors are listed on Alan Wood’s Unicode Resources - Multilingual Editors [alanwood.net]. I'd recommend EmEditor, having used it.
Some editors treat "UTF" as meaning UTF-16 and not UTF-8. WordPad on Windows 98 is one that does that.
I expect the data within the file is then read as 16-bit characters, many of which are Chinese.
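Here is a small Python illustration of why a 16-bit misreading tends to produce Chinese-looking text. Which way round the mismatch runs on your pages isn't clear from this thread, so the sketch simply assumes single-byte text being read as little-endian UTF-16. The helper is_cjk is made up for the example and only covers two CJK blocks.

```python
# Plain ASCII bytes, as an editor would write them for an ANSI file.
raw = b"<html>"

# Misreading them as little-endian UTF-16 fuses each pair of bytes into
# one 16-bit code unit: 3C 68 -> U+683C, 74 6D -> U+6D74, 6C 3E -> U+3E6C.
misread = raw.decode("utf-16-le")

def is_cjk(ch: str) -> bool:
    """True for CJK Unified Ideographs and CJK Extension A."""
    return 0x4E00 <= ord(ch) <= 0x9FFF or 0x3400 <= ord(ch) <= 0x4DBF

print([f"U+{ord(ch):04X}" for ch in misread])  # ['U+683C', 'U+6D74', 'U+3E6C']
print(all(is_cjk(ch) for ch in misread))       # True
```

Two ordinary Latin letters have byte values in roughly the 0x40 to 0x7A range, so a pair of them lands in the middle of the CJK blocks, which is why the page looks Chinese rather than like random symbols.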
Alternatively, if you're not using characters outside iso-8859-1 you could simply save all your files as ASCII or iso-8859-1.
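If you're not sure whether a page sticks to iso-8859-1, one quick way to check is to try encoding its text and see whether that fails. This is only a sketch; the function name fits_encoding is invented for the example.

```python
def fits_encoding(text: str, encoding: str) -> bool:
    """Return True if every character in text can be stored in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

print(fits_encoding("café", "iso-8859-1"))  # True
print(fits_encoding("café", "ascii"))       # False
print(fits_encoding("€5", "iso-8859-1"))    # False, no euro sign in iso-8859-1
```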
I use Notepad. I've never had this problem before - why now?
Check the encoding you are saving as. Notepad offers several flavours: ANSI, Unicode, Unicode big endian and UTF-8. Maybe the default changed from ANSI to Unicode big endian?
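To see how different those flavours are on disk, this Python sketch writes out the bytes each choice would plausibly produce for the same three characters. The mapping is an assumption based on how older Windows versions of Notepad behave: ANSI taken as Windows-1252 (true on Western European systems only), Unicode as UTF-16 little-endian with a BOM, Unicode big endian as UTF-16 big-endian with a BOM, and UTF-8 saved with a BOM. The function name notepad_bytes is made up.

```python
import codecs

def notepad_bytes(text: str) -> dict:
    """Bytes each Save As encoding choice would plausibly write (assumed mapping)."""
    return {
        "ANSI": text.encode("cp1252"),
        "Unicode": codecs.BOM_UTF16_LE + text.encode("utf-16-le"),
        "Unicode big endian": codecs.BOM_UTF16_BE + text.encode("utf-16-be"),
        "UTF-8": text.encode("utf-8-sig"),  # utf-8-sig prepends the UTF-8 BOM
    }

for name, data in notepad_bytes("<p>").items():
    print(f"{name:20} {data.hex(' ')}")
```

Under that mapping, the "Unicode" choice is the one that starts with FF FE, the little-endian UTF-16 byte order mark.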
It is not a problem with your host, it is a problem with your source code.
I don't see what the fuss is about. Change every page to reflect the correct encoding and that's it done.
If you ask me the host has been kind enough to locate the problem and tell you exactly where you have gone wrong.
Thanks for all your help. Got it now - I was saving the file in Notepad incorrectly.