Maybe I've had too much Christmas cheer...but it seems that you have a GB2312 page and perhaps too much UTF8 data.
GB2312 is the recommended format for Chinese sites. It's the most compatible with the widest variety of Chinese browsers.
You can get converters which do this, typically by pasting the original text into the converter and then copying out the result. I use NJStar Word Processor to create the text and this has a facility for copying directly into GB2312 or Unified-simplified. (Plus also Big5 and Unicode-traditional.)
One point to note is that GB2312 stores each Chinese character as two bytes. So if a single byte is missing from the start of the text, every following pair is misaligned and the result can look like Chinese but is actually garbage.
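A minimal Python sketch of that failure, assuming the text is held as GB2312 bytes. The sample string is only an illustration, not text from the site in question:

```python
# Sample text is an assumption for illustration; any GB2312 text behaves similarly.
text = "中文网站"
raw = text.encode("gb2312")  # two bytes per Chinese character

# Drop the first byte, so every following byte pair is misaligned.
# errors="replace" substitutes a marker for pairs that are not valid GB2312.
shifted = raw[1:].decode("gb2312", errors="replace")

print(len(text), len(raw))  # character count vs. byte count
print(shifted)              # Chinese-looking, but not the original text
```

The decoder has no way of knowing a byte went missing, which is why the damaged text can still look like plausible Chinese.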
First, thanks to everyone who replied but a special big thank you to those who took the time to read and think about what my real problem was and offer a practical solution.
I put this on several sites and got a lot of help.
Actually HarryM on this board was on the right track.
It was from his comments that I finally found the answer.
At the risk of going on I will detail what happened in case anyone else has the problem in the future.
The original English was typed in Gedit on Linux using charset ISO-8859-1.
The Chinese was typed in the Chinese edition of Word on a native Chinese Windows XP O/S, then transferred to me via flash disk.
The Chinese site charset was originally UTF-8.
When it became apparent that the majority of users could not read the Chinese I switched the code to GB2312 but that did not cure the problem.
From here a number of Chinese designers offered help, but could not solve the problem, some even rewrote the text again, and again transferred it to me.
The closest was a suggestion that I needed to put an instruction in the meta code that the body text was GB2312, but as to how, no one was sure.
This was actually the key, but I missed that signpost.
In the end the solution was to use "SAVE AS" when editing and select the GB2312 encoding, easy in Linux.
In Win 2003 Server it was suggested to use Star WP, which proved a no go for me because when I tried to convert, the text became a series of ? characters, making editing impossible, and it also uploaded to my site via FTP as ? characters.
Not the look I wanted.
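One plausible cause of the question marks (an assumption, not something confirmed in this thread) is a conversion into a character set that has no Chinese characters at all, such as ISO-8859-1. A small Python sketch of that effect, using sample text chosen only for illustration:

```python
text = "中文"  # sample text, an assumption for illustration

# ISO-8859-1 cannot represent Chinese, so each character is replaced with "?".
lossy = text.encode("iso-8859-1", errors="replace")
print(lossy)

# The original characters are gone; decoding cannot bring them back.
print(lossy.decode("iso-8859-1"))
```

Once the text has been through a step like this, the file really does contain question marks, so uploading it by FTP will show the same thing.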
In the end I downloaded a free to trial [sorry!] copy of EC character encoding software which was ridiculously simple and effective.
I recommend this and will buy it if I ever have to do another Chinese site and need to use Windows.
The only hassle, if you can call it that, is that once the text is converted, either by Linux or by the EC converter, the scribble in the editor LOOKS Chinese but is actually garbage, BUT, miraculously, it displays as real Putonghua on the site.
So any subsequent editing means a reconvert back to UTF-8 or Big5, then a convert back to GB2312 before uploading. There may be, surely must be, a better way, but I was just happy after 6 weeks to have found a solution.
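For anyone who would rather script that round trip, here is a rough Python sketch. It assumes you keep a UTF-8 working copy for editing and only produce the GB2312 copy for upload. The file names and the `convert` helper are hypothetical, and the garbage in the editor is most likely just the editor opening the GB2312 file with the wrong encoding, though that is a guess:

```python
from pathlib import Path
import tempfile


def convert(src: Path, dst: Path, src_enc: str, dst_enc: str) -> None:
    """Re-encode a text file from src_enc to dst_enc (hypothetical helper)."""
    dst.write_text(src.read_text(encoding=src_enc), encoding=dst_enc)


workdir = Path(tempfile.mkdtemp())
editable = workdir / "page_utf8.html"    # hypothetical working copy for editing
upload = workdir / "page_gb2312.html"    # hypothetical copy to upload

editable.write_text("<p>欢迎</p>", encoding="utf-8")

convert(editable, upload, "utf-8", "gb2312")   # before uploading
convert(upload, editable, "gb2312", "utf-8")   # before the next edit

print(editable.read_text(encoding="utf-8"))
```

A character that GB2312 cannot represent would make the first conversion raise an error rather than silently turn into garbage.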
Maybe someone far more clever than me can add to this for future users.
What did puzzle me for a while was why both the original Chinese script, typed on a Chinese machine, and the copy made by a Chinese designer [which, I forgot to say, ran faultlessly on his site when he trialled it] collapsed when I ran them on mine.
The only thing I can think of is that at the time of saving to flash disk it was fine, but during the loading onto my English O/S it was converted to Big5 and from there it all went wrong.
Again, this is just a guess.
So, the answer is that as well as having charset gb2312 stated in the meta code, the body text itself must also be saved in GB2312, otherwise, chaos.
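To make that concrete, here is a short Python sketch of a page whose declared charset and actual saved bytes agree, next to one where they do not. The page content is made up for illustration, and the meta tag shown is the usual HTML 4 style declaration rather than anything quoted from the site:

```python
# Hypothetical page; the meta tag declares the encoding the bytes are saved in.
html = (
    "<html><head>\n"
    '<meta http-equiv="Content-Type" content="text/html; charset=gb2312">\n'
    "<title>示例</title>\n"
    "</head><body><p>欢迎光临</p></body></html>\n"
)

# Declared GB2312 and saved as GB2312: decoding gives back the same text.
page_bytes = html.encode("gb2312")
print(page_bytes.decode("gb2312") == html)

# Declared GB2312 but saved as UTF-8: the Chinese does not survive decoding.
mismatched = html.encode("utf-8").decode("gb2312", errors="replace")
print(mismatched == html)
```

The second case is roughly what happens when only the meta tag is changed and the file itself is left in its old encoding.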
And don't try to do it at 3 AM after 6 weeks of hassle when the mind is fuzzy and convert EVERYTHING on the page to GB2312 as I first did, then wonder why the bloody page is in Chinese OK but won't display [can I say the B... word?]
The HTML code doesn't like to be set in GB2312!
Once I worked that out it was, relatively, plain sailing.
So again, cheers to all.
Happy New Year!