Back in the days of the Netscape browser I had some real problems getting older versions to handle EUC-JP encoding. None of the modern PC or mobile browsers seem to have issues with EUC-JP that I've heard of.
And I even remember that, after I left a Japanese web agency, the new webmasters quickly changed the encoding of the site to EUC-JP, arguing that Shift_JIS can be buggy with a MySQL database (is that true?) and that I was very incompetent for using Shift_JIS.
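One commonly cited reason for that kind of concern (my assumption, not something the new webmasters are known to have said) is that some Shift_JIS characters have the byte 0x5C, the ASCII backslash, as their second byte, and byte-oriented escaping code can misread it. A minimal Python sketch using only the standard codecs; the function name is made up for illustration:

```python
# Check which encodings put the ASCII backslash byte (0x5C) inside a character.
BACKSLASH = 0x5C

def contains_backslash_byte(text: str, encoding: str) -> bool:
    """Return True if the encoded form of text contains the byte 0x5C."""
    return BACKSLASH in text.encode(encoding)

for ch in ("表", "ソ", "日"):
    for enc in ("shift_jis", "euc_jp", "utf-8"):
        # Print the character, the encoding, its bytes in hex, and the check.
        print(ch, enc, ch.encode(enc).hex(), contains_backslash_byte(ch, enc))
```

Only some characters are affected, and only in Shift_JIS; EUC-JP and UTF-8 never reuse ASCII byte values inside a multibyte character.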
And it looks like the majority of Japanese sites use EUC-JP now.
As to 'better' or 'worse' I think this is going to be like whether Fuji-san is better or worse than Mt Everest: it all depends on what you mean! B^>
Rgds
Damon
The majority of sites use EUC-JP instead of Shift_JIS.
Unfortunately, from what I've heard in the industry, UTF-8 is still more problematic than either EUC-JP or Shift_JIS. (That goes for Chinese encodings as well.) There are character display issues with PHP and MySQL, for instance, that are the bane of developers of Japanese sites. I'm still looking forward to the day when Unicode will truly be the best encoding solution. They're heading in the right direction.
From what I've read, I think I'll choose the following option:
- use UTF-8 for western languages
- use shift-JIS for Japanese
What do you think?
Also, I have another question:
I have developed my own multilingual CMS that uses the dedicated ISO encoding for every language. All the PHP files are in ANSI mode, the texts for the interface are taken from flat text files, and the website content is taken from MySQL databases. I've heard that it's necessary to convert the PHP files into UTF-8 for the encoding to work. Is that true? Even if the PHP files contain only code? And I have to convert the flat text files and the MySQL tables, right?
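On the question of files that contain only code: plain ASCII text has the same bytes in ISO-8859-1, Shift_JIS, EUC-JP and UTF-8, so a file with nothing but ASCII in it does not change when converted; only files with accented or Japanese characters do. A rough Python sketch of that check and of a conversion, where the sample strings and function names are invented for illustration:

```python
def is_ascii_only(data: bytes) -> bool:
    """True if every byte is below 0x80, i.e. the content is plain ASCII."""
    return all(b < 0x80 for b in data)

def convert_to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Decode bytes from the legacy encoding and re-encode them as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")

code_only = "<?php echo strlen($title); ?>".encode("ascii")
french_text = "Bienvenue à Tokyo".encode("iso-8859-1")
japanese_text = "ようこそ".encode("shift_jis")

# ASCII-only bytes come out of the conversion unchanged.
print(is_ascii_only(code_only), convert_to_utf8(code_only, "iso-8859-1") == code_only)
# Content with accented or Japanese characters does change.
print(is_ascii_only(french_text), convert_to_utf8(french_text, "iso-8859-1") == french_text)
print(is_ascii_only(japanese_text), convert_to_utf8(japanese_text, "shift_jis") == japanese_text)
```

The same decode-then-encode step applies to the flat text files; the database tables need their own conversion on the MySQL side, which this sketch does not cover.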
Sorry but I'm getting very confused with all those encoding issues :(
And the winner is:
[edited by: encyclo at 7:25 pm (utc) on July 1, 2007]
So the final tally is as follows:
- UTF-8 = 12
- Shift_JIS = 8
- EUC-JP = 5
I got a little lazy there. Thanks for keeping me on my toes. ;) I always tell people to check out your excellent thread: Character encoding, entity references and UTF-8 [webmasterworld.com]
I think I'll wait for a stable PHP6 before moving toward UTF-8.
What do you think?
A long time ago I used to create websites with the Japanese pages in Shift_JIS and the English pages in the normal Western ISO encoding. The backlinks from the English pages, which were more popular, had no effect on the Japanese pages. One day I changed everything to UTF-8 and the rankings for all the Japanese pages went up. This was about 3 years ago, but I think it might still apply.
My experience with UTF-8 for Japanese sites has been mostly problem-free.
The only problems I have had came when moving MySQL databases to different servers. Sometimes all hell broke loose, but I think it was from me not setting up the database import correctly for UTF-8 on the new server.
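That sort of breakage is typical of data being read with the wrong character set. A small Python sketch of the effect, not of the MySQL import itself; the sample string is arbitrary, and Latin-1 merely stands in for whatever default the new server assumed:

```python
original = "日本語のページ"
dump_bytes = original.encode("utf-8")   # what a UTF-8 export file would contain

# Reading UTF-8 bytes as Latin-1 never raises an error; it silently
# produces one garbage character per byte.
misread = dump_bytes.decode("latin-1")
print(repr(misread))

# If nothing else has touched the data, reversing the wrong step recovers it.
repaired = misread.encode("latin-1").decode("utf-8")
print(repaired == original)
```

Because the wrong decoding raises no error, the damage usually shows up only when the pages are displayed.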
I have more problems with email encoding than website encoding....
You're using MBstring, right?
All the problems I had came from using PHP functions that don't support multibyte encodings, like trim() for example.
Also, it can happen that a problem arises only on a certain Kanji which makes it very difficult to spot.
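The underlying issue with byte-oriented functions can be shown outside PHP as well. A minimal Python sketch, with hypothetical helper names, of cutting a string by bytes versus cutting it by characters:

```python
def truncate_bytes(text: str, limit: int, encoding: str) -> bytes:
    """Naive truncation: cut the encoded bytes, ignoring character boundaries."""
    return text.encode(encoding)[:limit]

def truncate_chars(text: str, limit: int, encoding: str) -> bytes:
    """Character-aware truncation: cut the text first, then encode."""
    return text[:limit].encode(encoding)

title = "日本語サイト"
for enc in ("utf-8", "shift_jis", "euc_jp"):
    cut = truncate_bytes(title, 4, enc)
    try:
        cut.decode(enc)
        status = "still valid"
    except UnicodeDecodeError:
        status = "cut in the middle of a character"
    # Show character count, byte count, and whether the 4-byte cut survived.
    print(enc, len(title), len(title.encode(enc)), status)
```

Whether a given cut breaks depends on the encoding and on where the character boundaries happen to fall, which is one reason such bugs appear only with certain strings.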
I don't get why all the PHP packages weren't provided with MBstring already included. There are too many shared environments that don't have MBstring.
Just out of curiosity: what makes it so difficult to use UTF-8 on emails?
I want to know what the best practice is for storing both ISO and JIS strings in a single MySQL column, from a website whose pages use either ISO or JIS encodings.
Should the mysql server be set to a specific charset?
Should I specify a charset when I store the data?
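One common approach, offered here as a suggestion rather than as the thread's answer, is to decode each submission using the encoding of the page it came from and to store everything in a single Unicode encoding such as UTF-8. A Python sketch of the idea, with hypothetical function names and no actual database involved:

```python
def to_storage(raw: bytes, page_encoding: str) -> bytes:
    """Decode form data from the page's encoding and return UTF-8 for storage."""
    return raw.decode(page_encoding).encode("utf-8")

def from_storage(stored: bytes, page_encoding: str) -> bytes:
    """Re-encode stored UTF-8 for a page served in page_encoding.

    Characters the target encoding lacks are replaced with '?'.
    """
    return stored.decode("utf-8").encode(page_encoding, errors="replace")

western = to_storage("café".encode("iso-8859-1"), "iso-8859-1")
japanese = to_storage("東京".encode("shift_jis"), "shift_jis")

# Both values now share one encoding and could live in the same column.
print(western.decode("utf-8"), japanese.decode("utf-8"))
# Round trip back to the original page encoding.
print(from_storage(japanese, "shift_jis") == "東京".encode("shift_jis"))
# Japanese text cannot be represented on an ISO-8859-1 page.
print(from_storage(japanese, "iso-8859-1"))
```

The design choice is to convert at the edges (on input and on output) so that the stored data never mixes encodings; this assumes the application knows which encoding each page uses.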
Thanks a lot for helping.