Is there a better approach to this? I'm stuck and would appreciate some guidance. Thanks! :)
There are some really great tutorials out there about handling extended characters. It's a complicated subject... It'll take a bit of reading, but you should learn about UTF-8: how it works, and how to design a system that uses it effectively. Do some searches for "UTF-8" and learn the basics.
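To get a feel for what UTF-8 actually does, here is a rough sketch in Python (used purely for illustration; the same idea applies in PHP). It takes one Chinese character and shows the bytes UTF-8 produces for it. The variable names are made up for the example.

```python
# One Chinese character, code point U+8BF7.
text = "请"

# Encoding maps the code point to bytes; UTF-8 uses three bytes for this one.
encoded = text.encode("utf-8")
print(ord(text))                # the code point as an integer
print(encoded)                  # the raw bytes
print(len(text), len(encoded))  # one character, three bytes

# Decoding with the same encoding gives the original string back.
decoded = encoded.decode("utf-8")

# ASCII characters stay one byte each, so plain English text is unchanged.
ascii_encoded = "abc".encode("utf-8")
```

The thing to take away is that "number of characters" and "number of bytes" are different, which matters for column sizes and string functions.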
Then once you have absorbed what UTF-8 is, there is an entire chapter devoted to using it effectively in the O'Reilly book "Building Scalable Web Sites" by Cal Henderson.
A rule of thumb is: don't store escaped stuff in your database. Unless you're deliberately "denormalizing" or "pre-processing" data for performance reasons, the data should be stored in its rawest, unencoded, unescaped, nakedest form possible. That means that when you look in the database, you should see Chinese characters, not &#35831; stuff.
Then when you're preparing/rendering data for output, that's when you do htmlencoding, escaping, etc., as required. For instance, if your data is being output in XML, there's a lot of encoding and escaping that needs to be done.
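As a sketch of "store raw, escape on the way out", something like the following would work. It's in Python for illustration; in PHP the rough equivalent of `html.escape` is `htmlspecialchars()`. The sample value and the `render_html` / `render_xml` helpers are hypothetical, not part of any framework.

```python
import html
from xml.sax.saxutils import escape as xml_escape

# The raw value, exactly as it would sit in the database.
stored = '请 <b>Tom & "Jerry"</b>'

def render_html(value: str) -> str:
    """Escape for an HTML context only at output time."""
    return "<p>" + html.escape(value) + "</p>"

def render_xml(value: str) -> str:
    """Escape for XML element content only at output time."""
    return "<name>" + xml_escape(value) + "</name>"

print(render_html(stored))
print(render_xml(stored))
```

The same stored value gets a different escaping for each output format, which is the reason not to bake any one of them into the database. Note that the Chinese character passes through untouched as long as the page itself is served as UTF-8.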
Getting user-entered data into the database "raw" is tricky, and it's where built-in methods like PHP's mysql_real_escape_string() come in handy.
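Here is a minimal sketch of getting awkward user input into a database unmodified. Note that it swaps in a different technique from the one named above: instead of escaping the string by hand, it uses a parameterized query, which lets the driver handle quoting. It's written in Python against an in-memory SQLite database just so it runs on its own; the table and column names are invented for the example.

```python
import sqlite3

# In-memory database so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)")

# Input with a Chinese character, a quote, and something that looks like SQL.
user_input = "请 O'Reilly; DROP TABLE comments; --"

# The ? placeholder passes the value separately from the SQL text,
# so nothing in the input is interpreted as SQL and nothing is altered.
conn.execute("INSERT INTO comments (body) VALUES (?)", (user_input,))
conn.commit()

# What comes back out is the raw string that went in.
stored_body = conn.execute("SELECT body FROM comments").fetchone()[0]
row_count = conn.execute("SELECT COUNT(*) FROM comments").fetchone()[0]
print(stored_body == user_input, row_count)
```

Either way, the point is the same: the escaping is only there to get the value safely through the SQL statement, and what ends up stored is the raw text.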