I fixed this at one level by converting the MySQL database to UTF-8. This fixed it insofar as my computer is now displaying those characters properly, but apparently some other computers are not. Some other computers are still showing weird characters.
Does anyone know what the problem might still be? Are some client machines missing the fonts needed to display those characters? Or are they perhaps not interpreting the text as UTF-8?
Well-formed html/xhtml is also critical. This means your file must be correct. Use the W3C's validators to make sure your file is valid. The validator will check the declared encoding and tell you if it is actually correct. Just declaring UTF-8 in the meta tags doesn't make it so; the file itself has to be written and saved as UTF-8.
Now, one problem is that PHP doesn't handle UTF-8 natively: its strings are plain sequences of bytes. So it is best to have it take the text from a file outside of the .php file and store that text in a variable.
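One way to confirm that a file's bytes really are UTF-8, whatever its meta tag claims, is to try decoding them strictly. Here is a minimal sketch in Python (used only as a neutral illustration, since the thread's stack is PHP; the function name `is_valid_utf8` is made up for this example):

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


# The same curly-quoted text, saved two different ways.
saved_as_cp1252 = "\u201cquoted\u201d".encode("cp1252")
saved_as_utf8 = "\u201cquoted\u201d".encode("utf-8")

# In windows-1252 the opening quote is the single byte 0x93,
# which is not a valid UTF-8 sequence on its own.
print(is_valid_utf8(saved_as_cp1252))  # False
print(is_valid_utf8(saved_as_utf8))    # True
```

Note that plain ASCII text passes this check too, so a file only fails it once it contains non-ASCII characters saved in another encoding.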
It's more work in the beginning, especially for simple things, but infinitely more reusable and maintainable, because you are not mixing PHP and HTML/XHTML code in the same file. The receiving user-agent (browser) never knows the difference. The html/xhtml sources can be .txt files encoded as UTF-8 whose contents are used by PHP to create that dynamic web page.
If you don't do these things, results will be unpredictable, and will depend on the "user-agent" (usually a web browser) that reads and renders the page.
Thanks for the input all the same!
<form action="script.php" method="post" accept-charset="UTF-8">
...
</form>
In IE in particular, this will ensure that windows-1252-specific characters such as curly quotes from Word are transmitted (and therefore added to your database) encoded as UTF-8.
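To see why the encoding used for submission matters, it may help to compare the bytes the same Word-style characters produce in each encoding. This is only an illustrative sketch in Python (the thread's stack is PHP), and the `word_chars` table is a hand-picked sample, not an exhaustive list:

```python
# A few characters Word commonly substitutes while typing.
word_chars = {
    "left double quote": "\u201c",
    "right single quote": "\u2019",
    "ellipsis": "\u2026",
    "en dash": "\u2013",
}

for name, ch in word_chars.items():
    cp1252_bytes = ch.encode("cp1252")  # one byte in the 0x80-0x9F range
    utf8_bytes = ch.encode("utf-8")     # three bytes starting with 0xE2
    print(f"{name}: cp1252={cp1252_bytes.hex()} utf-8={utf8_bytes.hex()}")
```

Each character is a single byte in windows-1252 but a three-byte sequence in UTF-8, so a database or page that expects one form cannot display the other correctly.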
If you are running a Linux or other Unix-based server, you can use iconv to convert your existing windows-1252 files and data to UTF-8. This, added to HTTP headers and meta elements specifying UTF-8 on every page (put the meta element before the title element!), should allow your application to run smoothly in UTF-8. As you are using PHP, you should at least use iconv instead of a home-grown solution to converting the data.
1. UTF8 meta tag in the head of each page
2. accept-charset="UTF-8" in each form
3. converted the dB to UTF8
...now it is supporting special characters and glyphs etc. from other languages, but I am still getting those empty boxes in place of common MS Word formatted characters such as apostrophes, quotes, the "..." character and the extended dash "-".
Any thoughts on those? This stuff is driving me batty! I thought I had it for sure this time. :(
At this point, however, my problem does not appear to be input: the data looks fine in the dB. The problem is that these characters are not being represented properly on the output page. I'm just getting stupid squares/boxes where apostrophes and quotation marks should be.
If the MS-Word curly quotes work and Firefox says windows-1252 or ISO-8859-1, then those quotes are not UTF-8 encoded.
How did you convert the database? Did you specify ISO-8859-1 to UTF-8 or windows-1252 to UTF-8?
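The source charset matters because ISO-8859-1 and windows-1252 disagree about bytes 0x80 to 0x9F, which is exactly where Word's curly quotes live. Here is a small Python sketch of the difference (Python is used only for illustration; the sample bytes in `raw` are made up):

```python
raw = b"\x93Hello\x94"  # curly-quoted text stored as windows-1252 bytes

# Read as ISO-8859-1, 0x93/0x94 become C1 control characters.
as_latin1 = raw.decode("iso-8859-1")
# Read as windows-1252, they become the intended curly quotes.
as_cp1252 = raw.decode("cp1252")

print([hex(ord(c)) for c in as_latin1 if ord(c) > 0x7F])  # ['0x93', '0x94']
print([hex(ord(c)) for c in as_cp1252 if ord(c) > 0x7F])  # ['0x201c', '0x201d']

# Both results encode to valid UTF-8, but only one holds real quotes.
print(as_latin1.encode("utf-8"))  # b'\xc2\x93Hello\xc2\x94'
print(as_cp1252.encode("utf-8"))  # b'\xe2\x80\x9cHello\xe2\x80\x9d'
```

A conversion declared as ISO-8859-1 to UTF-8 therefore succeeds without errors but leaves control characters in the data, which a browser may well draw as an empty box or not at all.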
I've been trying to get a CMS I built to support and properly show the many special characters that come from MS Word: specially formatted dashes, "..." characters, special apostrophes, etc. I thought I had it, but it's back with a vengeance.
I've done the following to date:
1. UTF8 meta tag in the head of each page
2. accept-charset="UTF-8" in each form
3. converted the dB to UTF8
.... now, I am seeing the proper values in the dB but funny little squares where those MS Word formatted characters should be.
Can someone please help?!?! Does anyone know what the heck to do?!?!?!?!?!?!?!?!?!?!
If I *REMOVE* the UTF8 declaration (meta tag on display page) then those MS Word characters show properly...but none of the glyphs or special characters work in that case. Conversely if I add back the UTF meta tag then just the opposite is true.
So it would appear I must choose between support of glyphs and support of MS formatted apostrophes, quotes, etc.
Is this correct? Am I missing something? Sigh.
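The either/or behaviour described here would be consistent with the stored data being a mix: some values already UTF-8, others still raw windows-1252 bytes. That is a guess from the symptoms, not something the thread confirms. If it is the case, one possible repair is to normalise each stored value to UTF-8. A rough Python sketch of the idea (the helper name `to_utf8` is invented, and Python is used only for illustration):

```python
def to_utf8(value: bytes) -> bytes:
    """Normalise one stored value to UTF-8.

    Assumption: a value that is not valid UTF-8 was written as windows-1252.
    """
    try:
        text = value.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes that windows-1252 leaves undefined become U+FFFD.
        text = value.decode("cp1252", errors="replace")
    return text.encode("utf-8")


print(to_utf8(b"it\x92s"))                  # windows-1252 apostrophe, converted
print(to_utf8("caf\u00e9".encode("utf-8")))  # already UTF-8, left unchanged
```

This works per value only. A single string that contains both encodings at once fails the UTF-8 check and is then decoded entirely as windows-1252, which mangles its UTF-8 parts, so such rows would need closer inspection.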