Special chars & UTF-8: sometimes ok, sometimes wrong

Text is taken from the database

         

guarriman

3:22 pm on Oct 18, 2007 (gmt 0)

10+ Year Member



Hi.

I'm working with PHP 4.4 and MySQL 4.1, and I have some text with special chars stored in a UTF-8 table.

I declare UTF-8 in my HTML, Apache is configured to serve UTF-8, and the PHP scripts are saved in the UTF-8 charset.

However, sometimes I get 'Espaņa' and other times 'Espa�a'. The difference? I press the F5 (Refresh) button in my web browser (I use Firefox and Internet Explorer).

Any similar experience?

guarriman

3:29 pm on Oct 18, 2007 (gmt 0)

10+ Year Member



This is the first time I have experienced this issue.

When I have had problems with special chars before, I used utf8_decode or utf8_encode, but I always try to store text data in the UTF-8 charset and serve it from UTF-8 PHP scripts.

However, this is a very odd issue, since it happens only with text taken from the database, not with text written in the scripts :(

PHP_Chimp

3:36 pm on Oct 18, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Is there <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> in the head of the document code?

I am guessing that it may well be an issue with the browser, as the browser's default encoding is usually set to ISO-8859-1. My theory is that when you hit F5 you get the page from the cache, decoded using the browser's default encoding rather than the Content-Type header that PHP supplied the first time around.
Have you tried changing the default encoding in your browser to see if it is a browser issue?

guarriman

3:51 pm on Oct 18, 2007 (gmt 0)

10+ Year Member



Hi PHP_Chimp. Thank you very much for your answer.

> Is there <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> in the head of the document code?
There is one, but it declares UTF-8:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

> I am guessing that it may well be an issue with the browser, as the browser's default encoding is usually set to ISO-8859-1. My theory is that when you hit F5 you get the page from the cache, decoded using the browser's default encoding rather than the Content-Type header that PHP supplied the first time around.
My web browser is in English and its encoding is set to "Unicode (UTF-8)".

And if your theory is right, why do all the other UTF-8 pages on the Internet work? Browsers would be changing encoding at random.

> Have you tried changing the default encoding on your browser to see if it is a browser issue?
I changed it to Western (Latin-1) and it still doesn't work :(

PHP_Chimp

4:12 pm on Oct 18, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



My theory was about the browser's cached versions of pages, not the original page served. If you didn't have a content type specified in the page itself, PHP would still add its own to the headers it sends. However, browsers don't seem to be very good at storing the original headers. So if no meta http-equiv content type is specified, a cached version may well be decoded using the browser's default encoding.
However, this isn't the problem in your case, as you have shown.

guarriman

4:13 pm on Oct 18, 2007 (gmt 0)

10+ Year Member



This fixes the issue:


mysql_query("SET CHARACTER SET utf8");
mysql_query("SET NAMES utf8");
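
A likely reason this works, assuming the connection was defaulting to latin1 (common for MySQL 4.1 setups, though not confirmed in this thread): SET NAMES utf8 tells the server that the client sends and expects UTF-8, so the UTF-8 column data is no longer converted to another charset on the way out. The sketch below needs no database. It only shows, at the byte level, why a Latin-1 'ñ' arriving in a page declared as UTF-8 ends up displayed as '�'. It uses current PHP syntax rather than PHP 4.4, and isValidUtf8 is a hypothetical helper name.

```php
<?php
// "España" as UTF-8 bytes: the "ñ" takes two bytes (0xC3 0xB1).
$utf8 = "Espa\xC3\xB1a";

// The same word as Latin-1 bytes: "ñ" is the single byte 0xF1.
// Assumption: this is what a latin1 connection hands back to PHP.
$latin1 = "Espa\xF1a";

// Hypothetical helper: preg_match() with the /u modifier returns
// false when the subject is not well-formed UTF-8.
function isValidUtf8(string $bytes): bool
{
    return preg_match('//u', $bytes) === 1;
}

echo bin2hex($utf8), "\n";    // 45737061c3b161
echo bin2hex($latin1), "\n";  // 45737061f161

echo isValidUtf8($utf8) ? "valid" : "invalid", "\n";    // valid
echo isValidUtf8($latin1) ? "valid" : "invalid", "\n";  // invalid
```

A browser decoding the second string as UTF-8 cannot interpret the lone 0xF1 byte, so it substitutes the replacement character. Running SHOW VARIABLES LIKE 'character_set%' on the connection before and after SET NAMES is one way to check whether this assumption holds for a given setup.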