PHP file conversion from ANSI to UTF-8

I have a big problem in IE after doing the conversion

asantos

12:30 am on Apr 28, 2006 (gmt 0)

10+ Year Member



I just converted all my website files from ANSI to UTF-8 (I'm developing on Windows). Everything renders OK in Firefox, but I have a small problem in IE.

This is the code example:

************
<?php
//index.php:
echo 'my name is ';
include('inc.php');
?>
************

************
<?php
//inc.php:
echo 'andres';
?>
************

I should get:
my name is andres

But in IE I get this:
my name is []&#65279;andres

The brackets [] aren't brackets per se; they are a strange square character that gets inserted on each include (only in IE). Since I've got some includes before the doctype, my whole layout gets screwed up. How can I fix this? I'm running Apache 2.0.55 with PHP 4.4.0 on Windows XP.

Thanks.

asantos

12:31 am on Apr 28, 2006 (gmt 0)

10+ Year Member



BTW: The &#65279; shows up as a square character in Windows.

encyclo

12:58 am on Apr 28, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



&#65279; is a zero-width non-breaking space (0xFEFF in hex). When you say you converted from ANSI to UTF-8, how exactly did you convert the files? Did you "Save as Unicode" in Notepad or something similar?

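If so, your files may actually be in UTF-16 rather than UTF-8, and in any case the file contents are preceded by a Byte-Order Mark (BOM), which is causing all the trouble. PHP sends anything that comes before the opening tag straight to the browser, so every included file that starts with a BOM emits one into the page.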
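
To check whether a file really starts with a BOM, you can look at its first bytes: a UTF-8 BOM is EF BB BF, while UTF-16 starts with FF FE (little-endian) or FE FF (big-endian). Here is a minimal sketch in PHP, assuming PHP 7 or later (it uses type declarations that PHP 4 does not support). The name detect_bom is a hypothetical helper, not a built-in function:

```php
<?php
// Hypothetical helper (not a PHP built-in): name the BOM found at the start
// of $bytes, or return null when there is none.
// Only the UTF-8 and UTF-16 marks are checked here.
function detect_bom(string $bytes): ?string
{
    if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
        return 'UTF-8';
    }
    if (strncmp($bytes, "\xFF\xFE", 2) === 0) {
        return 'UTF-16LE';
    }
    if (strncmp($bytes, "\xFE\xFF", 2) === 0) {
        return 'UTF-16BE';
    }
    return null;
}

// Simulated file contents: a UTF-8 BOM followed by ordinary text.
echo detect_bom("\xEF\xBB\xBFandres") ?? 'none', "\n"; // prints "UTF-8"
echo detect_bom("andres") ?? 'none', "\n";             // prints "none"
```

To test a real file, pass in its first few bytes, for example detect_bom(file_get_contents('inc.php', false, null, 0, 3)).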

Here are a few recent threads about the BOM:

[webmasterworld.com...]
[webmasterworld.com...]
[webmasterworld.com...]

I'm not very familiar with Windows tools for authoring UTF-8, but most web-orientated text editors (not Notepad) handle UTF-8 properly. Removing the BOM can be difficult because the characters are zero-width and thus invisible; you may need to resort to a hex editor. If you have access to a Linux or *nix machine, the iconv utility usually handles conversion between different character encodings flawlessly.
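
If no BOM-aware editor is at hand, the BOM can also be removed with a few lines of script instead of a hex editor. Below is a rough sketch in PHP, assuming the files are already UTF-8 and only carry the three-byte signature EF BB BF, and assuming PHP 7 or later. The function names strip_utf8_bom and strip_bom_from_file are made up for this example:

```php
<?php
// Hypothetical helper: return $bytes without a leading UTF-8 BOM (EF BB BF).
// Input that does not start with the BOM is returned unchanged.
function strip_utf8_bom(string $bytes): string
{
    if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
        // The cast covers older PHP versions, where substr() could return false.
        return (string) substr($bytes, 3);
    }
    return $bytes;
}

// Hypothetical helper: rewrite the file at $path without its UTF-8 BOM.
// Returns true if a BOM was removed, false if the file had none.
function strip_bom_from_file(string $path): bool
{
    $original = file_get_contents($path);
    if ($original === false) {
        throw new RuntimeException("Cannot read $path");
    }

    $cleaned = strip_utf8_bom($original);
    if ($cleaned === $original) {
        return false; // nothing to do, leave the file untouched
    }

    if (file_put_contents($path, $cleaned) === false) {
        throw new RuntimeException("Cannot write $path");
    }
    return true;
}
```

For example, foreach (glob('*.php') as $file) { strip_bom_from_file($file); } would clean every PHP file in the current directory. Back up the files first, since they are overwritten in place.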

asantos

1:17 am on Apr 28, 2006 (gmt 0)

10+ Year Member



Actually, I used Notepad2 to convert the files. It's an all-around tool that lets you convert file encoding from ANSI to UTF-8 or to UTF-8 with signature.

Do you know of any other Windows tool that converts to UTF-8 without adding that buggy character (the BOM)?

asantos

1:27 am on Apr 28, 2006 (gmt 0)

10+ Year Member



I'll use TextPad as recommended. Thanks for the tip.