I have wondered about the same thing and have been meaning to experiment with the XML functions native to PHP. Have you tried just typing whatever text you want without entities, but in ISO-8859-1, and then running it through utf8_encode()? Here's the man page:
utf8_encode
(PHP 3 >= 3.0.6, PHP 4 >= 4.0.0)
utf8_encode -- encodes an ISO-8859-1 string to UTF-8
Description
string utf8_encode ( string data)
This function encodes the string data to UTF-8, and returns the encoded version. UTF-8 is a standard mechanism used by Unicode for encoding wide character values into a byte stream. UTF-8 is transparent to plain ASCII characters, is self-synchronized (meaning it is possible for a program to figure out where in the bytestream characters start) and can be used with normal string comparison functions for sorting and such. PHP encodes UTF-8 characters in up to four bytes, like this:
Table 1. UTF-8 encoding
bytes  bits  representation
1      7     0bbbbbbb
2      11    110bbbbb 10bbbbbb
3      16    1110bbbb 10bbbbbb 10bbbbbb
4      21    11110bbb 10bbbbbb 10bbbbbb 10bbbbbb
Each b represents a bit that can be used to store character data.
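To make the table concrete, here is a rough sketch of the bit-packing in plain PHP. The function names encode_code_point() and latin1_to_utf8() are made up for illustration and are not part of PHP. The sketch assumes PHP 7 or later for the type declarations, and it does not reject invalid code points such as surrogates. Since every ISO-8859-1 byte value is also its Unicode code point, only the first two rows of the table are ever needed for the Latin-1 case, which is presumably all utf8_encode() has to handle.

```php
<?php
// Hypothetical helper: pack one Unicode code point into UTF-8 bytes,
// following the rows of the table above.
function encode_code_point(int $cp): string
{
    if ($cp < 0x80) {                       // 7 bits:  0bbbbbbb
        return chr($cp);
    }
    if ($cp < 0x800) {                      // 11 bits: 110bbbbb 10bbbbbb
        return chr(0xC0 | ($cp >> 6))
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {                    // 16 bits: 1110bbbb 10bbbbbb 10bbbbbb
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    // 21 bits: 11110bbb 10bbbbbb 10bbbbbb 10bbbbbb
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

// Hypothetical helper: convert an ISO-8859-1 string to UTF-8 byte by byte.
// Each Latin-1 byte value equals its Unicode code point.
function latin1_to_utf8(string $latin1): string
{
    $out = '';
    for ($i = 0, $n = strlen($latin1); $i < $n; $i++) {
        $out .= encode_code_point(ord($latin1[$i]));
    }
    return $out;
}

// "\xE9" is the single ISO-8859-1 byte for "é".
echo bin2hex(latin1_to_utf8("caf\xE9")), "\n"; // 636166c3a9
```

The plain ASCII letters pass through unchanged (63 61 66), while the one byte E9 becomes the two-byte sequence C3 A9. On a real page you would just call utf8_encode() on the string; note that it is deprecated as of PHP 8.2, which is why the sketch avoids calling it directly.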