I have pages that are served as ISO-8859-1. There's a form on those pages that sends us an email, and customers sometimes use Spanish characters in the form.
I found that if I send the form with a tag saying UTF-8, the Spanish characters turn to gibberish, or they disappear completely. If I send the form using ISO-8859-1, they appear fine.
My question is: I thought UTF-8 included ISO-8859-1, so if you need ISO-8859-1 you can just say UTF-8 and have more flexibility?
I had a similar problem on normal pages on our site, which are also encoded as ISO-8859-1. I tried switching them to UTF-8 and many characters (not just Spanish ones) went bad.
Obviously I'm missing something, but I'm not sure what it is.
"My question is: I thought UTF-8 included ISO-8859-1, so if you need ISO-8859-1 you can just say UTF-8 and have more flexibility?"
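Not quite. UTF-8 and ISO-8859-1 both include US-ASCII, and UTF-8 can represent every character that ISO-8859-1 can, but the two encode non-ASCII characters as different bytes: always a single byte in ISO-8859-1, and two to four bytes in UTF-8 (two bytes for the accented characters that ISO-8859-1 covers). So the character sets overlap, but the bytes are not directly interchangeable.
I wrote this a while back, which explains some of the differences: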
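To make the byte-level difference concrete, here is a minimal sketch. It uses Python purely for illustration (the site in question runs PHP, but the byte values are the same in any language), and the sample word "año" is just an example chosen here. It shows the same text in both encodings and what happens when each set of bytes is read with the wrong label:

```python
# The same text stored in each encoding.
text = "año"  # "ñ" is U+00F1, outside US-ASCII

latin1_bytes = text.encode("iso-8859-1")  # one byte per character
utf8_bytes = text.encode("utf-8")         # "ñ" takes two bytes

print(latin1_bytes)  # b'a\xf1o'
print(utf8_bytes)    # b'a\xc3\xb1o'

# Plain ASCII text is byte-for-byte identical in both encodings.
assert "abc".encode("iso-8859-1") == "abc".encode("utf-8")

# UTF-8 bytes read as ISO-8859-1: each byte becomes its own character (gibberish).
utf8_read_as_latin1 = utf8_bytes.decode("iso-8859-1")
print(utf8_read_as_latin1)  # aÃ±o

# ISO-8859-1 bytes read as UTF-8: 0xF1 is not a valid sequence on its own,
# so a decoder either substitutes U+FFFD or drops the character.
latin1_read_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")
latin1_dropped = latin1_bytes.decode("utf-8", errors="ignore")
print(latin1_read_as_utf8)  # a\ufffdo (shown as a replacement character)
print(latin1_dropped)       # ao
```

The last two results match the symptoms described in the question: characters that turn to gibberish, or that disappear completely.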
[webmasterworld.com...]
Can I go through our process, and you let me know if I understand it correctly?
1. Someone visits our contact-us.php page, which is ISO-8859-1.
2. They fill it out, using Alt-key codes for the Spanish characters.
3. They submit it to a PHP script.
4. The PHP script puts the data in an email, which is also ISO-8859-1.
5. If I tried to make the email UTF-8, it would break, because the original data the user entered came in as ISO-8859-1?
6. If I insisted on making the email UTF-8, I would have to make contact-us.php UTF-8 as well, and then nothing would break?
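The steps above can be sketched roughly as follows. This is an illustration in Python, not the actual PHP script: the function name `email_body` and the sample input are hypothetical, and it assumes the browser submits the form in the same charset as the page that contains it. It also shows a third option the list does not mention, which is converting the bytes explicitly before labelling the email as UTF-8 (in PHP, `mb_convert_encoding` does this kind of conversion).

```python
def email_body(form_bytes: bytes, page_charset: str, mail_charset: str) -> bytes:
    """Decode the submitted bytes using the page charset, then
    re-encode them in the charset the email will declare."""
    text = form_bytes.decode(page_charset)
    return text.encode(mail_charset)

name = "Señor Muñoz"  # hypothetical form input

# Steps 1-4: ISO-8859-1 page, ISO-8859-1 email. Bytes pass through unchanged.
submitted = name.encode("iso-8859-1")
same_charset = email_body(submitted, "iso-8859-1", "iso-8859-1")
assert same_charset == submitted

# Step 5: the same bytes merely labelled as UTF-8 do not decode correctly.
mislabelled = submitted.decode("utf-8", errors="replace")
assert mislabelled != name

# Alternative to step 6: keep the page as ISO-8859-1, but convert
# the bytes before sending a UTF-8 email.
converted = email_body(submitted, "iso-8859-1", "utf-8")
assert converted.decode("utf-8") == name

# Step 6: UTF-8 page and UTF-8 email. Again no conversion is needed.
submitted_utf8 = name.encode("utf-8")
all_utf8 = email_body(submitted_utf8, "utf-8", "utf-8")
assert all_utf8 == submitted_utf8
```

In each working case the charset declared for the email matches the bytes actually sent; the failure in step 5 comes from changing the label without changing the bytes.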