Charsets and Spanish characters

         

Mike521

3:33 pm on Jul 15, 2008 (gmt 0)

10+ Year Member



I'm hoping someone can point me to a good resource about charsets and displaying Spanish characters. Here's my problem:

I have pages that display using ISO-8859-1. There's a form on those pages that sends us an email, and customers sometimes use Spanish characters in the form.

I found that if I send the form with a tag saying UTF-8, the Spanish characters turn to gibberish, or they disappear completely. If I send the form using ISO-8859-1 they appear fine.

My question is, I thought UTF-8 included ISO-8859-1, so if you need ISO-8859-1 you can just say UTF-8 and have more flexibility?

I had a similar problem on normal pages on our site, which are also encoded as ISO-8859-1. I tried switching them to UTF-8 and many characters (not just Spanish ones) went bad.

Obviously I'm missing something, but I'm not sure what it is.

encyclo

4:52 pm on Jul 15, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



My question is, I thought UTF-8 included ISO-8859-1, so if you need ISO-8859-1 you can just say UTF-8 and have more flexibility?

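Not quite. UTF-8 and ISO-8859-1 both encode US-ASCII identically, but they encode extended (non-ASCII) characters differently: ISO-8859-1 uses a single byte per character, while UTF-8 uses two or more bytes (two bytes for each of the non-ASCII characters that ISO-8859-1 covers). UTF-8 can represent every ISO-8859-1 character, but the bytes are different, so the two are not directly interchangeable.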
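
As a rough illustration of that byte-level difference, here is a small sketch. It is written in Python rather than PHP purely for demonstration, and the variable names are made up for the example. It encodes one Spanish character both ways, then decodes each result with the wrong charset to show where the "gibberish" and the "disappearing" characters can come from.

```python
# Illustration only: how one Spanish character is stored in each encoding.
text = "ñ"  # U+00F1

latin1_bytes = text.encode("iso-8859-1")  # a single byte
utf8_bytes = text.encode("utf-8")         # two bytes

# Plain ASCII characters are byte-for-byte identical in both encodings.
ascii_same = "n".encode("iso-8859-1") == "n".encode("utf-8")

# UTF-8 bytes read as ISO-8859-1: each byte becomes its own character,
# which is the classic gibberish.
mojibake = utf8_bytes.decode("iso-8859-1")

# ISO-8859-1 bytes read as UTF-8: the lone byte is not valid UTF-8.
# A lenient decoder either substitutes U+FFFD or drops the byte entirely.
replaced = latin1_bytes.decode("utf-8", errors="replace")
dropped = latin1_bytes.decode("utf-8", errors="ignore")

print(latin1_bytes, utf8_bytes, ascii_same)
print(repr(mojibake), repr(replaced), repr(dropped))
```

Which of the two failure modes you see in practice depends on how the receiving software handles invalid bytes, so treat the `errors=` choices here as stand-ins for that behaviour.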

I wrote this a while back which explains some of the differences:

[webmasterworld.com...]

Mike521

5:13 pm on Jul 15, 2008 (gmt 0)

10+ Year Member



Thanks, encyclo, that other post you wrote was really helpful.

Can I go through our process and you let me know if I understand it correctly?

1. Someone visits our contact-us.php page, which is ISO-8859-1.
2. They fill it out using ALT key codes for the Spanish characters.
3. They submit it to a PHP script.
4. The PHP script puts the data in an email, which is also ISO-8859-1.
5. If I tried to make the email UTF-8, would it break because the original data the user entered came in as ISO-8859-1?
6. If I insisted on making the email UTF-8, would I have to make contact-us.php UTF-8 as well, and then nothing would break?
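
Steps 4 to 6 can be sketched roughly as follows. This is a Python illustration, not the actual PHP script: the function name `build_contact_email` and the subject line are hypothetical, and it assumes the browser submits the form fields in the same encoding as the page. The point is only that the charset declared on the email has to match the bytes actually placed in it, so the script either keeps the page's charset or decodes and re-encodes the text.

```python
from email.message import EmailMessage


def build_contact_email(form_bytes: bytes, page_charset: str,
                        email_charset: str) -> EmailMessage:
    """Hypothetical helper: turn submitted form bytes into an email."""
    # Decode using the charset the form page was served in
    # (assumption: the browser submitted the fields in that charset).
    text = form_bytes.decode(page_charset)

    msg = EmailMessage()
    msg["Subject"] = "Contact form"  # placeholder subject
    # set_content encodes the text as email_charset and declares that
    # same charset in the Content-Type header, so the two always agree.
    msg.set_content(text, charset=email_charset)
    return msg


# Bytes as they would arrive from an ISO-8859-1 page.
submitted = "señor".encode("iso-8859-1")

same_charset = build_contact_email(submitted, "iso-8859-1", "iso-8859-1")
converted = build_contact_email(submitted, "iso-8859-1", "utf-8")

print(same_charset["Content-Type"])
print(converted["Content-Type"])
```

Both messages are consistent. What breaks is the shortcut of leaving the ISO-8859-1 bytes untouched and only changing the declared charset to UTF-8.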

encyclo

10:19 am on Jul 16, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Your process looks fine to me. If you're dealing exclusively with Spanish or other western European languages, then you can simply stick with ISO-8859-1. If you want to move to UTF-8, you need to convert the pages with a tool such as iconv or similar, in particular the page with the form. Bear in mind the caveats of PHP's handling of UTF-8: you can use UTF-8 within the content, but it's not safe to use anything other than ASCII within the code itself (i.e. variables and such).
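
For reference, the conversion step amounts to decoding each file's bytes as ISO-8859-1 and writing them back out as UTF-8, which is roughly what `iconv -f ISO-8859-1 -t UTF-8` does. Below is a minimal Python sketch of the same idea. The function name `convert_file` is made up for illustration, and it assumes every byte in the source file really is ISO-8859-1.

```python
from pathlib import Path


def convert_file(src: Path, dst: Path,
                 from_charset: str = "iso-8859-1",
                 to_charset: str = "utf-8") -> None:
    """Re-encode a text file from one charset to another (illustrative)."""
    # Read raw bytes and decode them with the charset the file is
    # currently stored in.
    text = src.read_bytes().decode(from_charset)
    # Write the same characters back out in the target charset.
    dst.write_bytes(text.encode(to_charset))
```

Converting the bytes is only half the job: any charset declared in the page itself or in the HTTP headers has to be changed to UTF-8 at the same time, otherwise the mismatch simply moves to the other side.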

Mike521

1:21 pm on Jul 16, 2008 (gmt 0)

10+ Year Member



I gotcha, thanks again! I'll just stick with ISO-8859-1 for now, since the whole site is that way already.

Thanks for all the help.