

lucy24 - 2:49 am on Jul 29, 2011 (gmt 0)


As I said, the database is in latin1_swedish_ci and handles the characters well.

Is that a programming language? It sounds like a file encoding.

Just tried a program to convert the files to utf-8 and did not work,

What did it do? That is, in what way did it not work, and how could you tell?

How do I change it to utf-8 except changing the charset?

The "charset" declaration in an HTML file does not change anything about the file's contents. The only thing it does is tell the user's browser how to interpret any given byte or group of bytes.

You know of course that what travels across the Internet is not abcde... It is just a string of 01001100 et cetera. And then the device at the far end takes those 0's and 1's, puts them into sets of eight, and changes them into displayed characters. The same thing happens at the "near" end when you are typing your text. (Note also that it makes no difference what you did to make the character in the first place: "dead keys", alt + a string of letters, a Swedish keyboard, et cetera. That's another and completely unrelated issue.)
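To make the "sets of eight" step concrete, here is a minimal Python sketch. The helper names bits_to_bytes and bytes_to_bits are made up for this illustration, and it sticks to plain ASCII so that the choice of encoding doesn't matter yet.

```python
def bits_to_bytes(bits):
    """Group a string of 0s and 1s into sets of eight and return the bytes."""
    if len(bits) % 8 != 0:
        raise ValueError("need a multiple of eight bits")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def bytes_to_bits(data):
    """The reverse trip: what the "near" end sends out."""
    return "".join(f"{byte:08b}" for byte in data)


received = bits_to_bytes("01001100")  # the example bits from the post
print(received)                        # b'L'
print(received.decode("ascii"))        # L

# "abcde" leaves the near end as forty 0s and 1s.
print(bytes_to_bits("abcde".encode("ascii")))
```

Everything after the grouping step, that is, which character a given byte or group of bytes stands for, depends on the encoding the receiver assumes.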

when I write cañon guía, in the email I get caÃ±on guÃa.

Long version: ñ in UTF-8 is two bytes, C3 B1. But the device at the other end doesn't know this. If it thinks the text is supposed to be Latin-1, then your ñ becomes two separate characters, C3 and B1, giving you Ã±. (The first letter of the pair will always be Ã because this whole block starts with C3. Another block I'm very familiar with, the UCAS range, always turns into á followed by two random characters, because each letter is three bytes, starting with E1.)
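You can reproduce the garbling in a few lines of Python. This is only a sketch of the mechanism: the function name misread_as_latin1 is invented here, and U+1403 is just one syllabics letter picked as a sample.

```python
def misread_as_latin1(text):
    """Encode as UTF-8, then decode the same bytes as if they were Latin-1."""
    return text.encode("utf-8").decode("latin-1")


# n-with-tilde is U+00F1; in UTF-8 it becomes the two bytes C3 B1.
print("\u00f1".encode("utf-8").hex())  # c3b1

# The same two bytes read as Latin-1 are two characters: U+00C3, U+00B1.
mangled = misread_as_latin1("ca\u00f1on gu\u00eda")
print(mangled)

# A syllabics letter is three bytes starting with E1, so the misread
# version starts with U+00E1 (a-acute) and is three characters long.
syllabic = misread_as_latin1("\u1403")
print("\u1403".encode("utf-8").hex())  # e19083
print(len(syllabic), syllabic[0])
```

The í comes out as Ã followed by a soft hyphen (AD), which most displays don't show, so the second word tends to look like it simply lost a letter.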

So it does not work with the form, don't know why, and on top of that I have information that I get from a database that is in latin1_swedish_ci, so the ñ I get as a figure with a ? inside

That question mark is the Unicode "I can't deal with this" replacement character (U+FFFD). You get it when your original text was in Latin-1 (or another 1-byte encoding, including Mac Roman) but is being read as UTF-8: the single byte that means ñ in Latin-1 (F1) is not a valid sequence on its own in UTF-8. The same goes for letters like ä ö and å that must crop up sooner or later ;)
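The opposite mistake is just as easy to reproduce. A small Python sketch, with variable names chosen only for this example: Latin-1 bytes handed to a UTF-8 decoder that has been told to substitute rather than give up.

```python
# "canon" with n-tilde, stored the Latin-1 way: one byte (F1) for the letter.
latin1_bytes = "ca\u00f1on".encode("latin-1")
print(latin1_bytes)  # b'ca\xf1on'

# A lenient UTF-8 decoder swaps the stray byte for U+FFFD, the
# replacement character that browsers draw as a "?" in a diamond.
shown = latin1_bytes.decode("utf-8", errors="replace")
print(shown)

# A strict decoder refuses outright instead of substituting.
try:
    latin1_bytes.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False
print(strict_ok)  # False
```

Once the substitution has happened the original letter is gone from that copy of the text, so the fix has to be made at the source, not on the garbled output.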

Is she ever going to get to the point?

Uhm. The point is that if the HTML "charset" declaration says UTF-8, then you have to make your original raw text into UTF-8. Exactly how you do this will depend on your text editor or html editor. There might be a popup or menu item somewhere, or something you change in the Preferences. In extreme cases you might even have to read the manual.
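If your editor really has no such option, a script can do the same job. What follows is a rough Python sketch, not the one true method: convert_to_utf8 is a made-up helper, and it assumes you already know the file is Latin-1 right now. Run it on a file that is already UTF-8 and you will get the Ã± garbling baked in, so keep a backup.

```python
import tempfile
from pathlib import Path


def convert_to_utf8(path, source_encoding="latin-1"):
    """Rewrite the file at `path` in place as UTF-8.

    `source_encoding` is an assumption about what the file is right now;
    a wrong guess silently produces garbage.
    """
    file_path = Path(path)
    text = file_path.read_bytes().decode(source_encoding)
    file_path.write_bytes(text.encode("utf-8"))


# Demo on a throwaway file that starts out as Latin-1.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_bytes("ca\u00f1on gu\u00eda".encode("latin-1"))
    before = sample.read_bytes()
    convert_to_utf8(sample)
    after = sample.read_bytes()

print(before)  # b'ca\xf1on gu\xeda'
print(after)   # b'ca\xc3\xb1on gu\xc3\xada'
```

The text is the same before and after; only the bytes used to store the two accented letters change, which is exactly what the "charset" declaration by itself never does.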


Thread source: http://www.webmasterworld.com/webmaster/4344217.htm