Forum Moderators: coopster

Message Too Old, No Replies

Problems with Spanish Characters in MySQL

When inserting spanish characters "á, ñ.." it does not insert right

         

percy05

10:19 pm on Feb 1, 2006 (gmt 0)

10+ Year Member



Hello friends, here is a little problem:
When I insert spanish characters "ñ, á...") in a MySQL table through command line it does it correctly, but when I insert them through a HTML-Php form it does not insert correctly. I have read that the solution could be in CHARACTER SET, but i did all what the Dev.MySQL online manual says but the problem still continues. Here is useful information:

Server: Apache 2.0.53
MySQL 4.1
PHP 4.3.10
OS Win98
IE 6.0

The scripts are very simple and short and the runs in localhost, one simple pc:

In my.ini the defaul-character set was: latin1

1) I created a table in MySQL via command line:

mysql> CREATE TABLE gato2 (nombre VARCHAR(20), apodo(20)) CHARACTER SET utf8 COLLATE utf8_spanish_ci;

Supossely with this I am putting character set "utf8" -the best and universal character set- to all the table, independently of the default character set that appears in my.ini

When I insert spanish characters "ñ, á...") in the table through command line it does it correctly, but when I insert them through a HTML-Php form it does not insert correctly. For example: "Raúl Año", will store as "Ra·l A±o".

2) The HTML Form script "form1.htm" is:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<HTML>
<HEAD>
<TITLE><Form1</TITLE>
<META http-equiv="Content-Type" content="text/html; charset=utf-8">
</HEAD>
<BODY>
<form action="http://localhost/reg2.php" method="post">
Nombre: <input type="text" name="nombre" size="30"></br>
Apellido Paterno: <input type="text" name="apodo" size="30"></br>
<input type="submit" value="Enviar!" size="30">
</form>
</BODY>
</HTML>

3) And the script of PHP File "reg2.php" is:

<?php
header("Content-Type: text/html; charset=utf-8");
//Aqui se conesta con el servidor y la base de datos
include 'cox.php';
include 'bd_meusers.php';
//Aqui se define las variables del formulario que pasare a la BD con este .Php
$nombre=$_POST['nombre'];
$apodo=$_POST['apodo'];
//Y aqui introduzco estos datos en la base de datos
$sql = mysql_query("INSERT INTO gato2 (nombre, apodo) VALUES('$nombre', '$apodo')") or die(mysql_error());
?>

--------------

That is all! It is rare that it does not work if the scripts are simple and I am following the steps of the MySQL nd forum steps. Finally I changed in my.ini the default-character-set from latin1 to utf8:

default-character-set=utf8

but the problem does not dissapear: MySQL does not recognize spanish letter "ñ" and accents á, é...

What is the problem, with the scripts or configuration please? I think the problem is in PHP, because I can insert correctly those special characters when I do with command lines... I have tried and read a lot and also your: [webmasterworld.com...]
and ask many people, but there is no solution by now. It seems it is a common problem in Spanish programmation... Can anyone help with this please?

Thank you very much!

Percy

StupidScript

12:24 am on Feb 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi Percy,

Depending on how the data is ultimately used, you may be able to get by with:

$nombre=htmlentities($_POST['nombre']);

$apodo=htmlentities($_POST['apodo']);

This is useful if, for example, you are storing user names and such in a db then using that data to display things on a web page, like the name stored in the db.

percy05

6:15 am on Feb 2, 2006 (gmt 0)

10+ Year Member



Hello and thanks, I will translate the comments of the PHP file that are written in Spanish (although they don't alter the sense of this simple script), maybe it helps to understand better the problem:

********
3) And the script of PHP File "reg2.php" is:

<?php
header("Content-Type: text/html; charset=utf-8");
//Here database and server are connected
include 'cox.php';
include 'bd_meusers.php';
//Here I will define form variables that will be inserted to MySQL database-table through this PHP file
$nombre=$_POST['nombre'];
$apodo=$_POST['apodo'];
//And here I insert this data in the database
$sql = mysql_query("INSERT INTO gato2 (nombre, apodo) VALUES('$nombre', '$apodo')") or die(mysql_error());
?>

*******
I hope this incompatibility of Spanish special characters may have solution, thanks a lot!
*******

Hi StupidScript and thank you! I have done what you suggested: write HTMLENTITIES in the $_POST:

$nombre=htmlentities($_POST['nombre']);
$apodo=htmlentities($_POST['apodo']);

but the result in the MySQL table is still incorrect:
When I insert "Raúl Año" in the html form these are the results shown in the MySQL table:

PHP without HTMLENTITIES: "Ra·l A±o"
PHP without HTMLENTITIES: "Ra&Atilde;ordm;l A&Atilde;&plusmn;o"

Thank you very much! Hope to get your comment!

Percy

Dijkgraaf

6:34 am on Feb 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, they will be encoded like that, but try writing them directly out to an HTML page, and they will look correct, which is why StupidScript recomended it.

tomda

6:46 am on Feb 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yeah; Stupidscript is right...
You should use htmlentities, datas are stored in the database this way. In your HTML page, either you'll leave like this and it should properly show the character (depending of the charset you are using - see below), either you use html_entity_decode() to decode the stored datas.

Have some reading
[us2.php.net...]

Note that

htmlentities ( string string [, int quote_style [, string charset]] )
it takes an optional third argument charset which defines character set used in conversion. [...] Presently, the ISO-8859-1 character set is used as the default.

percy05

7:13 am on Feb 2, 2006 (gmt 0)

10+ Year Member



Thank you Dijkgraaf and Omda, you mean with HTMLENTITIES the data inserted in a form will display correctly it in html files, in despite of not showing correctly it in a MySQL database. But please, what happens if I need that the data inserted in the table muss be shown like it was inserted, with its own special characters? Names, addresses, etc... Is it possible to store that way in MySQL?
Thank you very much!
Percy

tomda

7:20 am on Feb 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I need that the data inserted in the table muss be shown like it was inserted
NO NO! Don't reinvent the wheel...

What if a user type the following in your form

my'name"
Then, this will definitely mess up your SQL query unless you change the ' and the " (which are special characters)...

htmlentities() is a must! Use it...
If you want to do without it, then you'll definitely have problems (safety, special characters, etc...)

Yeah, data will be storeb with special characters encrypted.
FORM -> CHECK DATA -> ENCRYPT DATA -> INSERT SQL
RETRIEVE SQL -> DECRYPT DATA -> SHOW DATA

percy05

9:36 am on Feb 2, 2006 (gmt 0)

10+ Year Member



Thank you Tomda, you are right and I agree now with the use of Htmlentities, but please, is there not definitively any way to have a MySQL table or database with spanish characters, as they were inserted? I ask it because it is necessary to have accents, letters "ñ" and so in the databases, to read them correctly because if not, it will be very hard to read and understand the data inserted by the users in the databases...
You say so:
RETRIEVE SQL -> DECRYPT DATA -> SHOW DATA
That's good. Please how can I do to show them right? (spanish accent letters and "ñ" "Ñ")
Thank you very much!
Percy

StupidScript

5:57 pm on Feb 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi Percy,

It looks like you don't want to use an HTML page to view the contents of the database.

You want some users to be able to view the data directly IN the database, and that's why you are worried about the entity encoding, right?

This is a tough one, because the special characters you are talking about are, indeed, special.

In order for any type of database (except a text file) to handle special characters, they must be encoded in some fashion, either as a unicode string or as something else. When you are using a program like, say, Microsoft Access, it hides the encoding from you, and does the decoding 'on-the-fly' when you are viewing data from within the program. If you were able to see the actual data, as you are with MySQL and others, you would see the encoding instead of the user-readable decoded version the program show you by default.

I have two comments and one suggestion.

First comment, from a security and data protection standpoint, it is a good idea to have very few users actually manipulating the database directly. Within the database there are very few checks for corrupted data or opportunities to keep big mistakes from happening, like deleting too much data or even an entire table by accident.

Second comment, it is not very efficient to work directly with the table data when it is not hard to create an interface to it.

My suggestion is to go ahead and use

htmlentities()
and (if you're running PHP 5+)
html_entity_decode()
to manage ALL user input, regardless of whether it might contain special characters. This will help protect your database from bad data, and handle the proper encoding for your application.

Build an interface to the database using PHP, and set up a user and group permissions system to make sure that only the users you want are manipulating the data. In this interface, the PHP will be handling the encoding and decoding, so the safely encoded data in the database will not be a problem for you or your users.

It's more work, but it's the right thing to do.

percy05

8:02 pm on Feb 2, 2006 (gmt 0)

10+ Year Member



Thanks a lot StupidScript, very interesting and useful what you say... Please excuse me, I am a newbie, do you know how to use decoding, or, please, some little steps? - Another question, please: is it possible to export data with special characters to from a table to a .txt file with "Tee commamd"? Of course it works with english characters, but with Spanish ones? For exameple, to make an e-marketing campaign, etc...
Thank you very much
Percy

StupidScript

9:34 pm on Feb 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi Percy,

Here is the basic idea for encoding and decoding. I'm using separate pages for each step to keep the explanation simple, but I would usually do everything on one page using conditional statements.

1: The page with the form for entering data
This is a normal web page with a normal FORM element. The FORM's 'action' points to the processing page.

<form action='processform.php' method=post>

2: The form processing and database entry page
This page (a) checks for form data coming in under the POST method, (b) encodes the form data and (c) enters the encoded data into the database.

if (is_set($_POST['nombre'])) {

$nombre=htmlentities($_POST['nombre']);

$dbIn=mysql_query("insert into db values ... etc.

You probably want to redirect the user to a confirmation page or something after a successful db dump.

3: The database interface page.
First, establish a login function using any method you like. That task is more than I want to include in this example, but it's not hard and many threads here can help you.

Second, get the data from the database and decode it for use in an email message. It is not necessary to decode the data if you are going to simply echo it in HTML page output, because the encoding works just fine in a web browser.

$dbOut=mysql_query("select nombre from db where ...");

while($row=mysql_fetch_array($dbOut)) {

$nombre=html_entity_decode($row["nombre"]);
etc.

Then comes the mail function ... also found in many threads in this forum.

I hope that helps you understand the process:

1: Data posted
2: Data encoded and entered into db
3: Data retrieved from db and decoded

percy05

1:34 pm on Feb 3, 2006 (gmt 0)

10+ Year Member



Hi StupidScript
Thank you very much, I will do what you say. But I have a doubt please: So, what is the real function of CHARACTER SET and ALLOCATE utf8, latin1, etc, if they don't work when storing special characters as spanish accents, letters ñ and so in a DAtabase or Table?
Thank you very much
Percy

StupidScript

6:05 pm on Feb 3, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The PHP language options are for PHP. You are integrating with another program ... MySQL ... that has its own way of dealing with character sets. As we have been talking about, we use the PHP methods for the PHP stuff and try to work with MySQL in a way that PHP understands. After all, PHP is doing the reading and writing, and MySQL is simply the storage place.

percy05

6:46 pm on Feb 4, 2006 (gmt 0)

10+ Year Member



Thank you very much StupidScript!
So, I muss understand that I will never see the spanish special characters like accents, "Ñ", etc in my MySQL database if they were inserted through a Form/Feedback, right? - And what about if it is necessary to export, set outpout, them with "TEE" command like "tee mysql.out" or "tee log.txt" or another file format? Because HTMLENTITIES and decoding are just for PHP/HTML... You know sometimes names ann addresses are needed to marketing, send paper letters, etc...
Thanks a lot again,
Percy

dmmh

7:25 pm on Feb 4, 2006 (gmt 0)

10+ Year Member



then you just write another script to decode the values first and then output the decoded stream to a file ......

percy05

8:19 pm on Feb 4, 2006 (gmt 0)

10+ Year Member



Thanks Dmmh!
Excuse me, but please do you know how to do it?
Thanks a lot
Percy