mysql varchar(n) in utf8 format

number of characters not accurate?

         

louponne

1:01 pm on Jun 28, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have transferred a database to UTF8 format and have just realized that for the varchar(n) fields, the (n) is no longer accurate. That is, if I set a varchar(50), the field will hold 50 "normal" characters, but if any of them are accented ones such as éç, the field won't "hold" 50 characters.

Is this a bug in mysql?

eelixduppy

3:24 pm on Jun 28, 2006 (gmt 0)



Took some time to find this but this may be the problem: [bugs.mysql.com...]

louponne

4:12 pm on Jun 28, 2006 (gmt 0)




Thanks, but actually that bug concerns a change that happens when a varchar() is declared with a large number of characters. In my case, it's only 50.

I did search the mysql site but didn't find anything that might shed some light on this strangeness.

?

eelixduppy

4:20 pm on Jun 28, 2006 (gmt 0)



"strangeness" indeed! I did however check this on my server and it works for me. This is why I thought it may be a setting issue. Best of luck!

louponne

7:17 am on Jun 29, 2006 (gmt 0)




For info, this was answered on another forum:


In versions before 4.1, the length spec is in bytes, not characters. Your "regular" characters are all one byte each, so the length works for both bytes and characters. Your "exotic" characters, though, take two bytes for each character, so they'll take up 2 length slots.

You will need to either upgrade your server to 4.1 or greater, or modify your length specs to allow for those multibyte characters.
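To make the byte-versus-character point concrete, here is a minimal Python sketch (not MySQL code). It assumes the column limit is counted in UTF-8 bytes, as the quoted answer describes for servers before 4.1. The function names are made up for illustration.

```python
def utf8_byte_length(text: str) -> int:
    """Number of bytes the text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fits_in_byte_limited_column(text: str, n: int) -> bool:
    """Hypothetical check: would the text fit in a varchar(n) whose n counts bytes?"""
    return utf8_byte_length(text) <= n


plain = "a" * 50          # 50 unaccented characters, 1 byte each in UTF-8
accented = "\u00e9" * 50  # 50 x "é", 2 bytes each in UTF-8

print(len(plain), utf8_byte_length(plain))        # 50 50
print(len(accented), utf8_byte_length(accented))  # 50 100

print(fits_in_byte_limited_column(plain, 50))     # True
print(fits_in_byte_limited_column(accented, 50))  # False
```

Under that assumption, a varchar(50) that must hold 50 accented characters would need to be declared as varchar(100) on the older server.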

le_gber

8:07 am on Jun 29, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



louponne, what was the reason for changing to UTF-8? (I assume you previously had latin-1 or something similar.)

I am working on a French site as well. I've stored the HTML entity in my fields for each special character (', é, ç, ...) and then use PHP's utf8_encode/utf8_decode when reading the data back from the DB (for URLs only).
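For readers unfamiliar with that approach, here is a rough Python sketch of the two steps it relies on: turning stored HTML entities back into characters, and converting between Latin-1 and UTF-8 bytes (which is what PHP's utf8_encode/utf8_decode do). The helper names and the sample string are illustrative only and are not taken from le_gber's site.

```python
import html


def latin1_to_utf8(data: bytes) -> bytes:
    """Rough equivalent of PHP utf8_encode: read bytes as ISO-8859-1, write UTF-8."""
    return data.decode("latin-1").encode("utf-8")


def utf8_to_latin1(data: bytes) -> bytes:
    """Rough equivalent of PHP utf8_decode; this version raises an error
    if a character has no Latin-1 form."""
    return data.decode("utf-8").encode("latin-1")


# Illustrative value as it might be stored with HTML entities.
stored = "caf&eacute; fran&ccedil;ais"
text = html.unescape(stored)        # entities become real characters

utf8_bytes = text.encode("utf-8")   # é and ç take two bytes each
latin1_bytes = utf8_to_latin1(utf8_bytes)

print(len(text), len(latin1_bytes), len(utf8_bytes))  # 13 13 15
```

The same text is 13 bytes in Latin-1 but 15 bytes in UTF-8, which is the size difference discussed earlier in the thread.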