Difference between utf8_general_ci and utf8_bin

10:42 am on Aug 18, 2007 (gmt 0)

5+ Year Member

Hi there,

Does anyone know the difference between these two MySQL table collations:

utf8_general_ci and utf8_bin

Is one better than the other, and if so, can you explain why?


5:52 pm on Aug 18, 2007 (gmt 0)

5+ Year Member

utf8_general_ci is case insensitive. utf8_bin is binary, so it's case sensitive (possibly in addition to other subtler things).
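To make the distinction concrete, here is a minimal sketch in Python rather than MySQL. It only imitates the two behaviours: comparing raw UTF-8 bytes stands in for utf8_bin, and `str.casefold()` is a rough stand-in for a case-insensitive collation (it does not reproduce the accent handling of utf8_general_ci). The helper names `equal_bin` and `equal_ci` are made up for illustration.

```python
def equal_bin(a: str, b: str) -> bool:
    # Binary-style comparison: two strings match only if their UTF-8 bytes are identical.
    return a.encode("utf-8") == b.encode("utf-8")


def equal_ci(a: str, b: str) -> bool:
    # Case-insensitive comparison: fold case before comparing (an approximation only).
    return a.casefold() == b.casefold()


words = ["apple", "Banana", "cherry"]

# Byte order puts uppercase ASCII letters before lowercase ones.
sorted_bin = sorted(words, key=lambda w: w.encode("utf-8"))

# Case-folded order ignores the difference between "B" and "b".
sorted_ci = sorted(words, key=str.casefold)

print(equal_bin("MySQL", "mysql"), equal_ci("MySQL", "mysql"))
print(sorted_bin)
print(sorted_ci)
```

With the binary rule "Banana" sorts ahead of "apple" because "B" has a lower byte value than "a"; with the case-insensitive rule the list comes out in the order most readers would expect.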

The MySQL documentation ( [dev.mysql.com...] ) says it uses "_cs" for case-sensitive collations, but no such collation is listed in [dev.mysql.com...], so I would suppose that utf8_bin is your only choice for case sensitivity.

Which one is better depends entirely on your application.

7:54 pm on Aug 18, 2007 (gmt 0)

5+ Year Member

Thanks for your answer.

May I ask your opinion on something? I'm currently developing a blog application in which users can search blog posts or post titles. In that case it would make sense to use utf8_general_ci, don't you think?
I'm looking for a best practice here, so your opinion is very welcome.

8:48 pm on Aug 18, 2007 (gmt 0)

5+ Year Member

Yeah, for regular text searching you could use utf8_general_ci; case insensitivity is what everyone expects. If the language being searched isn't English, though, you may want to use utf8_unicode_ci. And of course for some languages you'd want to use their specific collation.

But regardless of the default collation set for a database or table, you can override it for an individual query (see [dev.mysql.com...]).
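In MySQL the override is a COLLATE clause inside the statement, along the lines of `WHERE title = 'hello world' COLLATE utf8_bin`. I can't run MySQL inside a forum post, so the sketch below uses Python's built-in sqlite3 module as an analogy only: SQLite's default BINARY collation plays the role of the column default, and its NOCASE collation is applied to a single query. The `posts` table and its contents are invented for the example, and SQLite's collation names are not MySQL's.

```python
import sqlite3

# In-memory database; TEXT columns use SQLite's BINARY collation by default.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (title TEXT)")
conn.executemany(
    "INSERT INTO posts (title) VALUES (?)",
    [("Hello World",), ("hello world",)],
)

# The default collation applies: only the exact-case row matches.
default_hits = conn.execute(
    "SELECT COUNT(*) FROM posts WHERE title = 'hello world'"
).fetchone()[0]

# The collation is overridden for this one comparison; the table is unchanged.
nocase_hits = conn.execute(
    "SELECT COUNT(*) FROM posts WHERE title = 'hello world' COLLATE NOCASE"
).fetchone()[0]

print(default_hits, nocase_hits)
conn.close()
```

The point carries over: the stored rows and the column's default collation stay the same, and only the comparison in that one statement follows the other rule.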

9:16 pm on Aug 18, 2007 (gmt 0)

5+ Year Member

Thanks again :)

Just one more quick question, to complete my understanding:
Let's say I create my table with the utf8_general_ci collation. Would changing it to utf8_unicode_ci in the future have an impact on my table data? Could some characters in the table change after such a switch?
Is that something I should or should never do?
Or should I never change the collation once my choice is made, and use COLLATE in my SQL queries if needed?

9:46 pm on Aug 18, 2007 (gmt 0)

5+ Year Member

It is my understanding that the character set determines what binary values end up in the table, but collations only affect the processing (searching, sorting) of values.

I would think (but cannot say for certain or guarantee) that changing a default collation from utf8_general_ci to utf8_unicode_ci would not change the table data. However, if you changed the collation to something like latin1_general_ci and didn't specify a character set, then MySQL would set the character set to latin1, and that would indeed change data in the table.
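To illustrate the character set versus collation split outside MySQL, here is a small Python sketch. It assumes nothing about MySQL internals: `str.encode` stands in for "which bytes get stored", and a case-folded comparison stands in for "how values are compared". The sample strings are arbitrary.

```python
text = "café"

# Character set: the same text is stored as different bytes under different encodings.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")

# Comparison rule: changing how we compare does not touch the stored bytes.
matches_ci = text.casefold() == "CAFÉ".casefold()
unchanged = text.encode("utf-8") == utf8_bytes

# Characters with no latin-1 representation cannot survive a switch to that encoding.
lossy = "日本語".encode("latin-1", errors="replace")

print(utf8_bytes, latin1_bytes)
print(matches_ci, unchanged)
print(lossy)
```

The "é" takes two bytes in UTF-8 and one in latin-1, so switching the character set rewrites what is stored, while switching the comparison rule leaves the bytes alone. The last line shows Python substituting "?" for characters latin-1 cannot represent, which is the kind of loss to watch for when moving away from utf8.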

9:52 pm on Aug 18, 2007 (gmt 0)

5+ Year Member

Thanks a lot, you really helped me understand this.
