Msg#: 3425464 posted 5:52 pm on Aug 18, 2007 (gmt 0)
utf8_general_ci is case insensitive. utf8_bin compares strings by the binary value of each character, so it's case sensitive (and accent sensitive as well).
The MySQL documentation ( [dev.mysql.com...] ) says it uses the "_cs" suffix for case-sensitive collations, but none is listed for utf8 in [dev.mysql.com...], so I would suppose that utf8_bin is your only choice for case sensitivity.
Which one is better depends on your application.
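To see the difference directly, here is a minimal sketch you can paste into the mysql client. It assumes a server that has the utf8 character set; the `_utf8` introducer is only there so the literals don't depend on your connection character set.

```sql
-- COLLATE attaches to the right-hand literal and decides how the comparison is done.
SELECT _utf8'abc' = _utf8'ABC' COLLATE utf8_general_ci AS ci_match;   -- 1: case is ignored
SELECT _utf8'abc' = _utf8'ABC' COLLATE utf8_bin        AS bin_match;  -- 0: compared by character value
```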
Msg#: 3425464 posted 7:54 pm on Aug 18, 2007 (gmt 0)
Thanks for your answer.
May I ask your opinion on something? I'm currently developing a blog application in which users can search blog posts or post titles. In that case it would make sense to use utf8_general_ci, don't you think? I'm looking for a kind of best practice here, so your opinion is highly welcome.
Msg#: 3425464 posted 8:48 pm on Aug 18, 2007 (gmt 0)
Yeah, for regular text searching you could use utf8_general_ci; case insensitivity is what everyone expects. If the text being searched isn't English, though, you may want to use utf8_unicode_ci instead, and for some languages you'd want their specific collation.
But regardless of the default collation set for a database or table, you can override it for an individual query (see [dev.mysql.com...] ).
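A rough sketch of such a per-query override, assuming a hypothetical `posts` table with an `id` column and a utf8 `title` column (the names are made up for illustration):

```sql
-- Case-sensitive search on a column whose default collation is case insensitive.
SELECT id, title
FROM posts
WHERE title COLLATE utf8_bin LIKE '%MySQL%';

-- Sort with a different collation than the column's default.
SELECT id, title
FROM posts
ORDER BY title COLLATE utf8_unicode_ci;
```

Keep in mind that a comparison using a collation other than the column's own generally can't use an index on that column, so this is better suited to the occasional query than to your main search path.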
Msg#: 3425464 posted 9:16 pm on Aug 18, 2007 (gmt 0)
Thanks again :)
Just another quick question so I understand this fully: let's say I create my table with the utf8_general_ci collation. Would changing it to utf8_unicode_ci in the future have an impact on my table data? Could some characters in the table change after such a change? Is that something I should or should never do? Or should I never change the collation once my choice is made, and use COLLATE in my SQL queries if needed?
Msg#: 3425464 posted 9:46 pm on Aug 18, 2007 (gmt 0)
It is my understanding that the character set determines what binary values end up in the table, but collations only affect the processing (searching, sorting) of values.
I would think (but cannot say for certain or guarantee) that changing a default collation from utf8_general_ci to utf8_unicode_ci would not change the table data. However, if you changed the collation to something like latin1_general_ci and didn't specify a character set, then MySQL would set the character set to latin1, and that would indeed change data in the table.
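If you want to check this yourself rather than take my word for it, something along these lines should do, again assuming a hypothetical `posts` table with a utf8 `title` column. Try it on a copy of the table first.

```sql
-- Look at the stored bytes and the current column collations before the change.
SELECT title, HEX(title) FROM posts LIMIT 5;
SHOW FULL COLUMNS FROM posts;

-- Changes only the table default, which applies to columns added later.
ALTER TABLE posts DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;

-- Changes the existing character columns as well. The character set stays utf8,
-- so only the comparison rules should change, not the stored bytes.
ALTER TABLE posts CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;

-- Compare the HEX() output with what you saw before the change.
SELECT title, HEX(title) FROM posts LIMIT 5;
```

One thing to watch for: if the table has a unique index on a text column, the conversion can fail with a duplicate key error when two values that were distinct under the old collation compare as equal under the new one.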