Import content with special character encoding (UTF8, etc)

         

Imaster

5:40 am on Oct 5, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I am trying to import Wikipedia content into MySQL, but because the content uses so many different character encodings, it is not being imported properly. I am using an exe created in VB.NET to import the data column by column.

What character set and collation do I need to set for the MySQL tables? Also, should I set my exe to import with UTF-8 options or something like that?

I have been trying for a couple of days now and it has become really frustrating. Any input would be appreciated.

tomda

6:33 am on Oct 5, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



First, you must check which version of MySQL you are running...

MySQL 4.1 supports the UTF-8 charset and collations, which cover more than 650 languages. I have tried it and it works great... although a mod told me that MySQL 4.1 still has a few bugs, but it should be fully functional in MySQL 5.
Read also [webmasterworld.com...], which explains how to work in a hassle-free UTF-8 environment.

If you use an older version, then you need to specify a different charset for each language, that is, have a table for each language with its own charset.... A nightmare!
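
For MySQL 4.1 or later, a minimal sketch of the setup might look like the following. It is written in Python only to keep the statements in one runnable place; the table and column names (articles, title, body) are made up for illustration, and utf8_unicode_ci is just one reasonable collation choice, not the only one.

```python
def utf8_setup_statements(table="articles"):
    """Build SQL for a UTF-8 connection and table.

    The table and column names are hypothetical placeholders.
    """
    return [
        # Tell the server that this connection sends and expects UTF-8.
        "SET NAMES utf8",
        # Create the table with UTF-8 as its default charset and collation.
        f"CREATE TABLE {table} ("
        "id INT PRIMARY KEY, "
        "title VARCHAR(255), "
        "body MEDIUMTEXT"
        ") DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci",
    ]


statements = utf8_setup_statements()
for sql in statements:
    print(sql)
```

The SET NAMES statement matters as much as the table definition: if the client connection is not declared as UTF-8, the server may convert the incoming bytes from another charset and store garbage even in a utf8 table.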

Hope it helps.
Tomda

Imaster

10:08 am on Oct 5, 2006 (gmt 0)




Thanks Tomda,

I use MySQL 5. I will check out the link you provided.

Imaster

2:51 pm on Oct 9, 2006 (gmt 0)




One more issue - slightly specific.

My question is about importing UTF-8 characters into MySQL via two methods: one via the MySQL import feature, and the second via inserting with a tool created in Visual Basic.

I am trying to insert the character Č, whose character code is U+010C (decimal 268).

- Using the MySQL import feature, it looks perfect (Č) when I view it using MySQL-Front.

- Importing it using the exe tool created by me, it looks like this Č. (I need to import using this tool)

I have no idea why this is happening. I am even importing it as UTF-8.

Any inputs?
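
One common cause of this kind of symptom is that the UTF-8 bytes are fine but get interpreted as a single-byte charset somewhere between the tool and the table. The sketch below is only an illustration in Python, and it assumes Windows-1252 as the wrong charset, which is a guess and not something confirmed for this setup.

```python
def misread_utf8(text, wrong_charset="cp1252"):
    """Encode text as UTF-8, then decode the bytes with the wrong charset."""
    return text.encode("utf-8").decode(wrong_charset)


def repair_misread(garbled, wrong_charset="cp1252"):
    """Undo the misreading by reversing the two steps."""
    return garbled.encode(wrong_charset).decode("utf-8")


original = "\u010c"                    # the character Č
utf8_bytes = original.encode("utf-8")  # two bytes: 0xC4 0x8C
garbled = misread_utf8(original)       # two separate characters
repaired = repair_misread(garbled)

print(utf8_bytes, garbled, repaired)
```

If the value stored by the tool shows up as two characters where one was expected, the connection charset is the first thing to check.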

jvmills

7:56 am on Oct 10, 2006 (gmt 0)

10+ Year Member



Are you using the fso.CreateTextFile or a similar method in your VB tool?

If yes, check that you set the third argument (after the file name and the overwrite flag), which controls Unicode output.

Google for the fso.CreateTextFile method and you will see a full reference.
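
Whatever writes the intermediate file, the encoding it uses has to match what the importer expects. The sketch below is in Python, not VB, and only shows that the same character ends up as different bytes on disk depending on the encoding chosen. As far as I know, the FSO Unicode flag produces UTF-16 and not UTF-8, so treat that part as an assumption to verify against the reference.

```python
import os
import tempfile


def write_and_read_bytes(text, encoding):
    """Write text to a temp file with the given encoding, return the raw bytes."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w", encoding=encoding, newline="") as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


as_utf8 = write_and_read_bytes("\u010c", "utf-8")       # Č as UTF-8
as_utf16 = write_and_read_bytes("\u010c", "utf-16-le")  # Č as UTF-16 LE, no BOM

print(as_utf8, as_utf16)
```

A file written as UTF-16 and then loaded as UTF-8 (or the reverse) will not round-trip, so it is worth checking the bytes on disk with a hex viewer before blaming the database.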