Merging Data Tables and removing Duplicates

     
4:29 am on July 28, 2010 (gmt 0)

Junior Member

5+ Year Member

joined:Mar 20, 2008
posts: 172
votes: 0


Hi all,

I am looking for an efficient way to merge a number of MySQL tables into a single table. They all have the same column names, but I want to ensure that no duplicates end up in the merged output. Imagine there is an email field, which is the field I want to check for duplicates.

I am not sure whether this can be done in pure MySQL, but I will be merging thousands of rows at a time and need it to be as light as possible.

Merging with PHP is fine, but the duplicate check seems to be where resource use climbs.

Thanks in advance.
7:19 am on July 28, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member dreamcatcher is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 30, 2003
posts:3719
votes: 0


Create a new table and make the e-mail field unique; this will prevent duplicates. Then copy the data between tables using IGNORE to skip over any duplicates without errors.

INSERT IGNORE INTO Table2 SELECT * FROM Table1;

Think that should work.
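
For anyone who wants to try the idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. It is an illustration under assumptions, not a drop-in for MySQL: SQLite spells the statement INSERT OR IGNORE where MySQL uses INSERT IGNORE, and the table names (table1, table2, merged), the name column, the sample rows, and the merge_unique helper are all made up for the example.

```python
import sqlite3


def merge_unique(conn, target, sources):
    """Copy rows from each source table into target, skipping duplicate emails.

    Table names are interpolated directly into the SQL, so they must be
    trusted identifiers. Returns the number of rows actually added.
    """
    cur = conn.cursor()
    # The UNIQUE constraint on email is what rejects duplicates.
    cur.execute(
        f"CREATE TABLE IF NOT EXISTS {target} ("
        "email VARCHAR(250) NOT NULL UNIQUE, name VARCHAR(100))"
    )
    before = cur.execute(f"SELECT COUNT(*) FROM {target}").fetchone()[0]
    for source in sources:
        # SQLite: INSERT OR IGNORE. The MySQL equivalent is INSERT IGNORE.
        cur.execute(
            f"INSERT OR IGNORE INTO {target} (email, name) "
            f"SELECT email, name FROM {source}"
        )
    after = cur.execute(f"SELECT COUNT(*) FROM {target}").fetchone()[0]
    conn.commit()
    return after - before


# Hypothetical sample data: b@example.com appears in both source tables.
conn = sqlite3.connect(":memory:")
samples = {
    "table1": [("a@example.com", "Ann"), ("b@example.com", "Bob")],
    "table2": [("b@example.com", "Robert"), ("c@example.com", "Cy")],
}
for table, rows in samples.items():
    conn.execute(f"CREATE TABLE {table} (email VARCHAR(250), name VARCHAR(100))")
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)

added = merge_unique(conn, "merged", ["table1", "table2"])
merged = conn.execute("SELECT email, name FROM merged ORDER BY email").fetchall()
print(added, merged)
```

The database does the duplicate check through the unique index, so the application never has to compare rows itself. Whichever row reaches the merged table first wins; later rows with the same email are silently dropped.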

dc
10:44 am on July 28, 2010 (gmt 0)

Junior Member

5+ Year Member

joined:Mar 20, 2008
posts:172
votes: 0


Thanks Dreamcatcher, good suggestion, except that it appears a TEXT field (the email field) in a MySQL table cannot be set as unique for some reason. Is it correct that only numeric fields can be unique? A unique constraint here would do the job if I could set one. Below is the MySQL error:



#1170 - BLOB/TEXT column 'email' used in key specification without a key length


Any help would be appreciated.
7:11 am on July 29, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member dreamcatcher is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 30, 2003
posts:3719
votes: 0


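Does the e-mail field have to be a TEXT field? MySQL can only index a TEXT column if you give it a key prefix length, which is what that error is complaining about, whereas a VARCHAR can be made unique as-is. I would have thought a VARCHAR(250) would be plenty; a TEXT field seems overkill.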

dc
6:09 am on July 30, 2010 (gmt 0)

Junior Member

5+ Year Member

joined:Mar 20, 2008
posts:172
votes: 0


Thanks guys, that did it.