homepage Welcome to WebmasterWorld Guest from 54.204.127.56
register, free tools, login, search, subscribe, help, library, announcements, recent posts, open posts,
Subscribe to WebmasterWorld

Home / Forums Index / Code, Content, and Presentation / Databases
Forum Library, Charter, Moderators: physics

Databases Forum

    
How to remove duplication across two columns
Need help adapting this MySQL query
adder




msg:4669251
 2:35 pm on May 7, 2014 (gmt 0)

Hi,

I've been using this query to remove duplicate entries from a database:

DELETE FROM $tableName USING $tableName, $tableName e1 WHERE $tableName.id > e1.id AND $tableName.product_code = e1.product_code

This query removes all duplicated product codes but now I'm transferring to a new data model which means each product code can have various attributes. For example:

id | product_code | attribute |
1 | 1234 | red |
2 | 1234 | white |
3 | 1234 | white |
4 | 2345 | purple |
5 | 3456 | red |

If I run my query, I'd be left with this:
1 | 1234 | red |
4 | 2345 | purple |
5 | 3456 | red |

But I'm actually looking for this result:
1 | 1234 | red |
2 | 1234 | white |
4 | 2345 | purple |
5 | 3456 | red |

Thanks

 

LifeinAsia




msg:4669288
 4:26 pm on May 7, 2014 (gmt 0)

First, let's look at the correct query you should be using to identify the data you want to keep (you can use MIN instead of MAX- it works either way):
SELECT product_code, attibute, MAX(id) AS keep_id
FROM $tablename
GROUP BY product_code, attibute

This gives you each unique product_code/attibute pair, plus an associated id (keep_id). To remove dupes, delete everything in the table that doesn't belong to one of those ids:
DELETE FROM $tablename
WHERE id NOT IN (SELECT MAX(id)
FROM $tablename
GROUP BY product_code, attibute)

DaveWave




msg:4670185
 12:01 am on May 11, 2014 (gmt 0)

If your running this query because the table shouldn't contain duplicate values then you should prevent this from happening in the first place by:
a) Preventing it by using a unique index/key non the table.
b) Testing for existing data before trying to insert duplicate values.

It also sounds like you should be using 2 tables here, one for product codes and another for product/attributes.

tangor




msg:4670212
 4:11 am on May 11, 2014 (gmt 0)

Are you actually removing the duplicates, or merely filtering results?

Best way to ensure specifics is to manage the actual database with UNIQUE entries.

Global Options:
 top home search open messages active posts  
 

Home / Forums Index / Code, Content, and Presentation / Databases
rss feed

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
Home ¦ Free Tools ¦ Terms of Service ¦ Privacy Policy ¦ Report Problem ¦ About ¦ Library ¦ Newsletter
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved