homepage Welcome to WebmasterWorld Guest from
register, free tools, login, search, pro membership, help, library, announcements, recent posts, open posts,
Become a Pro Member

Visit PubCon.com
Home / Forums Index / Code, Content, and Presentation / Databases
Forum Library, Charter, Moderator: open

Databases Forum

How to remove duplication across two columns
Need help adapting this MySQL query

 2:35 pm on May 7, 2014 (gmt 0)


I've been using this query to remove duplicate entries from a database:

DELETE FROM $tableName USING $tableName, $tableName e1 WHERE $tableName.id > e1.id AND $tableName.product_code = e1.product_code

This query removes all duplicated product codes but now I'm transferring to a new data model which means each product code can have various attributes. For example:

id | product_code | attribute |
1 | 1234 | red |
2 | 1234 | white |
3 | 1234 | white |
4 | 2345 | purple |
5 | 3456 | red |

If I run my query, I'd be left with this:
1 | 1234 | red |
4 | 2345 | purple |
5 | 3456 | red |

But I'm actually looking for this result:
1 | 1234 | red |
2 | 1234 | white |
4 | 2345 | purple |
5 | 3456 | red |




 4:26 pm on May 7, 2014 (gmt 0)

First, let's look at the correct query you should be using to identify the data you want to keep (you can use MIN instead of MAX- it works either way):
SELECT product_code, attibute, MAX(id) AS keep_id
FROM $tablename
GROUP BY product_code, attibute

This gives you each unique product_code/attibute pair, plus an associated id (keep_id). To remove dupes, delete everything in the table that doesn't belong to one of those ids:
DELETE FROM $tablename
FROM $tablename
GROUP BY product_code, attibute)


 12:01 am on May 11, 2014 (gmt 0)

If your running this query because the table shouldn't contain duplicate values then you should prevent this from happening in the first place by:
a) Preventing it by using a unique index/key non the table.
b) Testing for existing data before trying to insert duplicate values.

It also sounds like you should be using 2 tables here, one for product codes and another for product/attributes.


 4:11 am on May 11, 2014 (gmt 0)

Are you actually removing the duplicates, or merely filtering results?

Best way to ensure specifics is to manage the actual database with UNIQUE entries.

Global Options:
 top home search open messages active posts  

Home / Forums Index / Code, Content, and Presentation / Databases
rss feed

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
Home ¦ Free Tools ¦ Terms of Service ¦ Privacy Policy ¦ Report Problem ¦ About ¦ Library ¦ Newsletter
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved