Forum Moderators: coopster
I am using PHPmyAdmin, and need a quick way to remove duplicate cities from my database, and leave only the first one.
Here is a piece of my database below as an example:
id - zip - city - state - abvstate
55974 80001 Arvada Colorado CO
55975 80002 Arvada Colorado CO
55976 80002 Denver Colorado CO
55977 80002 Wheat Ridge Colorado CO
55978 80003 Arvada Colorado CO
55979 80003 Westminster Colorado CO
I also need to make sure it will leave the following untouched: example, Jonestown in California, and Jonestown in Michigan since they are in different states.
Is there a quick way in PHPmyAdmin to do this?
I just thought it would be cleaner just to remove the dupicates, since the database will never be written to again after this.
This looks like it is close below, but it doesn't work yet.
<?
dbConnect("$dbname");
$sqlquery = "SELECT * FROM citytable";
$query = mysql_query ("$sqlquery");
$numrows = mysql_num_rows($query);
if ($numrows) {
while ($I < $numrows) {
$id = mysql_result($query,$I,"id");
$city = mysql_result($query,$I,"city");
$state = mysql_result($query,$I,"state");
echo "<BR>$city $state";
//compare against second query
$sqlquery2 = "SELECT * FROM citytable";
$query2 = mysql_query ("$sqlquery2");
$numrows = mysql_num_rows($query2);
if ($numrows2) {
while ($I2 < $numrows2) {
$id2 = mysql_result($query,$I2,"id");
$city2 = mysql_result($query,$I2,"city");
$state2 = mysql_result($query,$I2,"state");
If ($city = $city2 and $state = $state2 and $id!= $id2)
{
DELETE FROM citytable WHERE id = '$id2';
}
$I2++;
}
}
$I++;
}
}
?>
Am I on the right track?
Thanks!
SELECT DISTINCT city, state FROM your_table;
To remove duplicates one off I would have created a second table and added unique key/primary key on city and state and selected into that table like
INSERT IGNORE SELECT * FROM firstTable INTO secondTable
which will ignore erros caused by the duplicates.
But, for a city database, that is kind of foolish, if you have cities for all states, .. example I know there is a Eurika in just about every state. To keep duplicates from happening you could set a primary key on two fields, such as
PRIMARY KEY (city, state)
And that will keep it clean from the start
to find out how many dupicates you have before you start you could do something like
select count(*) - count(DISTINCT city, state) as 'duplicat cities' from myCityList;
or if you wanted to see a count of each of the duplicates you could do something like
select count(*) as repetitions, city, state from myCityList group by city, state having repetitions > 1;
which is nice when you are doing a data merge, just to see what is happening.
if the duplicat is only a duplicate if the whole row is the same, you can do something like this
create table tmpCityState select distinct * from myCityState;
select * from tmpCityState order by city, state;
then drop the original table, and alter the tmp table to the original table's name.
many ways of doing things like this.
webadept -- out