
Forum Moderators: coopster & jatar k


Detecting possible duplicates

     
11:45 am on Sep 21, 2011 (gmt 0)

Junior Member

10+ Year Member

joined:July 17, 2006
posts:137
votes: 0


I have an array called $names that holds a list of full names fetched from a MySQL db. I'm trying to figure out a way to check whether any name is at least 70% likely to be the same as another name in the array and, if so, to list the ones that may be duplicates. The reason is that one name may sometimes have been entered several times with different spellings.

Can anyone tell me if there is any way this can be done?

thanks
7:55 pm on Sept 21, 2011 (gmt 0)

Moderator from CA 

WebmasterWorld Administrator httpwebwitch is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Aug 29, 2003
posts:4059
votes: 0


first, how will you measure similarity?

You probably want:
[php.net...]

You don't need to compare every element of the array to every other element, which would be (n^2)-n comparisons... but to say how many comparisons you do need, I'll need more coffee in my system
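
On the question of how to measure similarity: one built-in option in PHP is similar_text(), which can report a percentage through its third (by-reference) argument. The sketch below assumes that function is a reasonable fit; it is not necessarily the one linked above. The helper name name_similarity() and the lowercase/trim normalisation are made up for illustration.

```php
<?php
// Hypothetical helper: percentage similarity (0-100) between two names.
function name_similarity(string $a, string $b): float
{
    // Assumption: case and surrounding whitespace should not count as differences.
    $a = strtolower(trim($a));
    $b = strtolower(trim($b));

    // similar_text() returns the number of matching characters and
    // writes the similarity percentage into $percent by reference.
    similar_text($a, $b, $percent);

    return $percent;
}

// Example check against the 70% threshold from the question.
var_dump(name_similarity('John Smith', 'Jon Smith') >= 70);
```

levenshtein() is another built-in worth a look if you would rather count edits than get a percentage.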
4:44 pm on Sept 22, 2011 (gmt 0)

Moderator from CA 

WebmasterWorld Administrator httpwebwitch is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Aug 29, 2003
posts:4059
votes: 0


The formula for the number of comparisons in a set of n items is ((n^2)-n)/2, since each pair only needs to be compared once.

It's done with two nested loops:

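$len = count($names);
for ($i = 0; $i < $len - 1; $i++) {
    for ($j = $i + 1; $j < $len; $j++) {
        // compare() is a placeholder for your own check of $names[$i] against $names[$j]
        compare($i, $j);
    }
}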
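
Putting the nested loops together with a similarity check, a minimal sketch might look like the following. It assumes similar_text() as the measure and 70 as the cutoff from the original question; the function name find_possible_duplicates() and the sample names are hypothetical.

```php
<?php
// Hypothetical helper: list pairs of names at least $threshold percent similar.
function find_possible_duplicates(array $names, float $threshold = 70.0): array
{
    $names = array_values($names); // re-index so $i and $j are contiguous
    $len   = count($names);
    $pairs = [];

    // Each unordered pair is visited once: ((n^2)-n)/2 comparisons in total.
    for ($i = 0; $i < $len - 1; $i++) {
        for ($j = $i + 1; $j < $len; $j++) {
            // Assumption: comparison should ignore case.
            similar_text(strtolower($names[$i]), strtolower($names[$j]), $percent);
            if ($percent >= $threshold) {
                $pairs[] = [$names[$i], $names[$j], round($percent, 1)];
            }
        }
    }

    return $pairs;
}

// Sample data, made up for illustration.
$names = ['John Smith', 'Jon Smith', 'Alice Cooper', 'Mary Jones'];
foreach (find_possible_duplicates($names) as [$a, $b, $pct]) {
    echo "$a <-> $b ($pct%)\n";
}
```

For a large table the number of comparisons grows quickly, so it may be worth narrowing the candidates first (for example, only comparing names that share a first letter), depending on how messy the data is.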