Forum Moderators: coopster & jatar k

Detecting possible duplicates

   
11:45 am on Sep 21, 2011 (gmt 0)

5+ Year Member



I have an array called $names that consists of a list of full names fetched from a MySQL database. I'm trying to figure out a way to check whether any name has a 70% chance of being the same as another name in the array and, if so, to list the ones that may be duplicates. The reason is that one name has sometimes been entered several times with different spellings.

Can anyone tell me if there is any way this can be done?

thanks
7:55 pm on Sep 21, 2011 (gmt 0)

WebmasterWorld Administrator httpwebwitch is a WebmasterWorld Top Contributor of All Time 10+ Year Member



First, how will you measure similarity?

You probably want:
[php.net...]

You don't need to compare every element of the array to every other element (that would be (n^2)-n comparisons)... but to say how many comparisons you do need, I'll need more coffee in my system.
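As one possible answer to the "how will you measure similarity" question, here is a minimal sketch that assumes PHP's built-in similar_text() is an acceptable measure. That choice is an assumption and not necessarily the function linked above. The helper name nameSimilarity() is hypothetical, and lowercasing and trimming the input first is a design choice, not a requirement.

```php
<?php
// Hypothetical helper: percentage similarity (0 to 100) of two names.
// Assumes similar_text() is a good enough measure for this data.
function nameSimilarity(string $a, string $b): float
{
    $percent = 0.0;
    // similar_text() writes the similarity percentage into its third argument.
    similar_text(strtolower(trim($a)), strtolower(trim($b)), $percent);
    return $percent;
}
```

A result of 70 or more from this helper would then count as a "possible duplicate" under the 70% rule from the question. levenshtein() is another built-in worth trying if similar_text() gives poor results on your names.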
4:44 pm on Sep 22, 2011 (gmt 0)

WebmasterWorld Administrator httpwebwitch is a WebmasterWorld Top Contributor of All Time 10+ Year Member



The formula for the number of comparisons in a set of n items is ((n^2)-n)/2, so 100 names need 4,950 comparisons.

It's done with two nested loops:

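// $names is the array from the question; compare() is a placeholder
$len = count($names);
for ($i = 0; $i < $len - 1; $i++) {
    for ($j = $i + 1; $j < $len; $j++) {
        compare($names[$i], $names[$j]); // each unordered pair is visited once
    }
}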
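Putting the nested loops together with a similarity measure might look like the sketch below. It assumes similar_text() as the measure and 70.0 as the threshold (taken from the 70% in the question). The function name findPossibleDuplicates() is hypothetical, and the code has not been tuned for large arrays.

```php
<?php
// Sketch: list every pair of names whose similar_text() percentage
// meets a threshold. Function name and default threshold are assumptions.
function findPossibleDuplicates(array $names, float $threshold = 70.0): array
{
    $names = array_values($names); // reindex so $i and $j are contiguous
    $len = count($names);
    $pairs = [];
    for ($i = 0; $i < $len - 1; $i++) {
        for ($j = $i + 1; $j < $len; $j++) {
            $percent = 0.0;
            // Compare case-insensitively; the percentage lands in $percent.
            similar_text(strtolower($names[$i]), strtolower($names[$j]), $percent);
            if ($percent >= $threshold) {
                $pairs[] = [$names[$i], $names[$j], round($percent, 1)];
            }
        }
    }
    return $pairs;
}
```

Each entry in the returned array holds the two names and their similarity percentage, so you can loop over it and print the pairs for manual review. With a few thousand names this runs millions of comparisons, so it is better suited to an occasional cleanup job than to a page request.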
 
