I have some chunks of text where I need to remove duplicate words, e.g. "one one, two three, two" should become "one, two three,".
I could do that using arrays etc., but it's quite slow doing it that way because of the size of the text. Is it possible to achieve the same result using regular expressions and preg_replace?
Many Thanks
>> Is it possible to achieve the same result using regular expressions and preg_replace?
I do not think you are going to be able to do this with regular expressions.
I'd say the best way to do something like this is to explode() the string into an array of words, and then use array_unique() to get rid of the duplicates (both are documented on php.net).
Good luck!
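$string = "one one, two three, two";
// Put a space before each punctuation mark so it becomes its own token.
$string = preg_replace('/([,.?!])/', ' $1', $string);
// Split on spaces and keep only the first occurrence of each token.
$parts = explode(' ', $string);
$unique = array_unique($parts);
// Rejoin, then remove the space that was added before punctuation.
$unique = implode(' ', $unique);
$unique = preg_replace('/\s([,.?!])/', '$1', $unique);
echo $unique; // one, two three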
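It's not the best solution, but it works as long as the same punctuation mark doesn't appear more than once in the string. Punctuation is treated as a token too, so a repeated comma or period gets removed as a duplicate.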
If your solution works flawlessly then I'd go with it. My code above may work for most instances, but it won't work for sentences with multiple periods, commas, etc. Whatever suits your needs best. :)
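For what it's worth, here is a minimal sketch of one way around the punctuation problem, using preg_replace_callback so that only words are de-duplicated and punctuation is left alone. The function name remove_duplicate_words is hypothetical (not from this thread), matching is case-sensitive, and "word" is assumed to mean a run of \w characters. I haven't benchmarked it against the array approach on large text.

```php
<?php
// Hypothetical helper: drops every repeated word, keeps all punctuation.
function remove_duplicate_words(string $text): string
{
    $seen = []; // words already encountered (case-sensitive)

    // Visit each word once; return '' for words seen before.
    $result = preg_replace_callback(
        '/\w+/',
        function (array $m) use (&$seen): string {
            if (isset($seen[$m[0]])) {
                return '';
            }
            $seen[$m[0]] = true;
            return $m[0];
        },
        $text
    );

    // Collapse the whitespace left behind by removed words.
    $result = preg_replace('/\s+/', ' ', $result);
    // Remove any space that now sits directly before punctuation.
    $result = preg_replace('/\s+([,.?!])/', '$1', $result);

    return trim($result);
}

echo remove_duplicate_words("one one, two three, two"); // one, two three,
```

Unlike the array_unique version, this keeps the second comma, which matches the result asked for in the first post.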