Forum Moderators: coopster

replacing duplicate characters without looping

str_replace against multiple characters

salnajjar

5:13 pm on Jun 21, 2008 (gmt 0)

10+ Year Member



The quick question:
Does anyone know of a lightweight way in PHP to take multiple consecutive occurrences of a single character in a string and reduce them to a single occurrence?

The long winded description of the problem:
I'm trying to write a small messaging/pseudo email system for a website I'm developing. I want to allow users to be able to send a message to multiple recipients and need a way to split the "recipients" string to allow them to send the message to multiple people.

I figured that the most common character people are likely to use to separate recipients is going to be a comma, but they might also use a semi-colon or even a few other characters.

The other issue is that I figure they will most likely, but not always, use a space after the separation character, or even have typos and enter the separation character twice.

So presently I take the input string and perform an str_replace on it:

$recipients = $_POST['recipients'];
$replacechars = array(";", ",", ":", " ");
$recipients = str_replace($replacechars, ",", $recipients);

The trouble is that the output string ends up with runs of commas whenever the user puts more than one separator character between entries. For example, "user1, user2, user3" becomes "user1,,user2,,user3", because both the comma and the space are treated as delimiters.

So here's my predicament: I want to take the output of the str_replace function and reduce every run of commas to just one comma.

I could loop it so that it looks for ",," and replaces it with "," but for the sake of good input validation and trying to filter out the really stupid entries, I would have to process it about 4 times minimum to cope with the entries that came out as "user1,,,,,user2,,,,,user3,,,,," etc.
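For reference, the looping approach described above might look something like the sketch below. The function name collapse_commas_by_loop is made up for illustration, and the sample string is the one from the paragraph above.

```php
<?php
// collapse_commas_by_loop() is a hypothetical helper name, not from the thread.
function collapse_commas_by_loop(string $s): string
{
    // Each pass roughly halves a run of commas, so repeat until no ",," is left.
    while (strpos($s, ',,') !== false) {
        $s = str_replace(',,', ',', $s);
    }
    return $s;
}

echo collapse_commas_by_loop('user1,,,,,user2,,,,,user3,,,,,'), "\n";
// prints: user1,user2,user3,
```

Note that a trailing comma survives, since the loop only collapses runs and does not trim the ends.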

Sorry, I know this isn't the easiest to follow of messages, but I haven't been able to come up with a more simplified way of explaining the problem.

Anyway, does anyone know of a lightweight way in PHP to take multiple consecutive occurrences of a single character and reduce them to a single occurrence?

Thanks for any pointers

Seri

cameraman

5:36 pm on Jun 21, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I could see doing it with two regular expressions: the first replaces the alternate separators with a comma, and the second reduces each run of commas to a single one.
$recipients = preg_replace('#, |,|; |;|: |:#',',',$recipients);
$recipients = preg_replace('#,{2,}#',',',$recipients);

salnajjar

6:05 pm on Jun 21, 2008 (gmt 0)

10+ Year Member



cameraman, thank you so much, that worked a treat. As you can probably tell, regex's are still the bane of my life.

To make sure I understand the majority of the code on my site, I only used the second regex, the one that reduces a run of commas to a single comma.

In case anyone else needs the code, the final solution was:
$recipients = $_POST['recipients'];
$replacechars = array(";", ",", ":", " ");
$recipients = str_replace($replacechars, ",", $recipients);
$recipients = preg_replace('#,{2,}#',',',$recipients);

Thank you once again, cameraman, you've helped a budding PHP scripter hugely.

Seri

cameraman

7:36 pm on Jun 21, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



regex's are still the bane of my life
Yeah me too. Actually not to that extreme, and it's getting better - I decided the only way to learn it was to start suffering through it. I can write them a whole lot better than I can read them.

This site is all about edification, so lemme 'splain. The pipe character is a logical OR to regex. So the first pattern:
#, |,|; |;|: |:#

is saying to match:
comma followed by a space OR
comma OR
semicolon followed by a space OR
semicolon OR
colon followed by a space OR
colon

The spaced versions need to come ahead of the non-spaced ones; otherwise a comma followed by a space, for example, would get matched as just a comma and the space would be left behind.

The second one is just saying to match two or more commas beside each other.
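To see why the ordering matters, here is a small sketch. The sample input and the variable names are assumptions made up for the illustration; the first pattern is the one explained above, and the second is the same alternatives in the "wrong" order.

```php
<?php
// Hypothetical sample input mixing spaced and unspaced separators.
$input = 'user1, user2; user3:user4';

// Spaced alternatives first: "separator + space" is consumed as one match.
$spacedFirst = preg_replace('#, |,|; |;|: |:#', ',', $input);
echo $spacedFirst, "\n"; // user1,user2,user3,user4

// Bare alternatives first: the separator matches alone and the space stays.
$bareFirst = preg_replace('#,|, |;|; |:|: #', ',', $input);
echo $bareFirst, "\n"; // user1, user2, user3,user4

// The second pattern then collapses any run of two or more commas.
echo preg_replace('#,{2,}#', ',', 'user1,,,user2'), "\n"; // user1,user2
```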

coopster

7:37 pm on Jun 21, 2008 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Welcome to WebmasterWorld, salnajjar.

You two have figured it out correctly, but you're overlooking something that may not be as obvious as it should be: you can do it all in one shot, on your first run. No need to run another replace.

$recipients = preg_replace("/[;,:\s]+/", ',', $recipients);

<added>
Since cameraman has set the standard here :) I'd best follow suit and explain that expression. The brackets define a character class. The plus sign after the class says to match one or more consecutive characters from that class. The match is greedy, so it takes in as many of them as it can, including any whitespace characters (\s, which covers spaces, tabs and newlines). Each run found is replaced with a single comma.
</added>
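Since the end goal in this thread is a list of recipients, one possible next step is to split on the same character class rather than replace and then explode. This is only a sketch of a variant, not what was posted above: split_recipients is a hypothetical helper name and the sample input is invented.

```php
<?php
// split_recipients() is a hypothetical helper name used only for this sketch.
function split_recipients(string $raw): array
{
    // Same character class as the one-line replace. PREG_SPLIT_NO_EMPTY drops
    // the empty strings that leading or trailing separators would produce.
    return preg_split('/[;,:\s]+/', $raw, -1, PREG_SPLIT_NO_EMPTY);
}

print_r(split_recipients(' user1, user2;;user3 : user4, '));
// Array ( [0] => user1 [1] => user2 [2] => user3 [3] => user4 )
```

Splitting directly also sidesteps the stray leading or trailing comma that a plain replace leaves when the input starts or ends with a separator.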

salnajjar

9:30 am on Jun 22, 2008 (gmt 0)

10+ Year Member



Thank you coopster and cameraman.

The reason I kept the first half of my script and the second of cameraman's regexes was that I'm trying to counter even the most foolish of users and cover as many possibilities as possible.

Using my original str_replace combined with cameraman's regex, it would catch the recipients even if the sender typed something like:
recipient1, :::;;;;,,,,,,:recipient2,recipient3,:;; ,recipient4

But now, with coopster's regex, I've been able to reduce it to a single line and, even better, I actually understand it, because you two have been good enough to explain the code too.
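As a quick sanity check, here is a sketch that runs the messy example above through both versions. The variable names are just for illustration.

```php
<?php
$input = 'recipient1, :::;;;;,,,,,,:recipient2,recipient3,:;; ,recipient4';

// Two-step version: str_replace first, then collapse runs of commas.
$twoStep = str_replace(array(';', ',', ':', ' '), ',', $input);
$twoStep = preg_replace('#,{2,}#', ',', $twoStep);

// One-step version using the character class.
$oneStep = preg_replace('/[;,:\s]+/', ',', $input);

var_dump($twoStep === $oneStep); // bool(true)
echo $oneStep, "\n"; // recipient1,recipient2,recipient3,recipient4
```

The two agree on this input. They can differ on others: \s also matches tabs and newlines, which the str_replace list (a plain space only) does not cover.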

I have to say, I was pleasantly surprised to even get a working response, let alone this level of help and assistance.

Thank you so much for the help.