Forum Moderators: coopster
{* EMAIL *}
gets replaced with
somebody@somewhere.tld
I want to ignore (or eliminate) any whitespace between the {* *} delimiters, but nowhere else. Basically I'm looking for the equivalent of a regular expression that would not take account of whitespace, sort of like
$pattern = "/EMAIL/i";
would not take account of case. The /x modifier will not do what I want since it ignores whitespace in the pattern, not in the text being searched.
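A quick way to see the difference, as a minimal sketch: the pattern below contains spaces, and /x makes the engine skip them, but spaces in the subject string still block the match. The variable names are only for illustration.

```php
<?php
// /x strips unescaped whitespace from the PATTERN only.
$pattern = '/\{\* EMAIL \*\}/x';

// Subject without spaces: the pattern (minus its own spaces) matches.
$tight = preg_match($pattern, '{*EMAIL*}');

// Subject with spaces: no match, because /x does nothing to the subject.
$spaced = preg_match($pattern, '{* EMAIL *}');

var_dump($tight, $spaced);
```

Here $tight should come out as 1 and $spaced as 0.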
The only thing I can think of for the moment is to explode, process, and implode the string, but I figure there has to be a better way.
Tom
$string = "adf asdf asdf asdf asd{* EMAIL*} asdfasdf asdf asdf{* E MAIL *}asdf asdf {*E MA I L *}";
// if there's whitespace between {* and *}, keep looping
$match = "/\{\*[^\s\*]*\s/i";
while (preg_match($match, $string)) {
    // replace one whitespace character
    $pattern = "/(\{\*[^\s\*]*)\s/i";
    $replace = "$1";
    $string = preg_replace($pattern, $replace, $string);
}
// no whitespace, now do the replacement I want
$pattern = "/\{\*EMAIL\*\}/i";
$replace = "email@domain.com";
$string = preg_replace($pattern, $replace, $string);
Since regex is fairly processor intensive, I was hoping to find a better way.
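One possible alternative to the loop, sketched under the assumption that every tag is properly closed with *}: let preg_replace_callback visit each {* ... *} tag once and strip the whitespace inside it. The helper name stripTagWhitespace is made up for this example, and I have not measured whether it is actually faster. The type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper: remove whitespace inside every {* ... *} tag in one pass.
function stripTagWhitespace(string $text): string
{
    return preg_replace_callback(
        '/\{\*(.*?)\*\}/s',
        function (array $m): string {
            // $m[1] is whatever sits between {* and *}
            return '{*' . preg_replace('/\s+/', '', $m[1]) . '*}';
        },
        $text
    );
}

$string = "asd{* EMAIL*} asdf{* E MAIL *}asdf {*E MA I L *}";
$clean  = stripTagWhitespace($string);

// str_ireplace keeps the match case-insensitive, like the /i flag above.
$result = str_ireplace('{*EMAIL*}', 'email@domain.com', $clean);
```

Text outside the delimiters is left untouched, because the callback only rewrites what the tag pattern matched.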
It isn't pretty...
$pattern = "/\{\s*\*\s*E\s*M\s*A\s*I\s*L\s*\*\s*\}/Uis";
$replacement = "email@domain.com";
$newstring = preg_replace($pattern, $replacement, $string);
...but it should do the job. Note: This regex also allows optional whitespace between the braces and the asterisks.
$pattern = "{*EMAIL*}"; # or anything
$spaceyPat = patternAllowsSpaces($pattern);
$newstring = preg_replace("/$spaceyPat/", $replacement, $string);
function patternAllowsSpaces($mystring) {
    $mystring = preg_quote($mystring);
    $mystring = preg_replace("/([^\\\])(?=.)/", "$1\s*" , $mystring);
    return $mystring;
}
Thanks! I'll check that out.
Coopster, thanks for the idea. I should have specified that I won't actually know what the search string is except at run time. I will want to match the value, whatever it is (e.g. EMAIL), with something (probably the name of a constant or perhaps a value in a DB). To make your solution work, I would need to parse the string one character at a time to make the "pattern" part... or use Timster's fancy regex.
Essentially, these are meant to be template variables that the webmaster can use and I'm trying to make it so the script is as liberal as possible in what it accepts.
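For the "one character at a time" route, here is a rough sketch rather than a finished solution: quote each character on its own with preg_quote, then join the pieces with \s*. The helper name spacedPattern and the sample sentence are invented for the example, and str_split works byte by byte, so this assumes single-byte variable names.

```php
<?php
// Hypothetical helper: quote each character separately, then join with \s*.
function spacedPattern(string $literal, string $delimiter = '/'): string
{
    $chars = str_split($literal); // byte-wise split; assumes single-byte text
    $quoted = array_map(
        function (string $c) use ($delimiter): string {
            return preg_quote($c, $delimiter);
        },
        $chars
    );
    return implode('\s*', $quoted);
}

$name  = 'EMAIL'; // at run time this would come from a constant or a DB
$regex = '/' . spacedPattern('{*' . $name . '*}') . '/i';
$out   = preg_replace($regex, 'somebody@somewhere.tld', 'Write to {* E MAIL *} or {*email*}.');
```

Because every character is quoted before the \s* is added, a backslash in the name needs no special handling here.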
Now looking at it, I have to fess up it's brilliant, I just wish I understood it. I don't understand this part:
[^\\\]
I would assume that the first \ escapes the second and the third escapes the ], which would screw up the character class. Obviously that's not the case. I get the lookahead, the backreference and all that. I just don't get the character class.
Works great, I just can't break it down correctly.
[^\\\]
The purpose of that homely little snippet is to match anything that is not a backslash. (The preceding line adds backslashes to the string that have special meaning, so we don't have to add "\s" after them.)
The square brackets make a character class. If you write [a7,] that would match anything that's an "a", a "7" or a ",". The ^ (caret) at the beginning of the character class negates it, so it matches anything that isn't in the brackets. Since the backslash has a special meaning, it has to be escaped (with another backslash).
But I confess, I don't really know why PHP demands three backslashes here instead of 2, except to say it seems to have something to do with how the line gets interpreted. (Only 2 backslashes are required in Perl or Grep to do this. Can anyone explain?)
It should be noted that the subroutine I posted won't work properly on any input string that contains a backslash.
I don't really know why PHP demands three backslashes here instead of 2,
That's the part that threw me. I was thinking "The first backslash escapes the second one, so that's 'not backslash', but then the third one escapes the bracket and that screws up the character class".
Coopster's explanation in the thread he referenced makes sense (or let me say, it has a simple rule that's applicable and easy to remember: "PHP parses the string first, then sends it to the regex engine". Whether or not it makes sense is another matter).
Sure enough, if I remove the third \, I get the php warning:
Warning: Compilation failed: missing terminating ] for character class at offset 11
So in other words, that parses out to "\]" and escapes the bracket closing the character class - exactly the effect I expected the three slashes to have!
Whew!
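To make the two-stage parsing visible, here is a small sketch that looks at what each double-quoted literal turns into before the regex engine ever sees it. The variable names are only for illustration.

```php
<?php
// Stage 1: PHP parses the double-quoted literal.
$three = "[^\\\]"; // \\ becomes \, and \] is left as is: the string is [^\\]
$two   = "[^\\]";  // \\ becomes \: the string is [^\]

// Stage 2: the regex engine parses what is left.
// [^\\] is a complete class meaning "anything but a backslash".
$matchesLetter    = preg_match('/' . $three . '/', 'a');
$matchesBackslash = preg_match('/' . $three . '/', '\\');

// [^\] never closes, because \] is an escaped bracket inside the class.
// preg_match returns false here; the @ only hides the compilation warning.
$broken = @preg_match('/' . $two . '/', 'a');

var_dump(strlen($three), strlen($two), $matchesLetter, $matchesBackslash, $broken);
```

Writing four backslashes in the double-quoted string should give the same five-character result as three, since \\\\ also collapses to two.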
My warning was about backslashes ('\'). My regex didn't add a "\s*" after a backslash, but I went ahead and fixed that.
function patternAllowsSpaces($mystring) {
    $mystring = preg_quote($mystring, '/');
    $mystring = preg_replace("/([\\\]{2}|[^\\\])(?=.)/", "$1\s*" , $mystring);
    return $mystring;
}
# [\\\]{2} Matches exactly 2 backslash characters, which means a literal backslash
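A short usage check for the revised function, as a sketch: it repeats the function with | as the alternation operator so the block runs on its own, then feeds it an input that contains a backslash. The test strings are made up.

```php
<?php
// The revised function from above, with | as the alternation operator.
function patternAllowsSpaces($mystring) {
    $mystring = preg_quote($mystring, '/');
    $mystring = preg_replace("/([\\\]{2}|[^\\\])(?=.)/", '$1\s*', $mystring);
    return $mystring;
}

// The usual template tag.
$spacey = patternAllowsSpaces('{*EMAIL*}');

// An input containing a backslash: a, one backslash, b.
$withBackslash = patternAllowsSpaces('a\b');

// Whitespace is now allowed on both sides of the literal backslash.
$hit = preg_match('/' . $withBackslash . '/', 'a \ b');

echo $spacey, "\n", $withBackslash, "\n";
```

The first alternative consumes the quoted backslash pair as one unit, which is why the \s* lands after the pair instead of splitting it.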
I'm glad I posted. I learned a fair bit about some of the obscurities (to me anyway) of regular expressions. I would never have gotten that on my own.
BTW, this thread also makes me think that it's better to single quote replacement strings with backreferences. I hadn't thought before of what would happen in PHP if I had a variable named $1 and a replacement pattern of "$1\s*".
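On the quoting question, a small sketch: a PHP variable name cannot start with a digit, so "$1" inside double quotes is not interpolated and the two forms below end up identical. Single quotes still make the intent clearer and stay safe if a real variable name ever sits next to the backreference.

```php
<?php
// "$1" is not a valid variable reference, so PHP leaves it alone.
$double = "$1\s*";
$single = '$1\s*';
$same   = ($double === $single);

// Single-quoted replacement: nothing for PHP to interpolate or unescape.
$out = preg_replace('/(E)(?=.)/', '$1\s*', 'EMAIL');

var_dump($same, $out);
```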
Tom