Forum Moderators: coopster

regular_expressions/ereg replace

         

foy

12:55 pm on Sep 21, 2004 (gmt 0)

10+ Year Member



hey there!

Let's say I have a string that contains a specific word like "test", and the same string contains a number or a word right after that specific word, like:

$string = "this is a test 1234 really";
or
$string = "this is a test dumdidum really";

I want to format the "1234" or the "dumdidum" with html-tags, so it would look like this:

$string = "this is a test <b>1234</b> really";

The word "test" is always at the same place followed by the part of the string I want to be replaced.

How do I do this via ereg_replace()?

the part after "test" varies.

thanks in advance

timster

1:24 pm on Sep 21, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That looks like a job for a positive lookahead assertion.
Use (?=stuff) to test if something follows what you need to replace/modify without "matching" it.


<?php
$input = "What a bold really word.";

$output = preg_replace("/(\w+)(?= really)/",
"<b>\\1</b>",
$input);

echo $output;
?>

foy

2:50 pm on Sep 21, 2004 (gmt 0)

10+ Year Member



thanks for your help!

I really have to get into that expression stuff myself someday...

timster

6:02 pm on Sep 21, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I really have to get into that expression stuff myself someday...

Once you take the plunge, you'll wonder how you got through your day without it.

foy

8:27 am on Sep 22, 2004 (gmt 0)

10+ Year Member



lol I bet! =)

any good book to recommend that focuses on regexp?

coopster

2:47 pm on Sep 22, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



There are some links and resources in our PHP Forum Library [webmasterworld.com] under Learning PHP - Books, Tutorials and Online Resources [webmasterworld.com].

ergophobe

2:57 pm on Sep 22, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Does the regex timster provided work? From how I read the question, it shouldn't because
- "test" is the consistent flag, not "really"
- foy wants the whole phrase back, not just the bolded backreference.

As I read the question, he wants something like this


$output = preg_replace("/(.*?)test\s(\w+)(.*)/",
"\\1test <b>\\2</b>\\3",
$input);

Since you're trying to learn regex, a little explanation:
() - groups the matches to be put back into the output with \1, \2, etc.
\1, \2, \3 - backreferences - refer back to whatever was captured in the corresponding ()
. - match any char
* - repeat zero or more times
? - be lazy, not greedy. This is very important here; without it, the first group would run all the way to the last "test" in the string instead of stopping at the first
\s - match whitespace
\w - match "word" characters
+ - match one or more times
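
To see those pieces working together, here is a minimal sketch that runs the pattern on one of foy's sample strings and on a string with two "test"s in it. The helper name bold_after_test() and its $lazy switch are made up for this illustration and are not part of anyone's code in the thread; the replacement is single-quoted so PHP leaves \1, \2 and \3 alone.

```php
<?php
// Hypothetical helper, not from the thread: bold the word that follows "test".
function bold_after_test($input, $lazy = true)
{
    // (.*?) is the lazy form, (.*) the greedy one; the rest of the pattern is unchanged.
    $head = $lazy ? '(.*?)' : '(.*)';

    // Single quotes keep PHP from treating \1, \2, \3 as string escapes.
    return preg_replace('/' . $head . 'test\s(\w+)(.*)/', '\1test <b>\2</b>\3', $input);
}

echo bold_after_test('this is a test 1234 really'), "\n";
// this is a test <b>1234</b> really

echo bold_after_test('a test one, a test two'), "\n";        // lazy
// a test <b>one</b>, a test two

echo bold_after_test('a test one, a test two', false), "\n"; // greedy
// a test one, a test <b>two</b>
```

The last two calls show what the ? buys you: the lazy group stops at the first "test", the greedy one runs on to the last.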

Tom

foy

11:22 am on Sep 23, 2004 (gmt 0)

10+ Year Member



okay... I'm learning this (more than trial and error anyways...)

another thing I cannot solve yet:

Let's say I have this string:

This Is a test - called "test"

I run eregi_replace() twice on the string, doing the following:

$string = eregi_replace('([^a-zA-Z0-9. \"])','\\\1\\',$string);

which wraps every special char in backslashes, one in front of it and one behind it.

and

$string = eregi_replace('([ +])','\\\1 ',$string);

which adds a backslash in front of every space found.

so my result of above is:

This\ Is\ a\ test\ \-\\ called\ \"test\"

1) How can I get both eregi_replace()-arguments into one?
2) How can I fix the issue with the backslashes? As I would like to have a result like this:

This\ Is\ a\ test\ \-\ called\ \"test\"

so there's no double \\

timster

12:44 pm on Sep 23, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



- "test" is the consistent flag, not "really"
- foy wants the whole phrase back, not just the bolded backreference.

ergophobe - You've got me on the first point (gotta' work on my reading comprehension) but not the second. For a replace, you don't have to match the whole string, just the part you want to replace.

So here's a pattern with a positive lookbehind assertion:


<?php
$input = "What a test bold really word.";

$output2 = preg_replace("/(?<=test )(\w+)/",
"<b>\\1</b>",
$input);

echo $output2;
?>

outputs:

What a test <b>bold</b> really word.
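
For what it's worth, the same lookbehind pattern can be tried on the two strings from foy's first post. This is only a sketch; the $samples and $results variable names are invented here for illustration.

```php
<?php
// foy's two sample strings from the first post.
$samples = array(
    'this is a test 1234 really',
    'this is a test dumdidum really',
);

$results = array();
foreach ($samples as $sample) {
    // (?<=test ) asserts that "test " sits immediately before the match
    // without consuming it, so only the word after it gets replaced.
    $results[] = preg_replace('/(?<=test )(\w+)/', '<b>\1</b>', $sample);
}

echo implode("\n", $results), "\n";
// this is a test <b>1234</b> really
// this is a test <b>dumdidum</b> really
```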

foy-We'll get to your next question soon...

ergophobe

4:35 pm on Sep 23, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month




So here's a pattern with a positive lookbehind assertion:

Show off!

I call with a positive lookahead and raise you one "or" and another positive lookahead (but I also ask you to read it and make sure I got it right, since I'm really pushing my regex abilities here).

Foy, this should do it

$input = 'This Is a test - called "test"';
$output = preg_replace('/(([^\s](?=\s))|.(?=[^a-zA-Z0-9\s]))/i', '\1\\', $input);
echo $output;

That gives the output you want though you might want to test it with a few more examples. I'm not sure if you want a \ before whitespace blocks or each space. Currently it does the former, but you may not want that.

Some other tests
This Is a te%st - called "test" => This\ Is\ a\ te\%st\ \-\ called\ \"test\"


This Is     a te%st - called "test"
=> This\ Is\     a\ te\%st\ \-\ called\ \"test\"

If instead, you want


This Is     a te%st - called "test"
=> This\ Is\ \ \ \ \ a\ te\%st\ \-\ called\ \"test\"

Then your regex would be

$output = preg_replace('/((.(?=\s))|.(?=[^a-zA-Z0-9\s]))/i', '\1\\', $input);

Does that do it for you?

[edit]remember that this forum breaks pipes if you try to copy and paste, you will need to retype the ¦[/edit]

[edited by: ergophobe at 5:01 pm (utc) on Sep. 23, 2004]

ergophobe

4:47 pm on Sep 23, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



BTW, a little explanation.

There might be a better way to do it, but it was the only way I could think of that would deal with the situation of both non-alphanumeric chars followed by whitespace and followed by non-whitespace.

- lookahead and ask "Is the next char a space or a special char?" If yes, add a slash

((.(?=\s))|.(?=[^a-zA-Z0-9\s]))


(                     - open the outer capture group
(.(?=\s))             - match any char that's followed by a space
|                     - or
.(?=[^a-zA-Z0-9\s])   - match any char that's followed by a char that's not a letter, number or space
)                     - close capture group
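
A small sketch for anyone who wants to poke at that breakdown. It wraps the "every space" version of the pattern in a function and appends the backslash with a callback rather than a replacement string, which sidesteps any doubt about how a trailing backslash in the replacement is handled. The name slash_before_specials() is made up for this example.

```php
<?php
// Hypothetical wrapper, not from the thread.
function slash_before_specials($input)
{
    // Match any char followed by whitespace, or any char followed by
    // something that is not a letter, digit or whitespace ...
    return preg_replace_callback(
        '/((.(?=\s))|.(?=[^a-zA-Z0-9\s]))/i',
        function ($matches) {
            // ... and put a single backslash right after it.
            return $matches[0] . '\\';
        },
        $input
    );
}

echo slash_before_specials('This Is a test - called "test"'), "\n";
// This\ Is\ a\ test\ \-\ called\ \"test\"

echo slash_before_specials('This Is     a te%st - called "test"'), "\n";
// This\ Is\ \ \ \ \ a\ te\%st\ \-\ called\ \"test\"
```

Note that the backslash lands after the char that precedes the space or special char, which is why the very last quote gets one in front of it but nothing behind it.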

timster

6:32 pm on Sep 23, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hats off to the 'phobe -- that one takes the prize.

Still, there's no shame in doing things with 2 simple regex patterns instead of 1 work of art.

Here's a modification to your 2 patterns that makes the fixes you needed:


$string2 = eregi_replace('([^a-zA-Z0-9\. \"])','\\\1\\',$string); # Same as before

$string2 = preg_replace("/(?<![ \\\])( [ *])/",'\\ \\1',$string2);

(?<![ \\\]) Ensures there's no space or \ immediately prior. Hey, that's a fixed-width negative lookbehind assertion. (Looks like we're almost done the tour.)
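
If the negative lookbehind is new to you, here is a stripped-down sketch of just that idea, separate from the two-step solution above. It assumes the goal is simply "put a backslash in front of a space unless a space or a backslash is already sitting right before it"; the $sample and $escaped names are invented for the example.

```php
<?php
// Illustrative input: one plain space, one double space, one already-escaped space.
$sample = 'This Is  a\ test';

// (?<![ \\]) is the negative lookbehind: the space only matches when the
// char before it is neither a space nor a backslash.
// In the replacement, '\\\\ ' is PHP for a literal backslash followed by a space.
$escaped = preg_replace('/(?<![ \\\\]) /', '\\\\ ', $sample);

echo $escaped, "\n";
// This\ Is\  a\ test
```

Only the first space of the double space gets a backslash, and the space that already had one is left alone.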

This solution and ergo's run at about the same speed. Go with the one you find easier to understand, since maintaining all this "comic book swearing" will probably be the main concern.

BTW, ergophobe -- did I miss your passage into moderation? Congratulations!

coopster

7:17 pm on Sep 23, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



function escape_it($matches) {
    // Put a backslash in front of the matched char
    return '\\' . $matches[1];
}
// The original request:
print 'This\ Is\ a\ test\ \-\ called\ \"test\"<br />';
$string = 'This Is a test - called "test"';
echo preg_replace_callback("/([^a-zA-Z0-9])/", "escape_it", $string) . '<br />';



// ergophobe's example 1:
print 'This\ Is\ a\ te\%st\ \-\ called\ \"test\"<br />';
$string = 'This Is a te%st - called "test"';
echo preg_replace_callback("/([^a-zA-Z0-9])/", "escape_it", $string) . '<br />';
// ergophobe's example 2:
print 'This\ Is\ \ \ \ \ a\ te\%st\ \-\ called\ \"test\"<br />';
$string = 'This Is     a te%st - called "test"';
echo preg_replace_callback("/([^a-zA-Z0-9])/", "escape_it", $string) . '<br />';
preg_replace_callback() [php.net] ;)

ergophobe

11:17 pm on Sep 23, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Heh heh. Just when you think you're on top of it. I would not have *ever* thought of that.