Regex Help

For removing tags

         

Nick_W

11:43 am on Apr 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi all,

Firstly, if anyone knows of an 'idiot's guide' to PHP regex (I've got brain ache from the manual), please tell me.

I'm trying to extract the tags and contained text from a line like this:

<pre>all this and the tags should match</pre>

This is the best I've managed so far:

preg_replace("/<pre>*<\/pre>/", "test", $data);

As you can see, my '*' is not matching everything in between the tags. Can someone please show me how to split up that regex so that it matches the whole block of <pre> text?

Many thanks...

Nick
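For readers following along, here is a minimal sketch of why the first attempt fails, assuming the sample line from the post above. In PCRE, '*' is a quantifier that repeats only the token just before it, so "<pre>*" means "<pre" followed by zero or more ">" characters, not "anything after <pre>". The variable names are made up for illustration.

```php
<?php
// Sample input taken from the post above.
$data = "<pre>all this and the tags should match</pre>";

// "*" repeats only the ">" before it, so this pattern means:
// "<pre", zero or more ">", then immediately "</pre>".
$broken = preg_replace("/<pre>*<\/pre>/", "test", $data);

// "." matches any character except a newline; ".*" repeats that.
$fixed = preg_replace("/<pre>.*<\/pre>/", "test", $data);

var_dump($broken === $data); // bool(true): nothing was replaced
var_dump($fixed);            // string(4) "test"
```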

Birdman

12:18 pm on Apr 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hello Nick,

Since the pros aren't around at the moment, I'll take a crack and say put parens around the *


preg_replace("/<pre>(*)<\/pre>/", "test", $data);

or

preg_replace("/<pre>(.*)<\/pre>/", "test", $data);

I'll let you know if I find the "idiot's guide", I need it too ;)

Nick_W

12:31 pm on Apr 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Right! - Nearly there,

That works on something like this:

<pre>blah</pre>

but not this:

<pre>
some stuff
more stuff
</pre>

I need to match anything, any amount of times between the tags. I think the '.*' means 'any char except a newline, any amount of times'....

Thanks

Nick

brotherhood of LAN

12:38 pm on Apr 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Birdman's one just needs an extra character for newlines

preg_replace("/<pre>(.*)<\/pre>/im", "test", $data);

The "i" makes the search caseless (maybe needed); the "m" is for multiline, from the PHP manual. HTH.

>idiots guide

I bought a book; if it takes more than a day or two to learn something, buy a book on it IMO ;) It leans towards Perl, though PHP regex is mostly the same: "Mastering Regular Expressions". It's an O'Reilly effort.
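A small sketch of what the "i" modifier changes, assuming a hypothetical input with upper-case tags (not something from the original post). It only covers the caseless part of the suggestion.

```php
<?php
// Hypothetical input using upper-case tags.
$data = "<PRE>blah</PRE>";

// Without "i" the lower-case pattern does not match "<PRE>".
$caseSensitive = preg_replace("/<pre>(.*)<\/pre>/", "test", $data);

// With "i" letters match regardless of case.
$caseless = preg_replace("/<pre>(.*)<\/pre>/i", "test", $data);

var_dump($caseSensitive === $data); // bool(true): unchanged
var_dump($caseless);                // string(4) "test"
```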

Nick_W

12:47 pm on Apr 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



preg_replace("/<pre>(.*)<\/pre>/m", "test", $data);

Hmmm... still seems only to match the one liner (<pre>text</pre>) not multi-line...

Thanks for the book tip!

Nick

Birdman

12:47 pm on Apr 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks BOL, I was onto the /m thing but couldn't figure out exactly where to put it.

BTW That autolink proxy is pretty sweet, isn't it? I got a copy from Andreas and use it on certain threads.

brotherhood of LAN

12:56 pm on Apr 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



>Thanks for the book tip!

Go read the book then, I'm sure they have it right ;o) On second look that might work with "s"

m (PCRE_MULTILINE)
By default, PCRE treats the subject string as consisting of a single "line" of characters (even if it actually contains several newlines). The "start of line" metacharacter (^) matches only at the start of the string, while the "end of line" metacharacter ($) matches only at the end of the string, or before a terminating newline (unless D modifier is set). This is the same as Perl.

When this modifier is set, the "start of line" and "end of line" constructs match immediately following or immediately before any newline in the subject string, respectively, as well as at the very start and end. This is equivalent to Perl's /m modifier. If there are no "\n" characters in a subject string, or no occurrences of ^ or $ in a pattern, setting this modifier has no effect.

s (PCRE_DOTALL)
If this modifier is set, a dot metacharacter in the pattern matches all characters, including newlines. Without it, newlines are excluded. This modifier is equivalent to Perl's /s modifier. A negative class such as [^a] always matches a newline character, independent of the setting of this modifier.
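The difference between the two modifiers quoted above can be sketched as follows, assuming the multi-line sample from earlier in the thread. The variable names are illustrative only.

```php
<?php
// Multi-line sample from earlier in the thread.
$data = "<pre>\nsome stuff\nmore stuff\n</pre>";

// "m" only changes where ^ and $ match. The dot still stops at
// newlines, so ".*" cannot reach "</pre>" and nothing matches.
$withM = preg_match("/<pre>(.*)<\/pre>/m", $data);

// "s" lets the dot match newlines, so the whole block matches.
$withS = preg_match("/<pre>(.*)<\/pre>/s", $data, $matches);

var_dump($withM); // int(0)
var_dump($withS); // int(1)
```

With "s" set, $matches[1] holds the text between the tags, newlines included.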

>the autolinker
Yeah birdman it's v.cool. From the point of view you can put out references without remembering them, and all that teach a man to fish stuff.....it's an ingenious idea. Why type a URL when an autoproxy can do it. Andreas knows his stuff, pity he can't be here 24/7 ;o)

//added
spelled birdman's name wrong.

Nick_W

3:41 pm on Apr 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, this has been a killer but, I finally got it straight ;)

preg_replace("/<pre>.*?<\/pre>/s", "test", $data);

Does the trick nicely...

Thanks guys..

Nick
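A short sketch of why the "?" after ".*" matters in the final pattern, assuming a hypothetical input that contains two separate <pre> blocks. A plain ".*" is greedy and runs to the last "</pre>", while ".*?" is lazy and stops at the first one.

```php
<?php
// Hypothetical input with two separate <pre> blocks.
$data = "<pre>\none\n</pre> keep me <pre>\ntwo\n</pre>";

// Greedy: ".*" runs to the last "</pre>", swallowing "keep me".
$greedy = preg_replace("/<pre>.*<\/pre>/s", "test", $data);

// Lazy: ".*?" stops at the first "</pre>", so each block is
// replaced on its own and the text between them survives.
$lazy = preg_replace("/<pre>.*?<\/pre>/s", "test", $data);

var_dump($greedy); // "test"
var_dump($lazy);   // "test keep me test"
```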

DrDoc

6:33 pm on Apr 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"Idiot's Guide to RegExp"

[pcre.org...]
[perldoc.com...]

Then, of course, you need to take a look at this page (http://www.php.net/manual/en/pcre.pattern.syntax.php [php.net]) for differences between Perl and PHP

andreasfriedrich

7:22 pm on Apr 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Missed me ;-). Sorry about not being there but I have to work sometimes, too.

Jeffrey Friedl's Mastering Regular Expressions is definitely worth a read when you want to get serious about regular expressions. The resources pointed to by DrDoc are good, too.

As for the auto-linking proxy, it is available if you want it. I'm just not too sure about the official WebmasterWorld position on using it.

Andreas

DrDoc

7:33 pm on Apr 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



<wants the auto-linking proxy> :)