Regex (perl compatible)

A little help, I'm not that used to regex

Buyurun

10:54 pm on Aug 16, 2004 (gmt 0)

10+ Year Member



Hello there,

I'm currently writing a content management system and I'm trying to make my functions as tight as possible.

One of the parts of the CMS is the ability for the user to write articles/journal entries. Now, I have written all the functions but can't get my head around this:

I want to accept text from a form and replace all quotation marks (") in the text with the correct entity encoding. However, I don't want to replace the quotation marks within anchor tags or image tags, just the text itself.

For example:

$text = preg_replace('![\"+](.+)[\"+]!U', "&quot;$1&quot;", $text);

This Perl-compatible regex I'm using simply replaces all quotation marks with the necessary entity and then uses a backreference to put the text back in between the entities. However, it matches all the quotation marks in the anchor and image tags as well.

I could go on forever describing this, but I'm sure someone has the gist by now. If you could help me and save me some time, that'd be great.

I'm sure someone out there must dedicate all their time to regex :P

I apologise in advance if this has been covered before.

mattx17

5:03 pm on Aug 17, 2004 (gmt 0)

10+ Year Member



This may seem kind of wasteful, but it may be easier: why not replace all the quotes, then go through it again, and switch back any of the quotes that are between < and >?
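
A rough sketch of that two-pass idea, in case it helps. The function name encode_quotes_two_pass is made up for illustration, the code assumes PHP 7 or later, and it treats anything between < and > as a tag. One known weakness: an attribute that already contained a literal &quot; would be switched back to a plain quote as well.

```php
<?php
// Hypothetical helper illustrating the two-pass suggestion; not code from the thread.
function encode_quotes_two_pass(string $text): string
{
    // Pass 1: encode every double quote, including the ones inside tags.
    $encoded = str_replace('"', '&quot;', $text);

    // Pass 2: for each span that looks like a tag (< ... >), switch the quotes back.
    return preg_replace_callback(
        '/<[^>]*>/',
        function (array $m): string {
            return str_replace('&quot;', '"', $m[0]);
        },
        $encoded
    );
}

echo encode_quotes_two_pass('He said "hi" <a href="x.html">link</a>'), "\n";
// He said &quot;hi&quot; <a href="x.html">link</a>
```

It does walk the text twice, but for article-sized input that is unlikely to matter.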

timster

5:20 pm on Aug 17, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Not all my time, just all my free time:

A lookahead assertion is what you need.

$output = preg_replace("/(\")(?=[^>]*(<|$))/", '&#34;', $text);

In English: Translate a quote to '&#34;' if you find a "<" following it before you find a ">", or if you find the end of the string before you find ">".

This pattern isn't exactly bulletproof -- it will choke on valid (but weird) HTML that has quoted angle brackets inside HTML tags. (But who does that?)
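
Here is a small test of the lookahead, written with a plain | as the alternation operator and in a single-quoted string so nothing needs escaping. The wrapper name encode_quotes_lookahead and the sample strings are only for illustration. The second call shows the weak spot: a quoted "<" inside a tag makes the first attribute quote look like body text.

```php
<?php
// Hypothetical wrapper around the lookahead pattern; the name is illustrative.
function encode_quotes_lookahead(string $text): string
{
    // Match a quote only if, scanning forward without crossing a ">",
    // we can reach a "<" or the end of the string.
    return preg_replace('/"(?=[^>]*(?:<|$))/', '&#34;', $text);
}

// Quotes in the text are encoded, quotes inside the tag are left alone.
echo encode_quotes_lookahead('Say "hello" to <a href="page.html">this "link"</a>.'), "\n";
// Say &#34;hello&#34; to <a href="page.html">this &#34;link&#34;</a>.

// Weak spot: a quoted "<" inside a tag fools the lookahead.
echo encode_quotes_lookahead('<img alt="a < b" src="x.png">'), "\n";
// <img alt=&#34;a < b" src="x.png">
```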

Buyurun

5:50 pm on Aug 17, 2004 (gmt 0)

10+ Year Member



Cheers for the help, guys. However, I ended up using preg_split to ignore the tags and replaced the quotes with preg_replace. Doesn't seem to choke on anything yet.

I'll definitely try out your methods as well at some point.
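
The actual code wasn't posted, so the following is only a guess at what a preg_split version could look like. The function name encode_quotes_split is hypothetical, it assumes PHP 7 or later, and it uses str_replace rather than preg_replace for the replacement since the search string is fixed.

```php
<?php
// Hypothetical reconstruction of a preg_split approach; not the poster's code.
function encode_quotes_split(string $text): string
{
    // Split on anything that looks like a tag. PREG_SPLIT_DELIM_CAPTURE keeps
    // the captured tags in the result, so text and tags alternate:
    // even indexes are text, odd indexes are tags.
    $parts = preg_split('/(<[^>]*>)/', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            // Only the text between tags gets its quotes encoded.
            $parts[$i] = str_replace('"', '&quot;', $part);
        }
    }

    return implode('', $parts);
}

echo encode_quotes_split('"Quote" <img src="a.png" alt="pic"> and "more"'), "\n";
// &quot;Quote&quot; <img src="a.png" alt="pic"> and &quot;more&quot;
```

The even/odd trick relies on not passing PREG_SPLIT_NO_EMPTY, because the empty strings between adjacent tags are what keep the alternation intact.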