Tricky RegExp problem

Trying to match the outermost of a set of nested braces

         

Simon Coggins

5:24 pm on Jun 30, 2003 (gmt 0)


Hi,

I'm having trouble building a regular expression. I'm trying to match the outermost pair in a set of nested braces and return its contents as a back reference.

For example, in the following lines:


caption{Some text}
caption{Some text {with braces} inside}
caption{text {with {nested} {braces} inside}}
caption{text} then more text {possibly in braces}

I'd like to match:


Some text
Some text {with braces} inside
text {with {nested} {braces} inside}
text

Initially I tried something simple:


sed 's/caption{\(.*\)}/\1/'

It's the last two that are causing problems. If I allow the expression to be greedy, the fourth one matches this:


text} then more text {possibly in braces

but if I don't, then the third one matches this:


text {with {nested

Equally, replacing .* with [^{]* or [^}]* doesn't do what I want either.
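To make the two failure modes concrete, here is a small sketch that reproduces them. It uses Python's re module rather than sed, purely as an assumption for illustration, because sed's regular expressions have no non-greedy quantifier; the names greedy, lazy and first_group are made up for this example.

```python
import re

# The four sample lines from the post above.
lines = [
    "caption{Some text}",
    "caption{Some text {with braces} inside}",
    "caption{text {with {nested} {braces} inside}}",
    "caption{text} then more text {possibly in braces}",
]

# Greedy: .* runs to the LAST closing brace on the line.
greedy = re.compile(r"caption\{(.*)\}")
# Non-greedy: .*? stops at the FIRST closing brace after "caption{".
lazy = re.compile(r"caption\{(.*?)\}")


def first_group(pattern, line):
    """Return the first capture group of the match, or None if no match."""
    m = pattern.search(line)
    return m.group(1) if m else None


greedy_results = [first_group(greedy, line) for line in lines]
lazy_results = [first_group(lazy, line) for line in lines]

for line, g, l in zip(lines, greedy_results, lazy_results):
    print(f"{line}\n  greedy: {g}\n  lazy:   {l}")
```

Neither pattern keeps track of how deeply the braces are nested, so each one gets some of the lines wrong.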

It seems like I need either some kind of recursive reg exp (is that even possible?), or a way of counting the {'s and allowing that many }'s.

Any help much appreciated!

Simon

bird

5:39 pm on Jun 30, 2003 (gmt 0)


You're right that you need recursion. Unfortunately, classic regular expressions like the ones sed uses can't do that, so you'll have to cook up a "real" parser. I have no idea whether that's practical in sed itself, though.
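As a rough idea of what such a parser could look like, here is a minimal sketch of the brace-counting approach suggested in the question. It is written in Python rather than sed, which is my own assumption, and the function name outer_braces and its keyword parameter are hypothetical. It also assumes the braces are never escaped or quoted.

```python
def outer_braces(line, keyword="caption"):
    """Return the text inside the outermost braces following `keyword`,
    or None if the keyword is absent or the braces never balance."""
    start = line.find(keyword + "{")
    if start == -1:
        return None
    begin = start + len(keyword) + 1  # index of the first char after "{"
    depth = 1                         # we are inside one open brace
    for i in range(begin, len(line)):
        ch = line[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:            # this brace closes the outermost one
                return line[begin:i]
    return None                       # ran out of input: unbalanced braces


samples = [
    "caption{Some text}",
    "caption{Some text {with braces} inside}",
    "caption{text {with {nested} {braces} inside}}",
    "caption{text} then more text {possibly in braces}",
]
for s in samples:
    print(outer_braces(s))
```

The counter goes up on every { and down on every }, and the scan stops at the first point where it returns to zero, so later braces on the same line are ignored.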