I'm trying to replace forum code (such as "[b]" and "[quote]") with the HTML equivalents. So far so good - my script works. But it can lead to invalid markup if forum code tags are nested. I cannot see a way round this.
My guess is that my expressions are too basic. I'm filtering the text and matching opening and closing tags. But what I need to do is look for opening tags and *only* replace them if they are followed directly by the identical closing tag to match.
Here's what I've got so far:
<style>i {color:red;}</style>
<?php
$text = "[b][i]A[/b][/i] [i]B[/i] b [b]C[/b] [b][i]D
d[/i][/b] E [/b] [i]F
[/i]G[/i]";
$pattern = "/\[b](.+)\[\/b]/Uis";
$pattern2 = "/\[i](.+)\[\/i]/Uis";
$text2 = preg_replace($pattern, "<b>\\1</b>", $text);
$text3 = preg_replace($pattern2, "<i>\\1</i>", $text2);
echo "<pre>
$text3
</pre>";
?>
This works, but the start of the output looks like this:
<b><i>A</b></i>
As you can see, the tags are misplaced. I tried filtering the text a third time to remove the problem, but it took out tags near the end of the text as well.
Is there a way to make a regular expression that uses something like 'non-greedy quantifiers' or 'lookahead assertions' to check for a closing tag ahead? I'm not sure exactly how to use those methods. I have looked at sample scripts but find them mind-bogglingly complex.
Hope someone can help.
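For what it's worth, here is a minimal sketch of the "only replace an opening tag when the identical closing tag follows" idea, using a backreference and a negative lookahead. The function name convertInnermostPairs is made up for illustration, and it only knows [b] and [i]. It converts a pair only when nothing but plain text (or already converted HTML) sits between the tags, then repeats until nothing changes, so correctly nested pairs are converted from the inside out and wrongly nested ones are left as plain text.

```php
<?php
// Hypothetical helper (not from the thread): converts [b]/[i] pairs from the
// inside out and leaves wrongly nested codes untouched.
function convertInnermostPairs(string $text): string
{
    // \[(b|i)\]                        opening code, captured as group 1
    // ((?:(?!\[\/?(?:b|i)\]).)*)       content that contains no other [b], [i], [/b] or [/i]
    // \[\/\1\]                         closing code identical to group 1 (backreference)
    $pattern = '/\[(b|i)\]((?:(?!\[\/?(?:b|i)\]).)*)\[\/\1\]/is';

    do {
        $before = $text;
        $result = preg_replace_callback($pattern, function (array $m): string {
            $tag = strtolower($m[1]);
            return "<$tag>{$m[2]}</$tag>";
        }, $text);
        $text = $result ?? $before; // keep the previous text if PCRE reports an error
    } while ($text !== $before);    // repeat so outer pairs match once inner ones are done

    return $text;
}
```

With this sketch, "[b][i]D[/i][/b]" should come out as nested HTML, while "[b][i]A[/b][/i]" stays exactly as typed instead of producing invalid markup.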
Did you come up with this code yourself? I always try to credit people who've helped my code, so I would like to add a credit for you. (And if you did the code, are you not willing to explain it a little? I can see what some of it is doing, but the "$1$4$3" is intriguing. Is that re-ordering the words?)
This should allow for longer tags.
Sorry, it's a hopeless bunch of chicken scratch. But yes, I did write it.
Yes, $1$4$3 reorders the tags. You can look up "backreferences" for an explanation of what's going on here.
Be aware, this line will fall apart if there are more than two incorrectly nested tags, e.g.:
[block][b][i]Stuff[/block][/b][/i]
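To make the backreference business a bit more concrete, here is a small sketch. It assumes a variant of the pattern that uses * rather than + inside the tag names, so that one-letter tags such as <b> and <i> can match; the variable names are only for illustration. It runs the swap on two wrongly nested tags, then on three.

```php
<?php
// Groups: 1 = an opening tag plus the text after it
//         2 = that tag's name (reused as \2 inside the pattern)
//         3 = the wrong closing tag that comes first
//         4 = the closing tag that belongs to group 2
// The replacement '$1$4$3' writes groups 3 and 4 back in swapped order.
$fix = '/(<(\w[^>]*)>[^<]+)(<\/\w[^>]*>)(<\/\2>)/';

// Two wrongly nested tags: the closing tags get swapped.
$two = '<b><i>A</b></i>';
$fixedTwo = preg_replace($fix, '$1$4$3', $two);

// Three wrongly nested tags: after "<i>Stuff</block>" the pattern expects
// "</i>" but finds "</b>", so nothing matches and the text is unchanged.
$three = '<block><b><i>Stuff</block></b></i>';
$fixedThree = preg_replace($fix, '$1$4$3', $three);

echo $fixedTwo, "\n", $fixedThree, "\n";
```

So a single swap is enough for two tags, but the three-tag case needs a different approach.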
<style>
i {color:red;}
b {color:blue;}
i b, b i {color:purple;}
blockquote {border:1px solid #ccc;}
</style>
<?php
$text = "[i][b]A[/i][/b] [b][i]B[/i][/b] b [b]C $9.99[/b] [b][i]D
d[/b][/i] E [/b] [i]F
[/i]G[/i] [q][b]This [b]is[/b] a [b][i]quote[/b][/i] here [i]like[/i] $ this.[/q]
Text
[q][i]hello[/q][/i]
More text";
$text = preg_replace("!" . '\x24' . "!", '\\$', $text); //replace dollars
$pattern = "/\[b](.+)\[\/b]/Uis";
$pattern2 = "/\[i](.+)\[\/i]/Uis";
$pattern3 = "/\[q](.+)\[\/q]/Uis";
$text2 = preg_replace($pattern, "<b>\\1</b>", $text);
$text3 = preg_replace($pattern2, "<i>\\1</i>", $text2);
$text4 = preg_replace($pattern3, "<blockquote>\\1</blockquote>", $text3);
//by timster - (URL snipped) - corrects wrongly nested tags
// * instead of + after \w, so one-letter tags such as <b> and <i> can match
$text5 = preg_replace('/(<(\w[^>]*)>[^<]+)(<\/\w[^>]*>)(<\/\2>)/', "$1$4$3", $text4);
echo "<pre>
$text5
</pre>";
?>
If you see "[smilestopper]" in the above code, it's the forum - how do you get round that? Replace the code with the line in the post before mine.
I'm thinking now I need to go through the script and build an array. Then replace only valid codes. Something like this:
What do you think?
Also, does anyone know of a way to move through a string of text in such a way as to compare characters as you go? I always filter text by grabbing each line and splitting it (using the file pointer to progress). Is there a way to 'jump' to each opening bracket in the text to speed it up?
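One possible shape for this, as a rough sketch only: use strpos() with an offset to jump from one opening bracket to the next, and keep a stack of the codes that are currently open. The function name convertForumCodes and the tag map are assumptions for illustration. A closing code is converted only when it matches the innermost open code; anything else stays in the output as plain text, so the HTML that comes out is always correctly nested.

```php
<?php
// Hypothetical helper (not from the thread): stack-based forum code converter.
function convertForumCodes(string $text): string
{
    $map = ['b' => 'b', 'i' => 'i', 'q' => 'blockquote']; // forum code => HTML tag
    $pattern = '/\G\[(\/?)(' . implode('|', array_keys($map)) . ')\]/i';
    $pieces = [];  // output fragments, joined at the end
    $stack = [];   // open codes: [name, index of the opening fragment in $pieces]
    $pos = 0;

    // strpos() with an offset jumps straight to the next "[".
    while (($open = strpos($text, '[', $pos)) !== false) {
        $pieces[] = substr($text, $pos, $open - $pos); // text before the bracket

        // \G anchors the match at $open, so only a code starting here counts.
        if (!preg_match($pattern, $text, $m, 0, $open)) {
            $pieces[] = '[';                 // not a known code: keep the bracket
            $pos = $open + 1;
            continue;
        }
        $pos = $open + strlen($m[0]);
        $name = strtolower($m[2]);

        if ($m[1] === '') {
            // Opening code: store it as text for now and remember its position.
            $stack[] = [$name, count($pieces)];
            $pieces[] = $m[0];
        } elseif ($stack && end($stack)[0] === $name) {
            // Closing code that matches the innermost open code: convert both.
            $index = array_pop($stack)[1];
            $pieces[$index] = '<' . $map[$name] . '>';
            $pieces[] = '</' . $map[$name] . '>';
        } else {
            $pieces[] = $m[0];               // stray or wrongly nested close: keep as text
        }
    }
    $pieces[] = substr($text, $pos);         // whatever follows the last bracket
    return implode('', $pieces);
}
```

With this sketch, "[b][i]A[/b][/i]" becomes "[b]<i>A[/b]</i>": the [i] pair is converted and the unmatched [b] codes remain visible as text. A stricter policy, such as dropping the stray codes, would be a small change in the else branch. The text itself is not HTML-escaped here, which a real forum would want.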