Exploding String

Preserving HTML tags

         

ukgimp

9:50 am on Jun 30, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have some pseudo code for a function that I need to write.

I need to be able to take a string consisting of many <p> tags, with the odd <h*> tag in there, and explode it on certain elements.

So the following

$string = "<p>paragraph one</p><p>paragraph two</p><h2>Heading 2</h2><p>paragraph three</p><p>paragraph four</p><p>paragraph five</p><p>paragraph six</p><p>paragraph seven</p><p>paragraph eight</p>";

would explode into the following array of strings, with the tags still intact:

$array[0] = "<p>paragraph one</p>";
$array[1] = "<p>paragraph two</p>";
$array[2] = "<h2>Heading 2</h2><p>paragraph three</p>";
$array[3] = "<p>paragraph four</p>";
$array[4] = "<p>paragraph five</p>";
$array[5] = "<p>paragraph six</p>";
$array[6] = "<p>paragraph seven</p>";
$array[7] = "<p>paragraph eight</p>";

So can you explode on <tags> and still preserve them?

$pieces = explode(" ", $string);

This is causing me grief.

Any suggestions greatly appreciated. Hell, I will even give a gmail account to anyone who helps me out :)

Cheers

dcrombie

10:21 am on Jun 30, 2004 (gmt 0)



You can keep the gmail account ;)

preg_match_all("/(<h.>.*<\/h.>)?<p>.*<\/p>/iU", $string, $matches); 
print_r($matches[0]);

The regexp can be cleaned up - using .'s is a bit lazy but it works.
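As a rough idea of what "cleaned up" might look like, here is a minimal sketch. The tightened pattern is my own assumption, not something tested against real markup: it uses h[1-6] instead of h. (so it will not pick up <hr> and the like), explicit lazy quantifiers instead of the U modifier, and the s modifier so that . also matches newlines. The shortened $string and the $pieces name are just for illustration.

```php
<?php
// Shortened sample input, same shape as the string in the question.
$string = "<p>paragraph one</p><p>paragraph two</p><h2>Heading 2</h2>"
        . "<p>paragraph three</p><p>paragraph four</p>";

// Assumed clean-up: an optional heading (h1 to h6) followed by one paragraph.
// .*? is lazy, so each element stops at its first closing tag.
$pattern = '/(?:<h[1-6]>.*?<\/h[1-6]>)?<p>.*?<\/p>/is';

preg_match_all($pattern, $string, $matches);
$pieces = $matches[0]; // full matches, tags intact

print_r($pieces);
```

It still assumes the tags carry no attributes and are not nested, so treat it as a starting point rather than a general HTML parser.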

ukgimp

10:39 am on Jun 30, 2004 (gmt 0)




dc

Can I be a cheeky gimp and ask for a little explanation of that one? What is going on, etc.?

Cheers and are you sure about the gmail thing :)

dcrombie

11:27 am on Jun 30, 2004 (gmt 0)



I can try...

preg_match_all("/(<h.>.*<\/h.>)?<p>.*<\/p>/iU", $string, $matches);

preg_match_all [php.net] puts every occurrence of the /regexp/ into $matches[0] and, for each of those matches, whatever the (sub-regexp) captured into $matches[1].

The /regexp/ consists of an (optional)? string matching "<h.>.*</h.>" followed by a string matching "<p>.*</p>".

The i modifier means the match is case-insensitive.

The U modifier means that the match is UNGREEDY - otherwise <p>.*</p> could match the entire $string, because a . matches any character and .* matches as many of them as it can.
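To see the difference, here is a small sketch that runs the same kind of pattern with and without U. The short $string and the $greedy variable are made up for the demonstration; the second pattern is the one from above.

```php
<?php
$string = "<p>one</p><h2>Heading</h2><p>two</p>";

// Without U, .* is greedy: a single match runs from the first <p>
// all the way to the last </p>.
preg_match_all("/<p>.*<\/p>/i", $string, $greedy);

// With U, every quantifier is lazy, so each piece stops at its first closing tag.
preg_match_all("/(<h.>.*<\/h.>)?<p>.*<\/p>/iU", $string, $matches);

print_r($greedy[0]);  // one element: the whole string
print_r($matches[0]); // full matches: "<p>one</p>" and "<h2>Heading</h2><p>two</p>"
print_r($matches[1]); // what the (sub-regexp) captured: "" and "<h2>Heading</h2>"
```

Where the optional heading is absent, the corresponding entry in $matches[1] is just an empty string.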

ukgimp

1:09 pm on Jul 2, 2004 (gmt 0)




Thanks for that DC

I would like to add to this. It still looks like a regex problem, but occasionally the string can start with an <h*> tag, and other headings can be placed elsewhere in it as well.

Again I have tried lots of variations to get all <p> and all <h*> tags into the array in the correct order.

I am presuming that this is something that can be done:

preg_match_all("/<p>.*<\/p>/iU", $string, $array);

will match all the p's

preg_match_all("/(<h.>.*<\/h.>)/iU", $string, $array);

will match all the <h*>'s. It is combining them in any order that is throwing me.

Cheers

dcrombie

1:19 pm on Jul 2, 2004 (gmt 0)



Just replace the "?" with "*" ;)

ukgimp

1:23 pm on Jul 2, 2004 (gmt 0)




You're not joking, are you? lol

preg_match_all("/(<h.>.*<\/h.>)*<p>.*<\/p>/iU", $string, $matches);

That's it :)
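For anyone finding this later, here is a quick sketch of that final pattern on a string that starts with a heading and has two headings in a row. The sample $string and the $pieces name are invented for the example.

```php
<?php
// Invented sample: starts with two consecutive headings.
$string = "<h1>Title</h1><h2>Subtitle</h2><p>one</p><p>two</p>"
        . "<h3>Section</h3><p>three</p>";

// Any number of headings, then exactly one paragraph, all ungreedy (U).
preg_match_all("/(<h.>.*<\/h.>)*<p>.*<\/p>/iU", $string, $matches);
$pieces = $matches[0];

print_r($pieces);
// $pieces[0] is "<h1>Title</h1><h2>Subtitle</h2><p>one</p>"
// $pieces[1] is "<p>two</p>"
// $pieces[2] is "<h3>Section</h3><p>three</p>"
```

One thing to watch: every match has to end in a <p>, so a heading with no paragraph after it (for example at the very end of the string) would be left out of the array.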