For example, if I have this in a string:-
<h2>Shining your Widget</h2>Text about shining widgets.
<h2>Don't forget to buy polish!</h2>
Go to the shop, buy some carnauba wax.
<h2>Now sell it on eBay</h2>
Add postage and packing and off you go.
I'd like returned an array containing three elements:-
array[0] = "Shining your Widget"
array[1] = "Don't forget to buy polish!"
array[2] = "Now sell it on eBay"
I keep coming back to strtok(), but as far as I can tell it only accepts single-character delimiters (each character in the delimiter string is treated as a separate delimiter), so I can't do something like:-
$delim = "<h2> </h2>";
$inbetween = strtok($theString, $delim);
Can I make strtok() work the way that I need or, if not, is there a simple way to grab all the text between <h2> tags into an array?
Thanks!
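To see why strtok() gets in the way here, it may help to run it on the first line of the sample string. As far as I know, strtok() treats every character of the delimiter string as its own delimiter, so "<h2> </h2>" effectively means "split on <, h, 2, >, space or /". A minimal sketch (the $tokens array is just a name picked for illustration):

```php
<?php
$theString = "<h2>Shining your Widget</h2>Text about shining widgets.";
$delim = "<h2> </h2>";

// Collect every token strtok() hands back for this delimiter string.
$tokens = array();
$tok = strtok($theString, $delim);
while ($tok !== false) {
    $tokens[] = $tok;
    // Later calls pass only the delimiter; strtok() remembers the string.
    $tok = strtok($delim);
}

print_r($tokens);
```

The first token comes back as "S" rather than "Shining your Widget", because the "h" in "Shining" is itself counted as a delimiter.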
[uk.php.net...]
Scroll down to the comments: one of them is basically a parser that grabs all values between two tags into an array.
Simply super :)
function getStrsBetween($s, $s1, $s2 = false, $offset = 0) {
    if ($s2 === false) { $s2 = $s1; }
    $result = array();
    $L1 = strlen($s1);
    $L2 = strlen($s2);
    $i = 0;
    if ($L1 == 0 || $L2 == 0) {
        return false;
    }
    do {
        $pos1 = strpos($s, $s1, $offset);
        if ($pos1 !== false) {
            $pos1 += $L1;
            $pos2 = strpos($s, $s2, $pos1);
            if ($pos2 !== false) {
                $key_len = $pos2 - $pos1;
                $this_key = substr($s, $pos1, $key_len);
                $result[$i] = $this_key;
                $i++;
                $offset = $pos2 + $L2;
            } else {
                $pos1 = false;
            }
        }
    } while ($pos1 !== false);
    return $result;
}
Usage, using my first post as an example - grab all strings between all <h2> tags and dump to an array:-
$myNewArray = getStrsBetween($myString, "<h2>", "</h2>");
$pattern = "#(?<=<h2>).+(?=</h2>)#i";
$matches = array();
preg_match_all($pattern,$string,$matches);
echo '<pre>';
print_r($matches);
echo '</pre>';
I messed around with some nifty assertions [uk2.php.net] here ;)
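One thing worth checking with the lookaround pattern is how the greedy ".+" behaves when two headings share a line. The sketch below uses made-up test strings, and the lazy ".+?" variant is my own assumption, not something from the post above:

```php
<?php
$pattern = "#(?<=<h2>).+(?=</h2>)#i";

// Headings on separate lines: "." does not cross a newline by default,
// so each match stays inside its own line.
$multiLine = "<h2>One</h2>text\n<h2>Two</h2>\nmore text";
preg_match_all($pattern, $multiLine, $m1);

// Two headings on one line: the greedy ".+" runs on to the last </h2>.
$oneLine = "<h2>One</h2>text<h2>Two</h2>";
preg_match_all($pattern, $oneLine, $m2);

// Assumed tweak (not in the original post): ".+?" stops at the first </h2>.
$lazy = "#(?<=<h2>).+?(?=</h2>)#i";
preg_match_all($lazy, $oneLine, $m3);

print_r($m1[0]);
print_r($m2[0]);
print_r($m3[0]);
```

With the headings on separate lines the original pattern finds "One" and "Two". On the single-line string it returns one long match spanning both headings, while the lazy version splits them again.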
function getStrsBetween($s, $s1, $s2 = false)
{
    if ($s2 === false) {
        $s2 = preg_replace('/</', '</', $s1);
    }
    $s1 = preg_quote($s1, '/');
    $s2 = preg_quote($s2, '/');
    preg_match_all("/$s1(.+)$s2/i", $s, $matches);
    return $matches[1];
}
print '<pre>';
print_r(array_map('htmlentities', getStrsBetween($string, '<h2>')));
print '</pre>';
exit;
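For anyone wondering what the two preparation steps in that function do, here is a rough sketch that runs them on their own. The sample input and the $pattern, $found and $m names are only there for illustration:

```php
<?php
$s1 = '<h2>';

// Derive the closing tag from the opening one by turning "<" into "</".
$s2 = preg_replace('/</', '</', $s1);

// preg_quote() escapes regex metacharacters. Passing '/' as the second
// argument also escapes the delimiter, so the "/" in "</h2>" cannot
// end the pattern early.
$pattern = '/' . preg_quote($s1, '/') . '(.+)' . preg_quote($s2, '/') . '/i';

// The "i" modifier lets the lowercase tags match uppercase ones too.
$found = preg_match($pattern, '<H2>Polish</H2>', $m);

print $s2 . "\n";
print_r($m);
```

Without the preg_quote() calls, the "/" in the closing tag would be read as the end of the pattern and PHP would complain about an unknown modifier.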
preg_match_all("/$s1(.+)$s2/i", $s, $matches);
The $matches array stores the text matched by the full pattern and by each parenthesized subpattern, in array indexes starting with zero. So ...
Array (
[0] Contains an array of the strings that matched the entire pattern
[1] Contains an array of the strings captured by the 1st set of parentheses
[2] Contains an array of the strings captured by the 2nd set of parentheses
...
)
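A small sketch of that layout, using a made-up string and a pattern with two sets of parentheses so that indexes 0, 1 and 2 are all filled (this assumes the default ordering of preg_match_all, with no extra flags passed):

```php
<?php
$html = "<h2>One</h2>text<h2>Two</h2>";

// Group 1 captures the tag name, group 2 the text between the tags.
preg_match_all('#<(h2)>(.+?)</h2>#i', $html, $matches);

print_r($matches[0]); // whole-pattern matches, tags included
print_r($matches[1]); // what the 1st set of parentheses captured
print_r($matches[2]); // what the 2nd set of parentheses captured
```

Here $matches[0] holds the full "<h2>...</h2>" strings, while $matches[2] holds just the heading text, which is why the function returns a subpattern index rather than index 0.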
If you want to see what I mean, dump the array out before returning $matches[1] in the function as follows:
function getStrsBetween($s, $s1, $s2 = false)
{
    if ($s2 === false) {
        $s2 = preg_replace('/</', '</', $s1);
    }
    $s1 = preg_quote($s1, '/');
    $s2 = preg_quote($s2, '/');
    preg_match_all("/$s1(.+)$s2/i", $s, $matches);
    // Debugging only: dump $matches and stop, so the return is never reached.
    print '<pre>';
    print_r($matches);
    print '</pre>';
    exit;
    return $matches[1];
}
Nope; I thought I tried what you have and didn't get the expected results. I was confused at those results, but I kept going until I had something that worked :)
I have to say, though, that I like your pattern better ;)