Preg match problem


FiRe

9:06 am on Jun 20, 2007 (gmt 0)

10+ Year Member



preg_match_all("/<url>(.*)<\/url>/", $data, $m);
print_r($m);

$data contains the following:

<url>
<loc>http://www.site.com/</loc>
<priority>0.5</priority>
<lastmod>2007-06-19T12:38:10+00:00</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>http://www.site.com/index.php?page=faq</loc>
<priority>0.5</priority>
<lastmod>2007-06-19T12:38:10+00:00</lastmod>
<changefreq>daily</changefreq>
</url>

The thing is, it is not matching: $m always prints out empty! I think it's something to do with \n and \r. Any ideas? Thanks!

whoisgregg

3:07 pm on Jun 20, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Two problems. First, the "." doesn't match newlines by default. Setting the "s" modifier makes it do so. However, if we do this:

preg_match_all("/<url>(.*)<\/url>/s", $data, $m);

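It won't know where to stop and will match the entire document from the first opening <url> to the last closing </url>. That's the second problem: quantifiers like "*" are "greedy" by default, meaning they match as much as possible. We can make one lazy, so that it matches as little as possible, by putting a question mark after it: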

preg_match_all("/<url>(.*?)<\/url>/s", $data, $m); 

Here's all the info about PHP pattern modifiers [php.net] for you to check out. :)
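
To see the difference between the three patterns side by side, here is a rough sketch. The $data string is a shortened stand-in for the sitemap in the question, and the variable names ($noFlag, $greedy, $lazy, $m1 to $m3) are made up for illustration. The last loop, which pulls the <loc> out of each block, is just one possible next step and not something the original post asked for.

```php
<?php
// Shortened stand-in for the sitemap data in the question (an assumption).
$data = "<url>\n<loc>http://www.site.com/</loc>\n<priority>0.5</priority>\n</url>\n"
      . "<url>\n<loc>http://www.site.com/index.php?page=faq</loc>\n<priority>0.5</priority>\n</url>\n";

// 1. No modifier: "." cannot cross a newline, so nothing matches.
$noFlag = preg_match_all("/<url>(.*)<\/url>/", $data, $m1);

// 2. "s" modifier, greedy: one big match from the first <url> to the last </url>.
$greedy = preg_match_all("/<url>(.*)<\/url>/s", $data, $m2);

// 3. "s" modifier, lazy: one match per <url> block.
$lazy = preg_match_all("/<url>(.*?)<\/url>/s", $data, $m3);

echo "no flag: $noFlag, greedy: $greedy, lazy: $lazy\n";

// Optional follow-up: extract the <loc> value from each captured block.
foreach ($m3[1] as $block) {
    if (preg_match("/<loc>(.*?)<\/loc>/", $block, $loc)) {
        echo $loc[1], "\n";
    }
}
```

With this sample data the match counts should come out as 0, 1 and 2. For anything beyond a quick extraction like this, an XML parser is usually the safer route, but the regex is fine for a simple, predictable sitemap.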