how to get this string from a line?

         

zozzen

4:11 pm on Apr 13, 2009 (gmt 0)

10+ Year Member



Hello,

I'm trying to get a string #*$!X from a line like this:
<a href="#*$!#*$!#*$!">

Can you show me how to do it?
Which PHP function should I take a look at?

Thanks a lot!

eelixduppy

4:25 pm on Apr 13, 2009 (gmt 0)



Something like this is usually achieved with regular expressions. Try out the following:


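// $string is assumed to hold the HTML you want to search.
$pattern = "/<a\s+href=['\"]([^'\"]+)['\"][^>]*>/i";
$matches = array();
if (preg_match_all($pattern, $string, $matches)) {
    // $matches[0] holds the full tags, $matches[1] the captured href values.
    echo '<pre>'; print_r($matches); echo '</pre>';
} else {
    echo 'Cannot find any matches in the string.';
}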

The pattern can actually get a bit more complex than that since, for example, one could put single quotes within double quotes and vice versa, but that should be a good place to start.
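
One way to cope with a quote character sitting inside the other kind of quote is to capture the opening quote and require the same character to close the value, using a backreference. This is only a minimal sketch: the $html sample and the $hrefs variable are made up for illustration, and it still assumes href is the first attribute and is quoted.

```php
<?php
// Hypothetical sample input: the second href has a single quote inside double quotes.
$html = '<a href="page.html">One</a> <a href="it\'s.html" class="x">Two</a> <a href=\'three.html\'>Three</a>';

// (["\']) captures the opening quote; \1 requires that same quote to close the value.
$pattern = '/<a\s+href=(["\'])(.*?)\1[^>]*>/i';

$hrefs = array();
if (preg_match_all($pattern, $html, $matches)) {
    // $matches[2] holds the second capture group, i.e. the href values.
    $hrefs = $matches[2];
}
print_r($hrefs);
```

Because the value is now the second group, the results are read from $matches[2] rather than $matches[1].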

For additional information regarding regular expressions check out the syntax: [us3.php.net...]

[edited by: eelixduppy at 4:48 pm (utc) on April 13, 2009]

eelixduppy

4:26 pm on Apr 13, 2009 (gmt 0)



By the way, you should view the source to see it correctly, otherwise the non-terminating anchor tags will show something funky. :)
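
If viewing the source is inconvenient, another option is to escape the dump before printing it, so the browser displays the tags as text. A small sketch, assuming a made-up $string and the pattern from the earlier post:

```php
<?php
// Hypothetical input for illustration.
$string = '<a href="somelink.html">link</a>';
$pattern = "/<a\s+href=['\"]([^'\"]+)['\"][^>]*>/i";

$matches = array();
preg_match_all($pattern, $string, $matches);

// print_r(..., true) returns the dump as a string instead of printing it.
// htmlspecialchars() converts < and > to entities, so the matched anchor
// tags are shown as text rather than rendered by the browser.
$output = '<pre>' . htmlspecialchars(print_r($matches, true)) . '</pre>';
echo $output;
```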

zozzen

7:10 pm on Apr 13, 2009 (gmt 0)

10+ Year Member



Thanks a lot, eelixduppy! Regex always drives me crazy. Thanks for the example and the link for further information!

rocknbil

11:51 pm on Apr 13, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member




$pattern = "/<a\s+href=['\"]([^'\"]+)['\"][^>]*>/i";

I would do this a little differently to account for unquoted and malformed code

< a href = somelink.html>

or inclusion of other attributes and their possible rearrangement.

<a class="something" href="somelink.html" title="oops">
<a title="oops" class="something" href='somelink.html'>

$pattern = "/<[^>]*href\s*=\s*['\"]*([^'\">]+)['\"]*[^>]*>/i";

< - starts with <

[^>]* - followed by zero or more characters other than >, so the match stays inside one tag

href - followed by the important part, "href"

\s*=\s* - followed by =, with zero or more spaces on either side

['\"]* - followed by zero or more quotes

( - start saving the match in $1 ($matches[1] in PHP)

[^'\">]+ - followed by any character that is NOT a ', ", or > (> is needed in this part if the link is unquoted)

) - end saving the match in $1

['\"]* - followed by zero or more quotes, which consumes the closing quote if there is one

[^>]* - followed by any other attributes that come after the href

> - end of pattern

Needs to be tested, but you can see the approach might be more forgiving of malformed or unexpected code.
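
A quick way to test the approach is to run it against the sample tags from this post. The sketch below uses a pattern with [^>]* at the start of the tag and again before the closing >, so that matching stays inside a single tag and attributes after href are allowed. The $samples and $results names are made up for illustration.

```php
<?php
// Sample tags taken from the post above.
$samples = array(
    '< a href = somelink.html>',
    '<a class="something" href="somelink.html" title="oops">',
    "<a title=\"oops\" class=\"something\" href='somelink.html'>",
);

// Assumption: [^>]* (rather than .*) keeps each match within one tag and
// lets other attributes appear before or after href.
$pattern = "/<[^>]*href\s*=\s*['\"]*([^'\">]+)['\"]*[^>]*>/i";

$results = array();
foreach ($samples as $tag) {
    // $matches[1] is the first capture group, i.e. the href value.
    $results[] = preg_match($pattern, $tag, $matches) ? $matches[1] : null;
}
print_r($results);
```

One limitation to keep in mind: an unquoted href followed by other attributes would capture everything up to the closing >, since a space does not end the captured value.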

zozzen

4:43 pm on Apr 15, 2009 (gmt 0)

10+ Year Member



Thanks rocknbil!