My preg_match_all calls are able to extract data from basic links like:
<a href="www.page.com" title="titulok">text1</a>
or <a href='www.page.com' title='titulok'>text1</a>
The problem is when the word "title" appears twice in a link.
I assume that it could be solved with a good regular expression in the preg_match_all function.
I read the pattern syntax manual at www.php.net, but I am still not able to create an expression which will extract from links like those in examples 1), 2) and 3).
Could anyone help me change the expressions:
'/<a.*.title=".*.a>'
and
'/<a.*.title=\'.*.a>'
to expressions which are able to extract from the links in the examples?
Thank you very much.
Attribute values can themselves contain a quote character, for example title="they'll play hockey" or title='She "likes" to cook'. So there is that issue with your examples above.
Now, assuming this isn't an issue and you are going to use JUST double quotes ("), the pattern to grab everything should be as follows:
$pattern = "/<a\s*href=\"([^\"]+)\"\s*title=\"([^\"]+)\"[^>]*>([^<]+)<\/a>/i";
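For reference, here is a minimal sketch of how that pattern could be dropped into preg_match_all. The sample markup in $html is made up for illustration, and PREG_SET_ORDER is just one convenient way to group the captures per link.

```php
<?php
// Sample input, invented for this example (double-quoted attributes only).
$html = '<a href="www.page.com" title="titulok">text1</a> '
      . '<a href="www.other.com" title="second title">text2</a>';

// The pattern exactly as given above.
$pattern = "/<a\s*href=\"([^\"]+)\"\s*title=\"([^\"]+)\"[^>]*>([^<]+)<\/a>/i";

// PREG_SET_ORDER returns one array per link: [full match, href, title, text].
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    echo "href={$m[1]} title={$m[2]} text={$m[3]}\n";
}
```

Note that this only matches when href comes first and title follows it directly, as in your basic examples.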
If you try that with double quotes you should find that it'll work. If you need it to work for any type of quote then we'll see what we can do with that later.
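In the meantime, one possible way to handle either quote type is to capture the opening quote and require the same character to close the value, using a backreference. This is only a sketch: extract_links() is a hypothetical helper name, the sample links are invented, and it still assumes href is immediately followed by title.

```php
<?php
// Hypothetical helper (not from this thread): returns href, title and text per link.
function extract_links(string $html): array
{
    // (["\']) captures the opening quote; \1 and \3 require the same closing quote.
    // (?:(?!\1).)* consumes characters up to, but not including, that quote,
    // so the other quote type may appear inside the value.
    $pattern = '/<a\s+href=(["\'])((?:(?!\1).)*)\1\s+title=(["\'])((?:(?!\3).)*)\3[^>]*>([^<]+)<\/a>/i';
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    $links = [];
    foreach ($matches as $m) {
        // Groups 1 and 3 are the quote characters; 2, 4 and 5 hold the data.
        $links[] = ['href' => $m[2], 'title' => $m[4], 'text' => $m[5]];
    }
    return $links;
}

// Made-up input mixing both quote styles.
$html = "<a href='www.page.com' title='She \"likes\" to cook'>text1</a>"
      . '<a href="www.page.com" title="they\'ll play hockey">text2</a>';
print_r(extract_links($html));
```

For anything much messier than this (attributes in arbitrary order, nested tags inside the link text), a regular expression gets fragile quickly and an HTML parser such as DOMDocument is probably the safer route.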
In any case, good luck and Welcome to WebmasterWorld! :)