janharders - 2:18 am on Feb 11, 2010 (gmt 0) [edited by: phranque at 6:53 am (utc) on Feb 11, 2010]
should be s/(<a )(.*)?(<\/a>)/$&/;
no. putting the ? after the whole group is legal syntax, but it doesn't do what you want, and in this case it wouldn't even make sense. .* means "any character, 0 or more times", so a ? outside the brackets means: match "any character 0 or more times" 0 or 1 times, as that's pretty much what ? means (bbba?bbb matches bbbbbb and bbbabbb, but not bbbcbbb). that adds nothing, because .* can already match nothing at all. if the ? is put inside the brackets, right after the *, it makes the unlimited * ungreedy, basically saying "match any character 0 or more times, but as few times as possible". that is necessary here, because a greedy .* would happily match past the first closing </a> and run on to the last </a> on the line.
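to make the greedy vs. ungreedy difference concrete, here is a minimal sketch. the sample string and the variable names are made up for illustration, they are not from the original question:

```perl
use strict;
use warnings;

# hypothetical input with two links on one line
my $html = '<a href="/x">one</a> and <a href="/y">two</a>';

# greedy: .* grabs as much as it can, so it runs to the LAST </a>
my ($greedy) = $html =~ /<a (.*)<\/a>/;

# ungreedy: .*? grabs as little as it can, so it stops at the FIRST </a>
my ($lazy) = $html =~ /<a (.*?)<\/a>/;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";

# the plain ? quantifier: the preceding item is optional (0 or 1 times)
my @optional = map { /^bbba?bbb$/ ? 1 : 0 } qw(bbbbbb bbbabbb bbbcbbb);
print "optional a: @optional\n";
```

the greedy capture swallows both links in one match, while the ungreedy one ends at the first closing tag.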
in general: unless you're hacking stuff together for a quick fix, it's usually not the best idea to operate on markup languages like html or xml with regexps. not only is it hard to match exactly what you want, it's also very awkward to express structural patterns that are easy to define with something like HTML::TreeBuilder [search.cpan.org]. its look_down method lets you simply look for all a-nodes that contain a b-node and an img-node which, itself, has a src matching a certain url pattern.
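a rough sketch of that look_down idea, assuming HTML::TreeBuilder is installed from CPAN. the markup, the example.com url pattern and the variable names are invented for illustration:

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# hypothetical markup: only the first link has both a <b> and a matching <img>
my $html = <<'HTML';
<p>
<a href="/one"><b>first</b><img src="http://example.com/img/1.png"></a>
<a href="/two"><img src="http://example.com/img/2.png"></a>
<a href="/three"><b>third</b><img src="http://other.example.org/3.png"></a>
</p>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

# all a-nodes that contain a b-node and an img-node whose src matches the pattern
my @links = $tree->look_down(
    _tag => 'a',
    sub {
        my $link = shift;
        # in scalar context look_down returns the first match or undef
        my $bold = $link->look_down(_tag => 'b');
        my $img  = $link->look_down(
            _tag => 'img',
            src  => qr{^http://example\.com/img/},
        );
        return $bold && $img;
    },
);

my @hrefs = map { $_->attr('href') } @links;
print "$_\n" for @hrefs;

$tree->delete;    # free the tree when done
```

the criteria are checked against the parsed tree, so attribute order, extra whitespace or nested tags inside the link don't break the match the way they would with a regexp.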
regexps on html are able to fix easy problems, but are generally a bad idea, because they tend to break stuff 5 months from now when nobody remembers they're in effect.