I have a variable ($sch_body_file) in my Perl script which holds the contents of an HTML page. All links in this HTML page have a double quote (") at the end of the URL. For example:
<A href=http://www.cokama.com/" target=_blank>a link to cokama</A>
Basically, I need to strip the trailing double quote from every string that begins with href=http://
The resulting code should look like:
<A href=http://www.cokama.com/ target=_blank>a link to cokama</A>
Bear in mind that there could be multiple links in the HTML page. Unfortunately I'm not that good at regular expressions, so if anybody can help, I would appreciate it.
Cormac.
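($without_quotes = $sch_body_file) =~ s{(href=http://[^"]+?)"}{$1}g;

This RE matches 'href=http://' followed by one or more characters that are not '"', followed by a '"'. It replaces the whole match with the captured part ($1), that is, everything up to but not including the '"'. The g modifier repeats this for every link in the page. The result ends up in $without_quotes; $sch_body_file itself is left unchanged.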
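For anyone who wants to try this out, here is a minimal sketch that runs the same substitution on a small sample. The second link (www.example.com) is made up for illustration, and the sample assumes that every href value ends in a stray '"', as described in the question.

```perl
use strict;
use warnings;

# Sample page contents. The second link is a hypothetical example,
# added only to show that the /g modifier handles multiple links.
my $sch_body_file = <<'HTML';
<A href=http://www.cokama.com/" target=_blank>a link to cokama</A>
<A href=http://www.example.com/page" target=_blank>another link</A>
HTML

# Copy the page into $without_quotes, then substitute on the copy,
# so the original variable is left untouched.
(my $without_quotes = $sch_body_file) =~ s{(href=http://[^"]+?)"}{$1}g;

print $without_quotes;
```

One thing to watch: if a link does not have the stray quote, the pattern keeps scanning until it finds the next '"' anywhere later in the page and removes that one instead, so this is only safe when every href really has the trailing quote.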
Hope this helps.
Andreas
Here's a little bit on chop, from "perldoc -f chop":
chop Chops off the last character of a string and
returns the character chopped. It is much more
efficient than "s/.$//s" because it neither scans
nor copies the string. If VARIABLE is omitted,
chops $_. If VARIABLE is a hash, it chops the
hash's values, but not its keys.
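As a rough illustration of the description above, here is a small sketch of chop on a single string. It assumes the URL has already been pulled out into its own variable ($url is a hypothetical name), because chop only ever removes the final character of a string and so cannot fix quotes in the middle of a whole page.

```perl
use strict;
use warnings;

# A single URL with the stray trailing quote (hypothetical variable).
my $url = 'http://www.cokama.com/"';

# chop removes the last character and returns it.
my $removed = chop $url;

print "$url\n";               # prints: http://www.cokama.com/
print "removed: $removed\n";  # prints: removed: "
```

Note that chop removes the last character whatever it is, so it is only safe when the string is known to end with the quote.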