I'm trying to extract the text between an <H1> and </H1> tag and change it to sentence case. I can open the files and change the text once it's out, but I can't get the text out cleanly.
My problems occur when the <H1> tag has attributes, e.g. <H1 align="center">. I can't find a way of matching all the possibilities and then working out where the text I want starts.
Thanks in advance for your help
use strict;
use warnings;
use Carp;
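# Capitalise the first letter of every word and lower-case the rest.
sub titlecase {
    my ($text) = @_;
    return join ' ', map { ucfirst $_ } split ' ', lc $text;
}
my $old_h1 = qq(<h1 class="This is a class" align="center">THIS IS THE HEADER</h1>);
# <h1\b[^>]*> matches the opening tag with or without attributes.
my ($head_start, $head_text, $head_finish) =
    $old_h1 =~ m{(<h1\b[^>]*>)(.*?)(</h1>)}is
    or croak "No <h1> element found";
# Parenthesise the call so only the heading text is passed to titlecase.
my $new_h1 = $head_start . titlecase($head_text) . $head_finish;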
print "Old: $old_h1\n";
print "New: $new_h1\n";
Output:
Old: <h1 class="This is a class" align="center">THIS IS THE HEADER</h1>
New: <h1 class="This is a class" align="center">This Is The Header</h1>
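
The question asks for sentence case rather than title case, so here is a minimal sketch of that variant. It also rewrites every heading in a string in one pass with s///. The names sentence_case and fix_headings are made up for illustration, and the pattern assumes no attribute value contains a literal ">". For arbitrary real-world HTML, a proper parser such as HTML::Parser is likely a safer choice than a regex.

```perl
use strict;
use warnings;

# Hypothetical helper: lower-case the text, then capitalise its first character.
sub sentence_case {
    my ($text) = @_;
    return ucfirst lc $text;
}

# Hypothetical helper: rewrite the text of every <h1>...</h1> in a string.
# /i matches <H1> or <h1>, /s lets the text span lines, /g handles every
# heading, and /e evaluates the replacement as Perl code.
sub fix_headings {
    my ($html) = @_;
    $html =~ s{(<h1\b[^>]*>)(.*?)(</h1>)}{$1 . sentence_case($2) . $3}gise;
    return $html;
}

print fix_headings(qq(<H1 align="center">THIS IS THE HEADER</H1>\n<h1>ANOTHER ONE</h1>\n));
```

The opening and closing tags are captured and put back untouched, so their case and attributes are preserved. Only the text between them is changed.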