There are some issues with this code, though. The part that selects the HTML-like attributes, i.e. the first (.*?) block, doesn't work on all versions of Perl. If I split it into two matches, first without that block and then one more with it, it works, but it's about twice as slow. Even when it isn't split, it performs miserably on some boxes. Any ideas about optimizing this regex?
my $re = qr/$this->{mask_start}\Q$key\E(.*?)$this->{mask_end}(.*?)$this->{mask_start_close}\Q$key\E$this->{mask_end_close}/is;
while ($this->{template} =~ /$re/) {
    my $params = $this->mask_block_params($1, $2);
    my $html = &$callback($key, $params);
    $this->{template} =~ s/$re/$html/is;
}
Changing the s/$re/$html/; to the same regex w/o grouping didn't make much of a difference.
However, my eye was caught by the substitution in the last line. My impression is that you find what you want to change, figure out what the new stuff should be, and then replace what was just found with the new stuff. If that is true, then instead of:
$this->{template} =~ s/$re/$html/is;
you could use:
$this->{template} = $` . $html . $';
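$` is the part of the string before the match and $' is the part after; $& is the matched text itself. Using these is expensive: on perls before 5.20, mentioning any of them anywhere in the program makes every successful match copy its target string. This pattern already pays that cost because of its capturing parentheses ($1 and $2), but other matches elsewhere in the program would start paying it too.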
This might help, easy enough to try.
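If the cost of $` and $' is a concern, the same splice can be done with the @- and @+ offset arrays and four-argument substr. Here is a minimal sketch of that variant; the comment-style delimiters, the sample template, and the expand_blocks name are made-up stand-ins for $this->{mask_start} and friends, not the original code.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the object fields and callback in the question.
my $key      = 'menu';
my $template = 'a <!--menu x="1"-->body<!--/menu--> b <!--menu y="2"-->more<!--/menu--> c';
my $re       = qr/<!--\Q$key\E(.*?)-->(.*?)<!--\/\Q$key\E-->/is;

sub expand_blocks {
    my ($text, $re, $callback) = @_;
    while ($text =~ $re) {
        # Copy offsets and captures first: the callback may run its own
        # regexes and overwrite @-, @+, $1 and $2.
        my ($start, $end)  = ($-[0], $+[0]);
        my ($attrs, $body) = ($1, $2);
        my $html = $callback->($attrs, $body);
        # Replace the matched span in place; the regex runs once per block.
        substr($text, $start, $end - $start, $html);
    }
    return $text;
}

my $out = expand_blocks($template, $re, sub {
    my ($attrs, $body) = @_;
    return "[$attrs|$body]";
});
print "$out\n";
```

Like the original loop, this rescans from the start of the string after every replacement, so it will loop forever if the callback returns text that matches the pattern again.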
but it's already coded like that and not an easy thing to change.
Not knowing your application, it looks like you can swap out your code:
[perl]
while ($this->{template} =~ /$re/) {
    my $params = $this->mask_block_params($1, $2);
    my $html = &$callback($key, $params);
    $this->{template} =~ s/$re/$html/is;
}
[/perl]
With something like
[perl]
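# MyTokenizer is a hypothetical class with an HTML::TokeParser-style interface
my $p = MyTokenizer->new($this->{template});
while (my $t = $p->get_token) {
    if ($t->[0] eq 'S' and $t->[1] eq $key) {
        # parse the special token and append the parsed version;
        # render_special() is a placeholder name for that step
        $this->{html} .= $this->render_special($t);
    } else {
        # everything else passes through unchanged
        $this->{html} .= $t->as_html;
    }
}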
[/perl]
Come to think of it, you'd want to write your own tokenizer since HTML::TokeParser breaks up each token into components, and you only need to do it for special tokens.
I'm thinking of a simple scanner: you read char by char until you reach the next special character (< or >), depending on whether you're currently inside a tag.
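A rough sketch of such a scanner, just to show the shape. The tokenize name and the 'T'/'X' token types are my own invention, and index() stands in for the char-by-char loop since it does the same scan in C.

```perl
use strict;
use warnings;

# Split a string into [type, text] pairs: 'T' for a tag, 'X' for plain text.
sub tokenize {
    my ($src) = @_;
    my @tokens;
    my $pos = 0;
    my $len = length $src;
    while ($pos < $len) {
        if (substr($src, $pos, 1) eq '<') {
            # Inside a tag: scan forward to the closing '>'.
            my $end = index($src, '>', $pos);
            $end = $len - 1 if $end < 0;    # unterminated tag: take the rest
            push @tokens, ['T', substr($src, $pos, $end - $pos + 1)];
            $pos = $end + 1;
        } else {
            # Outside a tag: scan forward to the next '<'.
            my $next = index($src, '<', $pos);
            $next = $len if $next < 0;      # no more tags: take the rest
            push @tokens, ['X', substr($src, $pos, $next - $pos)];
            $pos = $next;
        }
    }
    return @tokens;
}

my @tokens = tokenize('hi <b>there</b>!');
print join(' | ', map { "$_->[0]:$_->[1]" } @tokens), "\n";
```

This ignores '>' inside quoted attribute values and inside comments, so it would only do for templates where those don't occur. Only the tokens whose tag name matches the key would then need to be parsed further.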
Sean