Perl regexp mystery

m{}isg works, s{}{}isg doesn't

         

dingman

7:13 pm on Oct 2, 2002 (gmt 0)


I'm trying to remove all occurrences of a particular set of strings from an input text. When I just match on my regexp, it works. If I change that match to a substitution, I get a warning about using an uninitialized value in a concatenation on that line. This problem only occurs if I use the substitution as the condition in an 'if' statement, i.e.:
if ($bib =~ m{(,\s*?eds{0,1}(\s|\.|<))}isg)
{
print STDERR "$1";
}

prints every occurrence of the strings I want to kill, but

if ($bib =~ s{(,\s*?eds{0,1}(\s|\.|<))}{}isg)
{
print STDERR "$1";
}

gives me a warning about using an uninitialized variable in a concatenation on the line with the regexp. (NOT the line with the print statement. I've even tried deleting the print statement to be sure.)

and

$bib =~ s{(,\s*?eds{0,1}(\s|\.|<))}{}isg;

works just fine, but doesn't let me branch based on whether there was a match or not.

What am I missing to make sense of this?
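
For readers who only need the branch: in scalar context s///g returns the number of substitutions it made (a false value when there were none), so the result can be stored and tested separately from the substitution itself. The following is a minimal sketch, not a diagnosis of the warning above. The sample value of $bib is hypothetical, since the real input format is not shown in the thread, and the inner group is made non-capturing only because its contents are not used.

```perl
use strict;
use warnings;

# Hypothetical sample input; the real $bib is not shown in the thread.
my $bib = "Smith, J., ed. First book. Jones, A. and Lee, B., eds. Second book.";

# s///g returns the number of substitutions made, or a false value if none.
my $removed = ($bib =~ s{,\s*?eds{0,1}(?:\s|\.|<)}{}isg);

if ($removed) {
    print STDERR "removed $removed occurrence(s)\n";
}
else {
    print STDERR "nothing matched\n";
}

print "$bib\n";
```

Keeping the count in a variable also makes it easy to log how many strings were stripped without touching $1 at all.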

Damian

12:57 pm on Oct 4, 2002 (gmt 0)


> branch based on whether there was a match or not

Hope I understood the problem properly. I don't know why you get that error message; I never tried that construction. Frankly I don't understand why you want to use the substitution function if you replace the match by nothing anyway.

I suggest: if there is a match, print the match; otherwise print the original $bib.

if ($bib =~ m{(,\s*?eds{0,1}(\s|\.|<))}isg)
{ print STDERR "$1"; }
else { print STDERR "$bib"; }

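Alternatively, maybe you could write something like

$bib =~ s{(,\s*?eds{0,1}(\s|\.|<))}{}isg;
if ($1) { print "$1"; }
else { print "$bib"; }

# No "undef $1;" here: $1 is read-only, so trying to undef it fails with
# "Modification of a read-only value attempted".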

dingman

7:25 pm on Oct 4, 2002 (gmt 0)


> Frankly I don't understand why you want to use the substitution function if you replace the match by nothing anyway.

$bib comes in in one format, and needs to leave my script looking rather different. I'm not ready to print $bib, and my only use for the text that matches my expression is that I want it to go away, while keeping everything before it and after it. s{expression_i_don't_want}{}isg seems nicer than

while ($bib =~ m{(.*?)expression_i_don't_want(.*)}is)
{
$bib = $1 . $2;
}

In this case, the only reason I want to branch on it is so that I can print diagnostics while I'm working on the script. Since not trying to branch on it works just fine, for this script, I don't really *need* an answer, but I want to know why it happens the way it does.
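
One way to get diagnostics without branching on the substitution at all is the /e modifier, which evaluates the replacement as Perl code for each match. The sketch below records every removed string and then returns an empty replacement. The sample $bib and the @removed array are hypothetical names introduced for illustration; this is one possible approach, not necessarily what the original script ended up doing.

```perl
use strict;
use warnings;

# Hypothetical sample input; the real $bib is not shown in the thread.
my $bib = "Smith, J., ed. First book. Jones, A. and Lee, B., eds. Second book.";
my @removed;

# With /e the replacement is run as code once per match, while $1 still
# refers to that match. The last expression ("") is the replacement text.
$bib =~ s{(,\s*?eds{0,1}(?:\s|\.|<))}{ push @removed, $1; "" }isge;

print STDERR "removed: '$_'\n" for @removed;
print "$bib\n";
```

Because the matched text is captured at substitution time, every occurrence is reported, not just the last one.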

amoore

7:47 pm on Oct 4, 2002 (gmt 0)


Are you sure that the s///g construction sets $1 and allows you to use it after the substitution? I believe you can only use it on the right hand side of the s///g and then it goes out of scope.
When you try to "print STDERR $1" since $1 is undef, you get that error.
Does that make sense?

dingman

8:34 pm on Oct 4, 2002 (gmt 0)


It would make a lot more sense if I didn't have the same problem even when I don't make any reference to $1 in the block executed by the if statement. I've tried 'print STDERR "found one\n"' as the only contents of the block, and it still complains about an uninitialized value in a concatenation or interpolated string on the line with the pattern.

andreasfriedrich

6:31 pm on Oct 5, 2002 (gmt 0)


> I believe you can only use it on the right hand side of the s///g and then it goes out of scope.

Wrong.

"The numbered variables ($1, $2, $3, etc.) and the related punctuation set ($+, $&, $`, and $') are all dynamically scoped until the end of the enclosing block or until the next successful match, whichever comes first."

perlre - Perl regular expressions [perldoc.com]

Andreas
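
A small sketch of what that scoping rule means in practice, using made-up strings ($outer, $inner, and the sample $bib are hypothetical): a match inside a nested block changes $1 only until that block ends, and a successful s/// leaves $1 usable in the code that follows it.

```perl
use strict;
use warnings;

my @seen;
my $outer = "outer text";
my $inner = "inner text";

{
    push @seen, $1 if $outer =~ /(outer)/;      # $1 is 'outer'
    {
        push @seen, $1 if $inner =~ /(inner)/;  # $1 is 'inner' inside this block
    }
    push @seen, $1;                             # block ended: $1 is 'outer' again
}

# After a successful s///, $1 holds the text captured by that match.
my $bib = "Smith, J., ed. A book.";
if ($bib =~ s{(,\s*?eds{0,1}(?:\s|\.|<))}{}is) {
    push @seen, $1;                             # ', ed.'
}

print STDERR join(" | ", @seen), "\n";
```

This only illustrates the quoted documentation; it does not by itself explain the warning reported earlier in the thread.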