Regular Expression problem

         

DrDoc

7:27 pm on Jun 17, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I need to replace commas in a string, but only commas that are not inside a double-quoted string. For example, consider this string:

foo,bar,baz,"foo, bar, baz",blah,"widgets, or so"

How can I replace all commas (for the sake of this example, with #) to generate this string:

foo#bar#baz#"foo, bar, baz"#blah#"widgets, or so"

DrDoc

7:48 pm on Jun 17, 2004 (gmt 0)




Currently I do this:

$myString =~ s/,/#/ig;
$myString =~ s/((?<=\")[^\"]+)#([^\"]+(?=\"))/$1,$2/ig;

That works great, except for the fact that the second regexp only restores the last occurrence between the quotes. So, for the string above, I have to run it twice. If there are more commas, I have to run it more times...

Anything I can do to get away from that?
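
To make the behavior described above concrete, here is a minimal sketch. Because `[^"]+` is greedy, each pass of the second substitution restores only the last `#` in a quoted section, so one way to avoid re-running it by hand is to loop until it stops matching. The helper name `swap_unquoted_commas` and the pass counter are assumptions added for illustration; they are not part of the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: wraps the two substitutions from the post above.
sub swap_unquoted_commas {
    my ($str) = @_;
    $str =~ s/,/#/g;    # step 1: turn every comma into '#'
    my $passes = 0;
    # step 2: each pass restores at most one '#' per quoted section,
    # so repeat until the substitution no longer matches
    $passes++ while $str =~ s/((?<=")[^"]+)#([^"]+(?="))/$1,$2/g;
    return ($str, $passes);
}

my ($out, $passes) = swap_unquoted_commas('foo,bar,baz,"foo, bar, baz",blah,"widgets, or so"');
print "$out ($passes passes)\n";
```

This still rescans the string once per comma inside quotes, which would fit the slowness on large inputs. It can also misfire on text that sits between two quoted fields (for example `"a",b,c,"d"`), since that text is also surrounded by quote characters.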

DrDoc

5:36 am on Jun 18, 2004 (gmt 0)




...and it currently takes forever to run on a larger set of data. Maybe the lookbehind needs to be improved.

nalin

6:04 am on Jun 18, 2004 (gmt 0)

10+ Year Member



I'll take a stab at it...

(
(([^"]*)(,+)([^"]*)*)
"([^"]*)"
(([^"]*)(,+)([^"]*)*)
)

where (([^"]*) is greedy and "([^"]*)" is not...

I think... damn, I am a geek. This is ugly, I'm out...

wmwlurker

7:41 am on Jun 18, 2004 (gmt 0)




possibly slower; perhaps more obtuse:

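my $str = qq@ a, b, "c, d, e", f@;

# protect first: inside each quoted section, swap commas for a placeholder (\a, the bell character)
$str =~ s@(".*?")@ join("\a", split(',', $1) ) @eg;

# every comma still left is outside quotes
$str =~ s@,@#@g;

# serve later: put the protected commas back
$str =~ s@\a@,@g;

print $str, "\n";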

-gleeco

timster

1:25 pm on Jun 18, 2004 (gmt 0)




This'll do 100,000 (identical) lines in 17 seconds on my PowerBook:


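$_ = 'foo,bar,baz,"foo, bar, baz",blah,"widgets, or so"';

$result = '';
# test the length, so a remaining string of "0" does not end the loop early
while (length $_) {
    # strip one field (quoted or unquoted) plus the comma after it, if any
    s/^((?:"[^"]*")|(?:[^,]*))(,?)//;
    $result .= $1 . ($2 ? '#' : '');
}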
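
The same idea (consume a whole quoted section before looking at any comma) can also be sketched as a single global substitution, without the explicit loop. This is an alternative formulation, not the code posted above. The helper name `replace_unquoted_commas` is hypothetical, and the sketch assumes quotes are balanced and never escaped inside a field.

```perl
use strict;
use warnings;

# Hypothetical helper name, not from the thread.
sub replace_unquoted_commas {
    my ($str, $replacement) = @_;
    # Alternative 1 captures a whole quoted section and puts it back unchanged;
    # alternative 2 matches a bare comma, which must therefore lie outside quotes.
    $str =~ s/("[^"]*")|,/defined $1 ? $1 : $replacement/ge;
    return $str;
}

print replace_unquoted_commas('foo,bar,baz,"foo, bar, baz",blah,"widgets, or so"', '#'), "\n";
```

Because the quoted alternative is tried first at each position, a comma inside quotes is never reached by the second alternative, and the string is scanned only once.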

DrDoc

3:42 pm on Jun 18, 2004 (gmt 0)




timster, that worked beautifully!
Thanks a ton :)