regex not working

Forum Moderators: coopster

techtheatre

4:12 am on Feb 4, 2010 (gmt 0)

I am perpetually foiled in my attempts to use regex. This time I have a "template" file that I have read in as a string. I am trying to search and replace all instances in the string that match a pattern: "cell_cccDD" where "c" is a character and "D" is a digit. Here is what I have:

$ReportTemplate = preg_replace('/^cell_[a-z][a-z][a-z]\d\d$/', '0.00', $ReportTemplate);

Some samples I am trying to replace with '0.00' are:
cell_jan14
cell_feb85
cell_apr01
cell_dec33

And here is a snippet of the actual HTML that is contained in the string:
<td align="right" width="75">cell_mar02</td>
<td align="right" width="75">cell_apr02</td>
<td align="right" width="75">cell_may02</td>

My code above is not finding anything. THANKS!

chrisranjana

4:30 am on Feb 4, 2010 (gmt 0)

$str = '<td align="right" width="75">cell_mar02</td>';
$str .= '<td align="right" width="75">cell_apr02</td>';
$str .= '<td align="right" width="75">cell_may02</td>';
$str = preg_replace('/cell_[a-zA-Z0-9]+/is', '0.00', $str);
print $str;

The above should get you started in the right direction.

techtheatre

4:11 pm on Feb 4, 2010 (gmt 0)

Thanks chrisranjana. Your posting was a bit too general (it found some cells that were labeled using other naming conventions that I do not want replaced), but it showed me the problem. I should not have specified the beginning (^) and end ($) of the string. I modified it to the following and it works perfectly. Thanks!

$ReportTemplate = preg_replace('/cell_[a-z][a-z][a-z]\d\d/', '0.00', $ReportTemplate);
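
For anyone who wants to check the difference between the broad and the narrow pattern, here is a minimal sketch. Only cell_mar02 comes from the original post; cell_total and cell_q1 are made-up stand-ins for the "other naming conventions" mentioned above, not names from the real template.

```php
<?php
// First row is from the original post; the other two are hypothetical
// cells that use different naming conventions.
$ReportTemplate = '<td align="right" width="75">cell_mar02</td>'
    . '<td align="right" width="75">cell_total</td>'
    . '<td align="right" width="75">cell_q1</td>';

// Broad pattern from the earlier reply: replaces every cell_* name.
$broad = preg_replace('/cell_[a-zA-Z0-9]+/is', '0.00', $ReportTemplate);

// Narrow pattern: exactly three lowercase letters, then two digits.
$narrow = preg_replace('/cell_[a-z][a-z][a-z]\d\d/', '0.00', $ReportTemplate);

echo $broad, "\n";   // all three cells become 0.00
echo $narrow, "\n";  // only cell_mar02 becomes 0.00
```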

rocknbil

7:19 pm on Feb 4, 2010 (gmt 0)

That's still a bit convoluted.

I am trying to search and replace all instances in the string that match a pattern: "cell_cccDD" where "c" is a character and "D" is a digit

When you do this

[bla]

you are creating a character class. In the above example, it will match on only the lower case characters b, l, a. Which makes this,

'/cell_[a-zA-Z0-9]+/is'

slightly redundant. There's no need for A-Z if you use the i modifier (case insensitive).
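
A quick way to see both points, as a sketch with made-up test strings (the variable names are only for illustration):

```php
<?php
// A character class matches ONE character at a time: b, l, or a.
$classDemo = preg_replace('/[bla]/', '*', 'label');
echo $classDemo, "\n"; // ***e*

$input = 'CELL_Mar02 cell_apr02';

// Without the i modifier, [a-z0-9] only matches the lowercase name.
$caseSensitive = preg_replace('/cell_[a-z0-9]+/', '0.00', $input);
echo $caseSensitive, "\n"; // CELL_Mar02 0.00

// With the i modifier, a-z already covers A-Z, so both names match.
$caseInsensitive = preg_replace('/cell_[a-z0-9]+/i', '0.00', $input);
echo $caseInsensitive, "\n"; // 0.00 0.00
```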

Another error (which you corrected :) ): ^ denotes the beginning of the string, and $ at the end of a pattern denotes the end of the string. Your cell names sit at neither, so the anchored pattern could never match. Exception: when ^ is the first character in a class, as in [^bla], it means any character other than these.
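
As a rough illustration of the anchor problem, using one row of the HTML from the first post (the variable names are mine):

```php
<?php
$html = '<td align="right" width="75">cell_mar02</td>';

// Anchored: the WHOLE string would have to be exactly "cell_mar02".
$anchoredOnHtml = preg_match('/^cell_[a-z][a-z][a-z]\d\d$/', $html);
$anchoredOnName = preg_match('/^cell_[a-z][a-z][a-z]\d\d$/', 'cell_mar02');

// Unanchored: the name may appear anywhere inside the string.
$unanchored = preg_match('/cell_[a-z][a-z][a-z]\d\d/', $html);

// Negated class: ^ as the first character inside [...] means
// "any character except these".
$negated = preg_replace('/[^bla]/', '-', 'label');

var_dump($anchoredOnHtml); // int(0)
var_dump($anchoredOnName); // int(1)
var_dump($unanchored);     // int(1)
echo $negated, "\n";       // lab-l
```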

Returning to the original quote, when you describe it like you did, you just have to follow through your description:

"cell_cccDD" where "c" is a character and "D" is a digit

$regex = 'cell\_[a-z]+\d+';
$replace = '0.00';
$ReportTemplate = preg_replace("/$regex/i", $replace, $ReportTemplate);

cell = literal text "cell"

\_ = an escaped underscore. The underscore has no special meaning in PCRE, so the pattern works the same without the backslash; escaping a non-alphanumeric character is harmless if you want to be sure.

[a-z]+ = character class of range a-z, + means "one or more of these"

\d+ = one or more digits.

This one is a little broad. Let's say you expect a specific number of characters following the underscore, and a specific number of digits after that. Using your examples, let's say it's 3 and 2 respectively.

$regex = 'cell\_[a-z]{3}\d{2}';

Or, at least 3 letters and at least 2 digits:

$regex = 'cell\_[a-z]{3,}\d{2,}';

Or, at least 3 letters and 2 digits, but no more than 7 letters and 6 digits:

$regex = 'cell\_[a-z]{3,7}\d{2,6}';
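
To compare the variants side by side, here is a rough sketch. The names in $names are invented for illustration; only cell_mar02 follows the format from the original post. One caveat: because none of these patterns is anchored, an upper bound such as {2,6} only limits how many digits get replaced, and any extra digits are left behind. The last entry therefore adds \b word boundaries, which is my own addition and not part of the patterns above.

```php
<?php
// Hypothetical test names; only the first matches the cell_cccDD format.
$names = ['cell_mar02', 'cell_a1', 'cell_march2010', 'cell_mar0234567'];

$patterns = [
    'one or more'    => '/cell_[a-z]+\d+/i',
    'exactly 3/2'    => '/cell_[a-z]{3}\d{2}/i',
    'at least 3/2'   => '/cell_[a-z]{3,}\d{2,}/i',
    '3-7 / 2-6'      => '/cell_[a-z]{3,7}\d{2,6}/i',
    'exact, bounded' => '/\bcell_[a-z]{3}\d{2}\b/i', // assumption: \b added
];

$results = [];
foreach ($patterns as $label => $pattern) {
    foreach ($names as $name) {
        // preg_replace returns the name unchanged when nothing matches.
        $results[$label][$name] = preg_replace($pattern, '0.00', $name);
    }
}

print_r($results);
```

With these inputs, the "exactly 3/2" pattern turns cell_mar0234567 into 0.0034567 and the "3-7 / 2-6" pattern turns it into 0.007, while the bounded version leaves it untouched.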