Forum Moderators: phranque
$pattern = <<EOF;
(?:
(
b[i!1y]rd |
c[o0]w
)?
food
|
(
b[a@*]+d+ |
c[a@*]ndy
)?
apple
)
EOF
$str = 'my candyapple';
$str =~ s{(\b|^|[\s.,;:'"]|\$)$pattern(\b|[\s.,;:'"]|$)}
{$1$2corn$+}xgi;

$pattern = <<EOF;
(?:
(?<PLACEHOLDER>
b[i!1y]rd|
c[o0]w
)?
food
|
(?<PLACEHOLDER>
b[a@*]+d+|
c[a@*]ndy
)?
apple
)
EOF
$_ = 'my candyapple';
s{(\b|^|[\s.,;:'"]|\$)$pattern(\b|[\s.,;:'"]|$)}
{$1$+{PLACEHOLDER}corn$+}xgi;

Hopefully I'll be taking a break from regex after this one!

Yah, when I saw the third regex-related post in my Unread list, my first thought was OK, now you’re just trolling :)
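For anyone following along, here is a rough sketch of why the two snippets in the opening post behave differently. It is not the poster's code: the patterns are cut down to a couple of alternatives, and the names swap_numbered, swap_named and the group name prefix are made up for illustration. The idea is that each branch of the alternation gets its own group number, so a single numbered variable is empty whenever the other branch matched, while a group name shared by both branches picks up whichever one took part in the match (named groups need Perl 5.10 or later).

```perl
use strict;
use warnings;

# Cut-down stand-ins for the thread's pattern (assumption: fewer alternatives).
my $numbered = qr/(?: (bird|cow)? food | (bad|candy)? apple )/xi;
my $named    = qr/(?: (?<prefix>bird|cow)? food | (?<prefix>bad|candy)? apple )/xi;

# Numbered version: the first group is only set when the "food" branch matched.
sub swap_numbered {
    my ($text) = @_;
    no warnings 'uninitialized';    # the first group is undef on the "apple" branch
    $text =~ s{\b$numbered\b}{${1}corn}g;
    return $text;
}

# Named version: $+{prefix} is the leftmost defined group called "prefix".
sub swap_named {
    my ($text) = @_;
    $text =~ s{\b$named\b}{ ($+{prefix} // '') . 'corn' }ge;
    return $text;
}

print swap_numbered('my candyapple'), "\n";    # my corn
print swap_named('my candyapple'),    "\n";    # my candycorn
```

Perl's branch-reset group (?|...) is another way to give both branches the same group number, if named groups feel like overkill.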
(\b|^|[\s.,;:'"]|\$)
(\b|[\s.,;:'"]|$)
If I'm understanding it right ... is the issue that some of the pipe-delimited items are capturable strings while the others are anchors? But I think I'm missing something anyway, because in each case \b would subsume everything else in the list. (It also unfortunately counts hyphens as boundaries, so “foo-bar” would be perceived as two words whether you want it to or not.)
([.,:;“”]*\bblahblah)
if you want to include the punctuation in the capture, else
[.,:;“”]*\b(blahblah)
where blahblah is the pattern, assuming it begins with a \w character. Note the position of the \b, which has to be immediately adjacent to the word.
(\w[\p{Alpha}’-]*)((?:</i>)?\p{Punct}*) ?\[\*\* ?(?:error|typo) for ([^\]]+)\]
used in constructing errata lists, where any punctuation adjoining a word needs to be preserved separately.
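A small sketch of the \b points above, in case it helps. The word bar is just a stand-in for blahblah, the straight double quote replaces the curly quotes to keep the example ASCII-only, and the lookaround guard is one possible workaround of my own, not something proposed in the thread.

```perl
use strict;
use warnings;

# "bar" is a made-up stand-in for blahblah; any pattern starting with \w would do.
my $word = qr/bar/;

# \b sees a boundary between "-" and "b", so the hyphenated form still matches.
my $plain_b_hits_hyphen = ('foo-bar' =~ /\b$word\b/) ? 1 : 0;

# One possible guard (an assumption, not from the thread): refuse a neighbouring
# word character or hyphen on either side.
my $guarded           = qr/(?<![\w-])$word(?![\w-])/;
my $guard_hits_hyphen = ('foo-bar' =~ $guarded) ? 1 : 0;
my $guard_hits_plain  = ('foo bar' =~ $guarded) ? 1 : 0;

# Leading punctuation inside the capture versus outside it.
my ($with_punct)    = ';"bar' =~ /([.,:;"]*\b$word)/;
my ($without_punct) = ';"bar' =~ /[.,:;"]*\b($word)/;

print "plain \\b matches foo-bar: $plain_b_hits_hyphen\n";   # 1
print "guard matches foo-bar:    $guard_hits_hyphen\n";      # 0
print "guard matches foo bar:    $guard_hits_plain\n";       # 1
print "capture with punctuation: $with_punct\n";             # ;"bar
print "capture without:          $without_punct\n";          # bar
```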
Why can’t the dollar sign \$ be included with the punctuation marks? If I’m reading it right, Perl actually supports both forms, \p{Punct} and [[:punct:]].
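One thing worth checking before swapping in a punctuation class: as far as I can tell from perlrecharclass, the two forms do not agree on the dollar sign. Unicode files $ under symbols rather than punctuation, so \p{Punct} skips it, while [[:punct:]] covers all the ASCII punctuation and symbol characters. A quick sketch to confirm (the variable names are just for illustration):

```perl
use strict;
use warnings;

my $dollar = '$';

# \p{Punct} is the Unicode Punctuation category; "$" is a currency symbol.
my $in_p_punct = ($dollar =~ /\A\p{Punct}\z/) ? 1 : 0;

# The POSIX class also covers the ASCII symbol characters, "$" included.
my $in_posix_punct = ($dollar =~ /\A[[:punct:]]\z/) ? 1 : 0;

# The hand-written list from the thread with \$ folded in; the backslash
# makes sure the dollar sign is taken literally inside the class.
my $in_explicit_class = ($dollar =~ /\A[\s.,;:'"\$]\z/) ? 1 : 0;

print "\\p{Punct}:      $in_p_punct\n";          # 0
print "[[:punct:]]:    $in_posix_punct\n";      # 1
print "explicit class: $in_explicit_class\n";   # 1
```

So the escaped \$ can sit inside the bracketed list, and [[:punct:]] would pick it up as well, but \p{Punct} on its own would not.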