var text = '<font face="Comic Sans MS, cursive"><span class="rel-huge"><span class="rel-small"><span class="rel-large">Yeah</span></span></span></font><br>';
var font_match = /<span class=("|')rel-[^\1]+\1>\s*(<span class=("|')rel-[^\3]+\3>)((?!<\/span>)*)<\/span><\/span>/gi;
if (font_match.test(text))
    alert('1');
[edited by: phranque at 7:21 am (utc) on Jul 24, 2018]
[edit reason] Disable graphic smile faces for this post [/edit]
"it's really a general regexp question and not Javascript specific..."
Maybe yes, maybe no. Javascript is pretty limited in its RegEx functionality: above all, it can't do lookbehinds.
<span class=("|')rel-[^\1]+\1>\s*(<span class=("|')rel-[^\3]+\3>)((?!<\/span>)*)<\/span><\/span>
In my text editor it raises an “invalid RegEx” warning at three different points: the two
[^\1] [^\3]
and also the mysterious * which doesn't seem to apply to anything but the
(?!<\/span>)*
lookahead. Even if I sidestep the quotes issue by making two options
<span class="rel-[^"]+">\s*(<span class="rel-[^"]">)((?!<\/span>)*)<\/span><\/span>
<span class='rel-[^']+'>\s*(<span class='rel-[^']'>)((?!<\/span>)*)<\/span><\/span>
there's still the * to account for. What is it supposed to do? As written, the pattern can only match something like
<span class="rel-blahblah"> (<span class="rel-blahblah">)</span></span>
... which doesn't make sense, because the </span> occurs at exactly the point where the RegEx says that "</span>" must not occur. There needs to be something immediately before the lookahead:
<span class="rel-[^"]+">\s*(<span class="rel-[^"]">)(.*(?!<\/span>))<\/span><\/span>
Incidentally, why is it "rel-[^"]" ? Do you need to exclude classes that start in "rel" with no following hyphen, as well as classes that start in anything other than "rel"?
[edited by: not2easy at 1:52 pm (utc) on Jul 24, 2018]
[edit reason] Disable graphic smile faces for this post [/edit]
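A minimal sketch of the two points raised in the post above, not anyone's final regex. In JavaScript (without the u flag) a \1 inside a character class is not a backreference, so "anything but the opening quote" is written here as a lookahead in front of a dot, (?:(?!\1).)+. The same shape, (?:(?!<\/span>).)*, gives the lookahead something to consume. The names nestedRel and sample, and the sample markup, are assumptions for illustration.

```javascript
// Inside [...] the \1 is read as a legacy octal escape (character code 1),
// so [^\1]+ happily runs past the closing quote.
const loose = /(["'])[^\1]+\1/.exec('"ab"cd"')[0];
// A lookahead before the dot does mean "not the quote captured in group 1".
const strict = /(["'])(?:(?!\1).)+\1/.exec('"ab"cd"')[0];
console.log(loose);  // "ab"cd"
console.log(strict); // "ab"

// Hypothetical pattern for two nested rel- spans:
//   (["'])               groups 1 and 2: the opening quote of each class attribute
//   (?:(?!\1).)+         one or more characters that are not that quote
//   ((?:(?!<\/span>).)*) group 3: text that does not contain "</span>"
const nestedRel = /<span class=(["'])rel-(?:(?!\1).)+\1>\s*<span class=(["'])rel-(?:(?!\2).)+\2>((?:(?!<\/span>).)*)<\/span><\/span>/i;

const sample = `<span class='rel-huge'> <span class="rel-small">Yeah</span></span>`;
const found = sample.match(nestedRel);
console.log(found[3]); // Yeah
```

Note that the dot does not match line breaks, so this sketch assumes the spans sit on one line.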
"Why would those be errors?"
I'm simply not sure you’re allowed to use a capture in this context. Evidently in SubEthaEdit you’re not; I haven't tried it in other RegEx engines.
"So in this case, I want to match anything that's not </span>, until you get to the first matching </span>. Am I correct in understanding that (.*(?!<\/span>)) is the magic trick that does that?"
On sober consideration, probably not, because it's still saying “match stuff that is not immediately followed by </span>, and then grab a series of </span>”.
I can't remember if Javascript recognizes the .*? or .+? structure, meaning “stop as soon as you can” (where the RegEx default is to go on for as long as you can). The difference would be this. If you have
<span class="rel-blahblah"> <span class="rel-blahblah"><span class="rel-blahblah">blahblah</span></span></span>
then this pattern
<span class="rel-[^"]+">.*(<\/span>)+
would capture
<span class="rel-blahblah"> <span class="rel-blahblah"><span class="rel-blahblah">blahblah</span></span></span>
while the pattern with ?
<span class="rel-[^"]+">.*?(<\/span>)+
(dammit, Forums, I SAID “Disable graphic smileys!”) would capture only
<span class="rel-blahblah"> <span class="rel-blahblah"><span class="rel-blahblah">blahblah</span></span></span>
(In this sample the three closing tags are consecutive, so the overall match comes out the same either way. What changes is the .* part: it runs to just before the last </span> in the first pattern and stops at the first </span> in the second.)
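The greedy versus lazy difference can be checked directly: JavaScript does accept .*? and .+?. The sketch below reuses the two patterns from the post, with one assumption added: an extra pair of parentheses around the dot part so that what it swallowed shows up as group 1. The variable names and the rel-a/rel-b/rel-c class names are made up.

```javascript
const html = '<span class="rel-a"> <span class="rel-b"><span class="rel-c">blahblah</span></span></span>';

// Group 1 is added here only to expose how much the dot part consumed.
const greedy = /<span class="rel-[^"]+">(.*)(<\/span>)+/;
const lazy = /<span class="rel-[^"]+">(.*?)(<\/span>)+/;

const g = html.match(greedy);
const l = html.match(lazy);

console.log(g[1]); // everything up to just before the LAST </span>
console.log(l[1]); // everything up to just before the FIRST </span>
console.log(g[0] === l[0]); // true: (<\/span>)+ then takes all the closing tags it can
```

With closing tags that are not adjacent (some text between them, say), the two overall matches would differ as well.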
var a = "This is aaaaaaaaaaaaa <a href='http:\/\/example.com/bbbbbbbbbbbb'>teeeeeeeest<\/a>";
// I want to match aaaaaa and eeeeee, but not bbbbbbb
var match = /((?!<.*)((.)\3{2,})(?!>))/g;
while (match.test(a))
    a = a.replace(match, '$3$3');
/((?:^|>)[^<>]*?)([^<>])\2{2,}/g
I ended up with \2 rather than \3. Are you allowed to use the open-ended {2,} construction in javascript? If not, say something like {2,20} instead. Then again, you could say (?:^|>) (?:<|>)
"Are you allowed to use the open-ended {2,} construction in javascript?"
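On the {2,} question: an open-ended upper bound is part of standard JavaScript regular expression syntax, so the {2,20} workaround should not be needed. A quick check, with a made-up input string and variable names:

```javascript
// (.) captures one character; \1{2,} requires at least two more copies of it,
// so only runs of three or more identical characters match.
const run = /(.)\1{2,}/g;
const input = 't' + 'e'.repeat(8) + 'st http://example.com';

const runs = input.match(run);
const collapsed = input.replace(run, '$1$1');

console.log(runs);      // [ 'eeeeeeee' ]  (the "//" is only two characters long)
console.log(collapsed); // teest http://example.com
```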
var match = /((?:<|>)[^<>]*?)([^<>])\2{2,}/g;
while (match.test(a))
    a = a.replace(match, '$2$2');
# Result
This is aaaaaaaaaaaaa bb'eest
a = a.replace(match,
    function($match, $1, $2, $3, $4, $5) {
        // Print the first two capture groups so they can be inspected.
        var ret =
            '\n' +
            '1 - ' + $1 + '\n' +
            '2 - ' + $2 + '\n' +
            '\n';
        return ret;
    });
# Result
This is aaaaaaaaaaaaa
1 - <a href="http://example.com/
2 - b
">teeeeeeest</a>
"what does the ^ in a group class represent? Or is it literal?"
Neither: it’s an anchor.
(?:^|>)
is not even remotely the same as
(?:[^>])
which in turn is not the same as
(?:[\^>])
When you wake up tomorrow morning, it will be with a resounding “D’oh!” as you remember that you have known this all along ;)
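The three look-alikes can be told apart with a small table. This is only an illustration: the trailing x and the four sample strings are made up so that each pattern has something to match.

```javascript
const anchored = /(?:^|>)x/;   // x at the very start of the string, or right after ">"
const negated = /(?:[^>])x/;   // x preceded by any one character other than ">"
const literal = /(?:[\^>])x/;  // x preceded by a literal "^" or a ">"

const samples = ['x', '>x', 'ax', '^x'];
const rows = samples.map(function (s) {
  return [s, anchored.test(s), negated.test(s), literal.test(s)];
});

rows.forEach(function (row) {
  console.log(row.join('\t'));
});
// sample  anchored  negated  literal
// x       true      false    false
// >x      true      false    true
// ax      false     true     false
// ^x      false     true     true
```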
var match = /((?:^|>)[^<>]*?)([^<>])\2{2,}/;
while (match.test(a))
    a = a.replace(match, '$1$2$2');
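For reference, the loop above wrapped in a function and run against the test string from earlier in the thread. The regex is the one just posted; the helper name collapseRuns is hypothetical.

```javascript
// Collapse runs of three or more identical characters down to two,
// but only in text outside of tags (at the start of the string or after a ">").
function collapseRuns(input) {
  const match = /((?:^|>)[^<>]*?)([^<>])\2{2,}/;
  let out = input;
  // No g flag: each pass fixes the first remaining run, then the test is repeated.
  while (match.test(out)) {
    out = out.replace(match, '$1$2$2');
  }
  return out;
}

const a = "This is aaaaaaaaaaaaa <a href='http://example.com/bbbbbbbbbbbb'>teeeeeeeest</a>";
console.log(collapseRuns(a));
// This is aa <a href='http://example.com/bbbbbbbbbbbb'>teest</a>
```

The loop ends because every replacement shortens the string, and the run of b characters inside the href is left alone because [^<>]*? cannot cross the opening "<".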