I want to remove the <a href></a> tags (keeping the link text) from any link whose URL contains the filter_string.
The code below should be pretty close to working, but it doesn't do anything.
$filter_string = "\ba=*";
$regex = "/<a\s[^>]*href\s*=\s*([\"\']?)(".$filter_string."[^\" >]*?)\\1[^>]*>(.*)<\/a>/siU";
$html = '<a href ="http://www.example.com/s/?&n=701">link1</a>
<a href="http://www.example.com/s/?&n=702">link2</a>
<a href= "http://www.example.com/s/?a=u&n=703">link3</a>
<a href="http://www.example.com/s/sd?704">link4</a>
<a href ="http://www.example.com/s/ef?705">link5</a>
<a href = "http://www.example.com/s/?gt706">link6</a>
';
$replacement_phrase = "\\3";
$html = preg_replace($regex, $replacement_phrase, $html);
echo $html;
I am very new to regex, so any help is appreciated.
The regex above was constructed from:
<snip>
For some reason the second question mark in the first parenthesis doesn't show up when I post here, so the regex above should have two consecutive question marks there (that is, ([\"\']??) rather than ([\"\']?)) for it to work.
[edited by: dreamcatcher at 8:48 am (utc) on Nov. 17, 2008]
[edit reason] No urls please! [/edit]
Are you escaping that question mark? I'm assuming you are stating that the question mark may represent a query string marker? If so, you will want to escape it because in a regular expression it has a special meaning, unless it is inside your character class.
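To illustrate the point about escaping, here is a minimal sketch. The sample URLs and variable names are made up for illustration, and "~" is used as the delimiter only so the slashes need no escaping.

```php
<?php
// Hypothetical sample URLs, modelled on the ones in the original post.
$with_query = 'http://www.example.com/s/?a=u&n=703';
$no_query   = 'http://www.example.com/sa=1'; // contains no "?" at all

// Unescaped: "/?" means "an optional slash", so no literal "?" is required.
$r_unescaped = preg_match('~s/?a=~', $no_query);

// Escaped: "\?" only matches a literal question mark.
$r_escaped_miss = preg_match('~s/\?a=~', $no_query);
$r_escaped_hit  = preg_match('~s/\?a=~', $with_query);

// Inside a character class "?" is already literal, so no backslash is needed.
$r_class = preg_match('~s/[?]a=~', $with_query);

// preg_quote() escapes the metacharacters of a literal filter string for you.
$filter   = preg_quote('?a=', '~');
$r_quoted = preg_match('~' . $filter . '~', $with_query);

var_dump($r_unescaped, $r_escaped_miss, $r_escaped_hit, $r_class, $r_quoted);
// int(1) int(0) int(1) int(1) int(1)
```

The first result is the trap: the unescaped pattern matches a URL that has no question mark in it.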
The article at [the-art-of-web.com...] (in the section "3. Allow for Missing Quotes") mentions that:
"Because we used the U modifier, all patterns in the regexp default to 'ungreedy'. Adding an extra ? after a ? or * reverses that behaviour back to 'greedy' but just for the preceding pattern. Without this, for reasons that are difficult to explain, the expression fails. Basically anything following href= is lumped into the [^>]* expression."
I'm not exactly sure what this means, but it seems to be intended to reverse the U modifier for that grouping.
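A small sketch of what the quoted passage describes, using a made-up test string: the U modifier swaps the meaning of a quantifier with and without a trailing "?".

```php
<?php
$text = 'aaa'; // hypothetical test string

// Default mode: "*" is greedy, "*?" is ungreedy (lazy).
preg_match('/a*/',  $text, $greedy_default);
preg_match('/a*?/', $text, $lazy_default);

// With the U modifier the roles are swapped.
preg_match('/a*/U',  $text, $star_u);
preg_match('/a*?/U', $text, $star_q_u);

// The same applies to the optional quantifier "?" versus "??".
preg_match('/a?/U',  $text, $opt_u);
preg_match('/a??/U', $text, $opt_q_u);

var_dump($greedy_default[0]); // string(3) "aaa"
var_dump($lazy_default[0]);   // string(0) ""
var_dump($star_u[0]);         // string(0) ""
var_dump($star_q_u[0]);       // string(3) "aaa"
var_dump($opt_u[0]);          // string(0) ""
var_dump($opt_q_u[0]);        // string(1) "a"
```

So under /U, a group like ([\"\']?) prefers to match nothing, while ([\"\']??) prefers to consume the quote.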
I've been testing this regex all day. At first I thought the second question mark might not make a difference, but I later put it back into the code because the regex seemed to give different results with it, although I'm not absolutely sure about this.
[edited by: Scooter at 3:38 pm (utc) on Nov. 18, 2008]
$filter_string = "a=";
$regex = "/<a\s[^>]*href\s*=\s*([\"\']?)(\s*.*".$filter_string."[^\" >]*?)\s*\\1[^>]*>(.*)<\/a>/siU";
$html = '<a href ="http://www.example.com/s/ete.php?ei=t&39&n=701">link1</a>
<a href="http://www.example.com/s/ete.php?ei=t&n=702">link2</a>
<a href= " dfa=u&n=703 ">link3</a>
<a href= "http://www.example.com/s/?&u&n=703 ">link32</a>
<a href="http://www.example.com/s/sd?704">link4</a>
<a href ="http://www.example.com/s/a=ef?705">link5</a>
<a href = "http://www.example.com/s/ete.php?ei=t&n=706">link6</a>
';
$replacement_phrase = "\\3";
$html = preg_replace($regex, $replacement_phrase, $html);
echo $html;
Any feedback for this regex newbie is appreciated. If the regex I'm pursuing is completely wrong, please point that out, as I'm finding that with:
$regex = "/<a\s[^>]*href\s*=\s*([\"\']?)([^\" >]*?)\\1[^>]*>(.*)<\/a>/siU";
there's nothing in \\2 for some links; it seems to be behaving irregularly.
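One possible explanation for the empty \\2, offered as a sketch rather than a definitive diagnosis: under the U modifier the quote group ([\"\']?) and the \s* in front of it are both ungreedy, so they can match nothing. Group 2 then starts at a space or quote it is not allowed to consume and stays empty, while the lazy [^>]* swallows the rest of the tag. The code below first reproduces that with the pattern as posted, then shows one alternative that drops /U and checks the captured URL in a callback. The function name strip_links_containing and the shortened sample HTML are hypothetical, and PHP 7 or later is assumed.

```php
<?php
// 1) Reproduce the empty group 2 with the pattern exactly as posted.
$link   = '<a href= "http://www.example.com/s/?a=u&n=703">link3</a>';
$posted = "/<a\s[^>]*href\s*=\s*([\"\']?)([^\" >]*?)\\1[^>]*>(.*)<\/a>/siU";
preg_match($posted, $link, $m);
var_dump($m[2]); // string(0) "" : the URL ended up in [^>]* instead

// 2) Hypothetical helper: no U modifier, explicit greedy/lazy quantifiers.
//    Group 1 = optional quote, group 2 = URL, group 3 = link text.
function strip_links_containing(string $html, string $needle): string
{
    $pattern = '~<a\s[^>]*?href\s*=\s*(["\']?)\s*([^"\'\s>]*)\s*\1[^>]*>(.*?)</a>~si';

    return preg_replace_callback($pattern, function (array $m) use ($needle) {
        // Keep only the link text when the URL contains the filter string;
        // otherwise return the whole anchor unchanged.
        return strpos($m[2], $needle) !== false ? $m[3] : $m[0];
    }, $html);
}

// Sample links taken from the second attempt in this thread.
$html = implode("\n", [
    '<a href ="http://www.example.com/s/ete.php?ei=t&n=701">link1</a>',
    '<a href= " dfa=u&n=703 ">link3</a>',
    '<a href="http://www.example.com/s/sd?704">link4</a>',
    '<a href ="http://www.example.com/s/a=ef?705">link5</a>',
]);

$out = strip_links_containing($html, 'a=');
echo $out, "\n";
// <a href ="http://www.example.com/s/ete.php?ei=t&n=701">link1</a>
// link3
// <a href="http://www.example.com/s/sd?704">link4</a>
// link5
```

Testing the URL with strpos() in the callback keeps the filter string out of the regex entirely, so it needs no escaping. Like any regex over HTML, this is only reliable for simple, well-formed anchors such as the ones in the samples.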