After scouring the internet for tutorials and examples, I finally came up with a regex that is meant to delete the opening tag (i.e. '<a name="_Toc437192468">'). In PHP, however, it leaves the first '<' behind, i.e. it deletes only 'a name="_Toc437192468">',
so <a name="_Toc437192468"><h3>Heading</h3></a> becomes
<<h3>Heading</h3></a>
my regex is
$D = preg_replace('<a name="_Toc[^"\>]*">', '', $D);
Paraphrased:
Find '<a name="_Toc' + (any run of characters that are not '"' or '>') + '">'
What is wrong with this regex? Why is it ignoring the first '<' within the tag?
Also, if somebody could give me some pointers on how to find the first </a> after each of these and delete it, I would really appreciate it (I admit I'm stumped, and for the most part regexes seem to be going over my head).
Thanks
- Ryan
The first character is more than likely being recognized as a delimiter [php.net]. Try adding a delimiter to your pattern and see where you get...
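To make the delimiter point concrete: PHP's preg functions expect the pattern to be wrapped in delimiters, and a leading '<' is accepted as a bracket-style delimiter paired with '>', so it is not treated as part of the pattern. Below is a minimal sketch with explicit '/' delimiters, using the sample input from the question. The simplified class [^"]* is my own substitution for the original [^"\>]*, not something from the thread.

```php
<?php
// Sample input taken from the question above.
$D = '<a name="_Toc437192468"><h3>Heading</h3></a>';

// Explicit "/" delimiters, so the leading "<" is part of the pattern itself.
// [^"]* means "any run of characters that are not a double quote".
$result = preg_replace('/<a name="_Toc[^"]*">/', '', $D);

echo $result, "\n"; // <h3>Heading</h3></a>
```

Any non-alphanumeric, non-backslash, non-whitespace character can serve as the delimiter; '/', '#' and '~' are common choices.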
Bonusbana, that was great; the first regular expression worked very well. However, the second one does not seem to work, so I will keep tinkering with it. I think my problem was that I was basing my regular expressions on ones that I had written and tested within Dreamweaver; evidently Dreamweaver's regexes are not as powerful as PHP's.
Anyway, it is important for me to learn regexes so that I do not have to bother you guys about it again :). If someone does not mind, could you please have a look at my paraphrasing below and tell me if my understanding is correct?
First Regex:
Open Delimiter /
<a name="
_Toc
Any character other than "
One or more times
">
End Delimiter /
Second Regex:
Open Delimiter /
<a name="
_Toc
Any character other than "
One or more times
">
Sub Pattern Start
Any character other than <
One or more times
Sub Pattern End
</a>
End Delimiter /
Also, within the second regex (the one that does not work, for whatever reason), why is the replacement string \1?
Thanks for all the help
- Ryan
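The breakdown of the second regex above can be written almost line for line as a commented pattern by using the x (extended) modifier. This is only a sketch: the '~' delimiter and the two sample strings are my own choices for illustration. It also hints at why the pattern may fail on real input, since [^<]+ cannot cross a nested tag.

```php
<?php
// Second regex from the breakdown, written with the x (extended) modifier so
// each piece can carry a comment. "~" is the delimiter so that "</a>" needs
// no escaping; in x mode a literal space must be written as "\ ".
$pattern = '~
    <a\ name="      # literal text: <a name="
    _Toc            # literal text: _Toc
    [^"]+           # any character other than a double quote, one or more times
    ">              # literal text: ">
    (               # sub pattern start (capture group 1)
      [^<]+         # any character other than <, one or more times
    )               # sub pattern end
    </a>            # literal text: </a>
~x';

// Hypothetical inputs: one with plain text inside the anchor, one with a nested tag.
$plain  = '<a name="_Toc437192468">Heading</a>';
$nested = '<a name="_Toc437192531"><span style="font-weight: bold;">Article 1</span></a>';

$plainResult  = preg_replace($pattern, '$1', $plain);
$nestedResult = preg_replace($pattern, '$1', $nested);

echo $plainResult, "\n";   // Heading
echo $nestedResult, "\n";  // unchanged, because [^<]+ stops at the nested <span>
```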
$D = preg_replace("/<a name=\"_Toc[^\"]+\">([^<]+)<\/a>/", "\\1", $D);
OR
$D = preg_replace("/<a name=\"_Toc[^\"]+\">([^<]+)<\/a>/", "$1", $D);
The regex engine stores the text matched by each sub pattern enclosed in ( and ) and numbers these captures in the order in which they appear in the pattern.
The first capture is \\1 or $1,
the second capture is \\2 or $2, and so on.
I'm not sure what the difference between the two is (or even whether a bare \1 is correct), but everything I've read uses \\1 or $1 as the format for replacement references.
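On the \\1 versus $1 question, a small sketch may help. As far as I know, preg_replace accepts both forms in the replacement and the PHP manual describes $n as the preferred one. The doubled backslash is only needed because of PHP string escaping: inside double quotes "\1" is an octal escape for the character with code 1, not a backreference. The sample input here is a simplified, hypothetical anchor.

```php
<?php
$D = '<a name="_Toc437192468">Heading</a>';
$pattern = '/<a name="_Toc[^"]+">([^<]+)<\/a>/';

// All three replacements refer to capture group 1.
$dollar      = preg_replace($pattern, '$1', $D);    // preferred form
$doubleQuote = preg_replace($pattern, "\\1", $D);   // "\\1" is the two characters \1
$singleQuote = preg_replace($pattern, '\1', $D);    // single quotes keep the backslash

echo $dollar, "\n";      // Heading
echo $doubleQuote, "\n"; // Heading
echo $singleQuote, "\n"; // Heading

// In double quotes a bare \1 is an octal escape, so it is not a backreference.
var_dump("\1" === chr(1)); // bool(true)
```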
Here is an example of what I am looking at,
<a name="_Toc437192531"><span style="font-weight: bold;">Article 1</span></a>
Obviously the span will be removed by the existing regexes.
thanks
- Ryan
Here's an untested attempt:
$D = preg_replace("/<a name=\"_Toc[^\"]+\">([.]*)<\/a>/", "$1", $D);
I think, in theory, it should strip all the toc link tags out regardless of any nested tags. I've had trouble with matches using the period character, so here's an alternate just in case:
$D = preg_replace("/<a name=\"_Toc[^\"]+\">([\w?\W?\s?\S?]*)<\/a>/", "$1", $D);
(edit: oops, just realised you said that the span tags will be removed... I'm not sure why it won't work then?)
$D = preg_replace("/<a name=\"_Toc[0-9A-Za-z-]+\">([\w\s\S\W]+)<\/a>/i", "$1", $D);
I added a case-insensitive switch and allowed only alphanumerics and hyphens after the _Toc text (I'm not sure whether the other expression may have been causing it to fail).
Wish I had a test server available before posting them, apologies.
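A note on the trouble with the period character, as a hedged sketch rather than a tested fix for the real document: inside square brackets a dot is a literal '.', so ([.]*) can only match a run of dots. Outside brackets the dot matches any character except a newline unless the s modifier is added. The last pattern below is my own variation on the attempts above, using a lazy (.*?) so the match stops at the first </a>.

```php
<?php
// Inside square brackets a dot is a literal ".", not "any character".
$dotsOnly = preg_match('/^[.]+$/', '...');   // 1: the string is all dots
$letters  = preg_match('/^[.]+$/', 'abc');   // 0: letters are not dots

// Outside brackets the dot matches any character except a newline,
// unless the s (PCRE_DOTALL) modifier is added.
$withoutS = preg_match('/^a.b$/', "a\nb");   // 0
$withS    = preg_match('/^a.b$/s', "a\nb");  // 1

// Sample input from the thread, with a nested span inside the anchor.
$D = '<a name="_Toc437192531"><span style="font-weight: bold;">Article 1</span></a>';

// (.*?) is lazy: it takes as little as possible before the first </a>.
$result = preg_replace('/<a name="_Toc[^"]+">(.*?)<\/a>/s', '$1', $D);

echo $result, "\n"; // <span style="font-weight: bold;">Article 1</span>
```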
No need for apologies; you are already helping more than I could possibly have hoped for. If you lived in NZ I would owe you a beer :)
Anyway, for some reason it also does not work. Strangely, it works on only some (but not all) of the _Tocs.
An example of one that it did not work on is this:
<a name="_Toc437192532"><strong>Article 2</strong></a>
An example of one that it did work on is:
<a name="_Toc437192531"><strong>Article 1</strong></a>
There is no discernible pattern to where it works and where it does not.
I might just have to give up and do it with code, I think. Regardless of whether or not this works, I have been given a good starting point for learning and eventually mastering regexes.
Thanks
- Ryan
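One possible explanation for the hit-and-miss behaviour, assuming several _Toc anchors sit in the same string (the full input is not shown, so this is a guess): the quantifier in ([\w\s\S\W]+) is greedy and runs to the last </a> in the text, so the whole document becomes a single match and only the first opening tag and the last closing tag are stripped. The sketch below reproduces that with the two sample anchors from this thread and compares it with a lazy (.*?) variant, which is my own suggestion.

```php
<?php
// Two sample anchors from the thread, joined into one string.
$D = '<a name="_Toc437192531"><strong>Article 1</strong></a>' . "\n"
   . '<a name="_Toc437192532"><strong>Article 2</strong></a>';

// Greedy: the group swallows everything up to the LAST </a>, so there is
// only one match and the second anchor's opening tag survives.
$greedy = preg_replace('/<a name="_Toc[0-9A-Za-z-]+">([\w\s\S\W]+)<\/a>/i', '$1', $D);

// Lazy: (.*?) stops at the FIRST </a> after each opening tag, so every
// anchor is matched separately. The s modifier lets "." match newlines.
$lazy = preg_replace('/<a name="_Toc[^"]+">(.*?)<\/a>/is', '$1', $D);

echo $greedy, "\n";
// <strong>Article 1</strong></a>
// <a name="_Toc437192532"><strong>Article 2</strong>

echo $lazy, "\n";
// <strong>Article 1</strong>
// <strong>Article 2</strong>
```

The lazy version assumes no ordinary link is nested inside a named anchor; if one were, the match would stop at the inner </a>.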