What I have so far (after many sleepless nights) is this (unescaped):
(?<=[^a-z0-9])(KEYWORD)(?=[^a-z0-9])(?=[^>]*<)(?!.*?</a>)
If I leave out the final (?!.*?</a>), it nests anchor tags. With it, only keywords AFTER the last anchor in the search string get linked.
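To make the behaviour described above easier to reproduce, here is a small sketch that runs the pattern with and without the trailing lookahead. The sample text, the keyword "widget" and the /widget URL are made up for illustration and are not from the original post. The likely cause is that (?!.*?</a>) rejects any keyword that has a closing </a> anywhere later on the same line, whether or not the keyword is actually inside an anchor.

```php
<?php
// Hypothetical sample input: one keyword before, inside and after an anchor.
$text = '<p>widget one <a href="/x">widget link</a> widget two</p>';
$link = '<a href="/widget">$1</a>';

// The pattern from the post, minus the trailing negative lookahead.
$core = '(?<=[^a-z0-9])(widget)(?=[^a-z0-9])(?=[^>]*<)';

// Without the lookahead every occurrence is linked, including the one
// inside the existing anchor, so anchors end up nested.
$without = preg_replace('/' . $core . '/i', $link, $text);

// With the lookahead, any keyword followed somewhere by </a> is skipped,
// so only the occurrence after the last anchor is linked.
$with = preg_replace('/' . $core . '(?!.*?<\/a>)/i', $link, $text);

echo $without, "\n";
echo $with, "\n";
// $with is:
// <p>widget one <a href="/x">widget link</a> <a href="/widget">widget</a> two</p>
```

The first "widget" is left unlinked in the second run even though it sits outside any anchor, which matches the symptom reported.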
Can anybody help?
Another problem is that if there are angle brackets in the anchor tags, this also causes grief...
Good luck!
I think I got round the problem. It's not pretty but it seems to work well enough for what I'm doing - it goes a little something like this:
Put all anchor text into nonsense <##..##> tags:
$text=preg_replace('/(<a([^>]+)>)(.*?)(<\/a>)/is', "$1<##$3##>$4", $text);
Replace keywords not between <..> tags. <##..##> should be skipped too:
$search = "/(>)([^<]*)([^#a-z]+)($keys)([^#a-z]+)/is";
$replace = "\$1\$2\$3***\$4***\$5"; // must be defined before preg_replace() uses it
$text = preg_replace($search, $replace, $text);
Get rid of <##..##>:
$text=str_replace("<##", "", $text);
$text=str_replace("##>", "", $text);
That could be done with preg_replace() too, but I'm guessing str_replace() might use less resources?
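As a small aside on the cleanup step: str_replace() accepts an array of search strings, so both markers can be stripped in one call. The sample string below is invented for illustration. Since the markers are fixed literals, a plain string replace is a reasonable choice here and no regex is needed.

```php
<?php
// Hypothetical sample: anchor text still wrapped in the <##..##> markers.
$text = 'See <a href="/x"><##anchor text##></a> here.';

// One call removes both markers; each search string is replaced with ''.
$text = str_replace(array('<##', '##>'), '', $text);

echo $text, "\n";
// See <a href="/x">anchor text</a> here.
```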
I'm sure it could be improved on but I'm not a programmer :) If anyone has any ideas to improve it I'd be grateful.
I haven't found a specific weakness yet but I'll keep testing as I'm sure there will be flaws. I think it might get screwed up if there are tags in the anchor text.
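For the tags-in-anchor-text worry, one possible alternative (a different technique from the placeholder trick above, not something posted in this thread) is to split the text into anchors, other tags and plain text with preg_split(), and only touch the plain text pieces. This is a rough sketch: link_keywords(), its parameters and the sample data are all hypothetical names chosen for illustration.

```php
<?php
// Hypothetical helper, not from the thread: links keywords only in plain text.
function link_keywords($html, array $keywords, $url)
{
    // Capture whole <a>...</a> elements (inner tags included) or any single
    // other tag as delimiters. With PREG_SPLIT_DELIM_CAPTURE the result
    // alternates: even indexes are plain text, odd indexes are delimiters.
    $parts = preg_split('/(<a\b[^>]*>.*?<\/a>|<[^>]+>)/is', $html, -1,
                        PREG_SPLIT_DELIM_CAPTURE);

    // Escape each keyword so regex metacharacters are treated literally.
    $quoted = array();
    foreach ($keywords as $k) {
        $quoted[] = preg_quote($k, '/');
    }
    $pattern = '/(?<![a-z0-9])(' . implode('|', $quoted) . ')(?![a-z0-9])/i';

    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            continue; // an anchor or a tag: leave untouched
        }
        $parts[$i] = preg_replace_callback($pattern, function ($m) use ($url) {
            return '<a href="' . htmlspecialchars($url) . '">' . $m[1] . '</a>';
        }, $part);
    }
    return implode('', $parts);
}

$html = '<p>widget one <a href="/x"><b>widget</b> link</a> widget two</p>';
$out = link_keywords($html, array('widget'), '/widget');
echo $out, "\n";
```

Here the keyword inside the existing anchor is skipped even though it is wrapped in a <b> tag, while both occurrences outside it are linked. Angle brackets inside attribute values would still confuse the tag pattern, so for messy markup a real HTML parser is probably the safer route.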
[edited by: coopster at 10:03 pm (utc) on May 24, 2006; edit reason: removed url per TOS]