[edited by: eelixduppy at 5:26 pm (utc) on Feb. 21, 2008]
[edit reason] added link to thread [/edit]
Are you saying that you want:
this contains a keyword here and keyword here
to turn into
this contains a <a href="/keyword">keyword</a> here and <a href="/keyword">keyword</a> here
but you want
this is a google like [google.co.uk...] keyword &btnG=Google+Search&meta= this
not to end up as
this is a google like [google.co.uk...] <a href="/keyword">keyword</a> &btnG=Google+Search&meta= this
?
Or are you after something different?
i want the keyword to be replaced by a link only if it is not inside an A tag or between A tags. it shouldn't touch the HREF and shouldn't touch the anchor text, just the keywords that are out in the rest of the text. also, the keyword could be wrapped in B or I tags ... those are ok. i just don't want to place a link within a link, and i don't want to alter HREF attributes that may contain the keyword.
i used this script for the initial replace:
<?php
/**
* Perform a simple text replace
* This should be used when the string does not contain HTML
* (off by default)
*/
define('STR_HIGHLIGHT_SIMPLE', 1);
/**
* Only match whole words in the string
* (off by default)
*/
define('STR_HIGHLIGHT_WHOLEWD', 2);
/**
* Case sensitive matching
* (off by default)
*/
define('STR_HIGHLIGHT_CASESENS', 4);
/**
* Overwrite links if matched
* This should be used when the replacement string is a link
* (off by default)
*/
define('STR_HIGHLIGHT_STRIPLINKS', 8);
/**
* Highlight a string in text without corrupting HTML tags
*
* @author Aidan Lister <aidan@php.net>
* @version 3.1.1
* @link [aidanlister.com...]
* @param string $text Haystack - The text to search
* @param array|string $needle Needle - The string to highlight
* @param int $options Bitwise set of options
* @param string $highlight Replacement string
* @return string Text with needle highlighted
*/
function str_highlight($text, $needle, $options = null, $highlight = null)
{
    // Default highlighting
    if ($highlight === null) {
        $highlight = '<strong>\1</strong>';
    }
    // Select pattern to use
    if ($options & STR_HIGHLIGHT_SIMPLE) {
        $pattern = '#(%s)#';
        $sl_pattern = '#(%s)#';
    } else {
        $pattern = '#(?!<.*?)(%s)(?![^<>]*?>)#';
        $sl_pattern = '#<a\s(?:.*?)>(%s)</a>#';
    }
    // Case sensitivity
    if (!($options & STR_HIGHLIGHT_CASESENS)) {
        $pattern .= 'i';
        $sl_pattern .= 'i';
    }
    $needle = (array) $needle;
    foreach ($needle as $needle_s) {
        // Escape the needle, including the '#' pattern delimiter
        $needle_s = preg_quote($needle_s, '#');
        // Optional whole word check
        if ($options & STR_HIGHLIGHT_WHOLEWD) {
            $needle_s = '\b' . $needle_s . '\b';
        }
        // Strip links
        if ($options & STR_HIGHLIGHT_STRIPLINKS) {
            $sl_regex = sprintf($sl_pattern, $needle_s);
            $text = preg_replace($sl_regex, '\1', $text);
        }
        $regex = sprintf($pattern, $needle_s);
        $text = preg_replace($regex, $highlight, $text);
    }
    return $text;
}
?>
this script doesn't touch the HREF attribute of the A tag. then i used the 2nd step from the script i found here to remove the links that were made inside anchor text:
$step2 = preg_replace('{(<a[^<]*)<a href="link">keyword</a>([^<]*</a>)}', '$1keyword$2', $step1);
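To make the second step concrete, here is a minimal sketch of that cleanup regex run on its own. The `$step1` string is a made-up example of what the first pass might produce when it links a keyword inside existing anchor text; it is not output from the thread.

```php
<?php
// Hypothetical step-1 output: the keyword was linked inside existing anchor text.
$step1 = 'see <a href="/page">my <a href="link">keyword</a> page</a>';

// Step 2 from the post: unwrap the inner link and keep the outer one.
$step2 = preg_replace(
    '{(<a[^<]*)<a href="link">keyword</a>([^<]*</a>)}',
    '$1keyword$2',
    $step1
);

echo $step2, "\n"; // see <a href="/page">my keyword page</a>
```

Because both groups use `[^<]*`, this only matches when no other tag sits between the outer `<a>` and the inner link, so anchor text wrapped in B or I tags would be left with the nested link in place.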
<?php
$test['ok'] = 'this contains a keyword and another keyword';
$test['not_ok'] = 'this contains a linked <a href="http://google.co.uk/">keyword</a> here';
$keyword = 'keyword';
foreach ($test as $subject) {
    $pattern = "%($keyword)(?!.*</a>)%i";
    $replacement = '<a href="/$1">$1</a>';
    $out = preg_replace($pattern, $replacement, $subject);
    echo "ALL: $out<br />\n";
}
?>
This is not perfect, as it only checks whether a </a> appears somewhere after the keyword on the same line, so a keyword that merely comes before a link is skipped too. However, this should get you what you want most of the time.
Your solution is more definite, just longer...not that there is any problem with that ;)
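The limitation of the lookahead approach is easiest to see with two inputs that differ only in where the link sits relative to the keyword. This is a small sketch using the same pattern and replacement as above; the two sample strings are invented for illustration.

```php
<?php
$keyword = 'keyword';
$pattern = "%($keyword)(?!.*</a>)%i";
$replacement = '<a href="/$1">$1</a>';

// The keyword comes AFTER the only link, so no </a> follows it and it is linked.
$after = preg_replace($pattern, $replacement, 'a <a href="/x">link</a> then keyword');

// The keyword comes BEFORE an unrelated link on the same line, so the
// negative lookahead sees a later </a> and the keyword is left alone.
$before = preg_replace($pattern, $replacement, 'keyword first, then a <a href="/x">link</a>');

echo $after, "\n";  // a <a href="/x">link</a> then <a href="/keyword">keyword</a>
echo $before, "\n"; // keyword first, then a <a href="/x">link</a>
```

The second case is the one to watch for: the keyword is plain text outside any link, yet it is not replaced.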
<a href="http://www.keyword.com">keyword</a> gets turned into
<a href="http://www.<a href="/keyword">keyword</a>">keyword</a>
but this should give me a very good start at working towards something that covers everything. thanks for the help, i appreciate it.
$test['not_ok'] = 'this is a linked <a href="http://google.co.uk/keyword">keyword</a> here';
However, making the lookahead lazy, as in (?!.*?</a>), would not change the result: inside a negative lookahead, .* and .*? both find a match whenever a </a> appears anywhere later on the same line. Either way, every keyword that comes before a link on that line is skipped, so a very large chunk of input may be missed.
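One way to sidestep both problems raised in this thread (links nested inside links, and keywords rewritten inside HREF values) is to split the text on complete anchor elements first and only run the replacement on the pieces in between. The sketch below is an assumption-laden illustration, not code from the thread: the function name `link_keyword_outside_anchors` is hypothetical, it assumes anchors are not nested and are well formed, and it links to `/keyword` as in the examples above.

```php
<?php
// Hypothetical helper (not from the thread): link a keyword only outside <a>...</a>.
function link_keyword_outside_anchors($html, $keyword)
{
    // Split into alternating chunks: text outside links lands on even indexes,
    // complete <a ...>...</a> elements (the captured delimiter) on odd indexes.
    $parts = preg_split('#(<a\b[^>]*>.*?</a>)#is', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    // Whole-word match that is not inside another tag's attributes:
    // the lookahead rejects a keyword followed by ">" before any "<".
    $pattern = '#\b(' . preg_quote($keyword, '#') . ')\b(?![^<>]*>)#i';

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            $parts[$i] = preg_replace($pattern, '<a href="/$1">$1</a>', $part);
        }
    }
    return implode('', $parts);
}

$in  = 'keyword first, <a href="http://www.keyword.com">keyword</a> and <b>keyword</b>';
$out = link_keyword_outside_anchors($in, 'keyword');
echo $out, "\n";
```

Existing links pass through untouched, including the keyword in the HREF, while keywords in plain text or inside B or I tags get linked. Regex-based HTML handling stays fragile on malformed markup, so for arbitrary input a DOM parser would likely be the safer route.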