Forum Moderators: coopster

Finding my Domain Name within a Preg Replace Function

Don't know where to start.....

         

trillianjedi

11:36 am on Mar 25, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't even know how to approach this one, so I'm still at square one.

I have a phpbb2 forum within which I allow people to post links, but I sanitise them via both a redirect and a no-follow tag.

The basic preg_replace (this is part of the "make_clickable" function) is:-

$ret = preg_replace("#(^|[\n ])([\w]+?://[^ \"\n\r\t<]*)#is", "\\1<a href=\"http://www.example.com/redirect.php?url=\\2\" target=\"_blank\" rel=\"nofollow\">\\2</a>", $ret);

I don't want to sanitise the link or go via the redirect if it's a link to my own domain, www.example.com.

So something like this:-


if (NOT mydomain) {
$ret = preg_replace("#(^|[\n ])([\w]+?://[^ \"\n\r\t<]*)#is", "\\1<a href=\"http://www.example.com/redirect.php?url=\\2\" target=\"_blank\" rel=\"nofollow\">\\2</a>", $ret);
} else {
$ret = preg_replace("#(^|[\n ])([\w]+?://[^ \"\n\r\t<]*)#is", "\\1<a href=\"\\2\" target=\"_blank\">\\2</a>", $ret);
}

Where I'm stuck is how to determine if it's mydomain, given that the string this function is parsing is the entire post. If, for example, I did a REG_EX of this string for "www.example.com", then two links, one with my domain in and one without would both skip the sanitisation and redirect script.

I need to look at every single domain that this preg_replace actually does something on, to decide, on a case by case basis whether it's my domain or not.

I hope that makes sense. Something tells me I need to re-write the entire function, but I'm just not sure of the best way to approach it?

Thanks!

TJ

jatar_k

12:31 pm on Mar 25, 2007 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



I don't know if you can do a NOT in the very same regex

my gut tells me you will probably need to chunk up the post based on links then analyze the links, do what needs to be done to each, then put the whole thing back together
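
One way to act on each link individually without splitting the post by hand is preg_replace_callback, which calls a function once per match and uses its return value as the replacement. The following is only a sketch under some assumptions: it needs PHP 7 or later, and MY_HOST, link_callback() and make_clickable_sketch() are made-up names for illustration, not part of phpBB. It reuses the pattern and link markup from the original post and, like the original, does not URL-encode the redirect parameter.

```php
<?php
// Sketch only: MY_HOST, link_callback() and make_clickable_sketch() are hypothetical names.
const MY_HOST = 'www.example.com';

function link_callback(array $m): string
{
    $prefix = $m[1]; // leading space/newline, or empty at the start of the text
    $url    = $m[2]; // the matched URL
    // parse_url() can return null or false for the host; the cast turns both into ''.
    $host   = strtolower((string) parse_url($url, PHP_URL_HOST));

    if ($host === MY_HOST) {
        // Own domain: plain link, no redirect, no nofollow.
        return $prefix . '<a href="' . $url . '" target="_blank">' . $url . '</a>';
    }

    // Any other host: go through the redirect script and add nofollow.
    return $prefix . '<a href="http://www.example.com/redirect.php?url=' . $url
        . '" target="_blank" rel="nofollow">' . $url . '</a>';
}

function make_clickable_sketch(string $ret): string
{
    // Same pattern as in the original post; the callback decides per link.
    return preg_replace_callback('#(^|[\n ])(\w+?://[^ "\n\r\t<]*)#is', 'link_callback', $ret);
}
```

Because the decision is made inside the callback, a post containing both an internal and an external link gets each one treated separately.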

panos

12:37 pm on Mar 25, 2007 (gmt 0)

10+ Year Member



This is what I use to make URLs clickable. For external URLs, rel="nofollow" is applied. I hope you will find it useful.

function make_clickable($text, $maxurl_len = 70)
{
    // Collect every http(s)/ftp(s) URL found in the text.
    if (preg_match_all('/((ht|f)tps?:\/\/([\w\.]+\.)?[\w-]+(\.[a-zA-Z]{2,4})?[^\s\r\n\(\)"\'<>\,\!]+)/si', $text, $urls))
    {
        // Number of characters kept from the start and the end of a long URL.
        $offset1 = (int) ceil(0.65 * $maxurl_len) - 2;
        $offset2 = (int) ceil(0.30 * $maxurl_len) - 1;

        foreach (array_unique($urls[1]) as $url)
        {
            // rel=nofollow for external links
            $purl = @parse_url($url);
            // parse_url() may fail or omit the host, so guard the lookup.
            $phost = isset($purl['host']) ? strtolower($purl['host']) : '';
            if ($phost == 'www.example.com')
            {
                $rel = '';
            }
            else
            {
                $rel = 'rel="nofollow"';
            }

            // Shorten the visible link text if the URL is too long.
            if ($maxurl_len AND strlen($url) > $maxurl_len)
            {
                $urltext = substr($url, 0, $offset1) . '...' . substr($url, -$offset2);
            }
            else
            {
                $urltext = $url;
            }

            $text = str_replace($url, '<a href="'. $url .'" '.$rel.' title="'. $url .'">'. $urltext .'</a>', $text);
        }
    }

    return $text;
}

trillianjedi

12:39 pm on Mar 25, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I hope you will find it useful

Hell yes!

Panos, you're an absolute star - thank you.

TJ

jatar_k

12:52 pm on Mar 25, 2007 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



nice panos

panos

12:59 pm on Mar 25, 2007 (gmt 0)

10+ Year Member



You may want to change the regex to meet your needs.

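This function makes clickable only the URLs that start with an explicit scheme (http, https, ftp or ftps, going by the regex) and recognizes your domain only in its 'www' form, i.e. when the host is exactly www.example.com.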

It will not be hard to change it to include other cases (lazy urls etc).
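
As a rough illustration of those other cases, the host check could be moved into a small helper that accepts the domain with or without 'www' and copes with lazy URLs that have no scheme. This is a sketch, assuming PHP 7 or later; is_own_host() and its $domain parameter are hypothetical names, not part of the function above.

```php
<?php
// Hypothetical helper: is_own_host() does not exist in the function above.
function is_own_host(string $url, string $domain = 'example.com'): bool
{
    // "Lazy" URLs such as www.example.com/page have no scheme, so add one
    // before handing the string to parse_url().
    if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $url)) {
        $url = 'http://' . $url;
    }

    // parse_url() can return null or false for the host; the cast turns both into ''.
    $host = strtolower((string) parse_url($url, PHP_URL_HOST));

    // Accept the bare domain and its www. form; other subdomains count as external here.
    return $host === $domain || $host === 'www.' . $domain;
}
```

Comparing the whole host, rather than searching for the domain as a substring, keeps a host like example.com.evil.test from being treated as internal.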

trillianjedi

1:14 pm on Mar 25, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You may want to change the regex to meet your needs.

Yes I'm going to have to do that. But my biggest problem was not knowing where to start... and you've solved that++. Adam was right, it's a case of "chunking up the code", but your code snippet is really useful too. Tweaking it to my requirements will be very easy. I can follow it 100%.

Thanks again.

TJ