Forum Moderators: coopster

how to hyperlink urls and emails?

need help hyperlinking in PHP with regular expressions

         

mikelai

1:56 am on Oct 27, 2009 (gmt 0)

10+ Year Member



i'm building a forum where people can post links and email addresses, and i want to hyperlink both of those. i've written a hyperlink function that looks for a regular expression matching the url and then replaces it with the hyperlink code.

for example, the function will go through a post and hyperlink the following:

[google.com...]
google.com
something.com

but, when it comes to an email address like:

jonathan@something.com

it gets confused and hyperlinks the "something.com" part. if i go through beforehand and match and modify the emails based on a separate regular expression, the original function still matches the url regular expression and the domain gets hyperlinked again, resulting in jumbled html code.

i think i need a way to prevent the url regular expression from matching an already hyperlinked email address. or a url regular expression that makes sure there is no @ symbol before the domain.
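
The second idea (a url pattern that refuses to match right after an @) could be sketched with a negative lookbehind. This is only an illustration under assumptions: the function name link_bare_domains is made up, paths after the domain are left out for brevity, and the TLD list is the short one used later in this thread. The lookbehind also has to reject a preceding letter, digit, dot or hyphen, otherwise the engine would simply start the match one character later, at "omething.com".

```php
<?php
// Sketch only: link bare domains, but skip any domain that belongs to an email address.
// link_bare_domains() is a hypothetical name, not a function from the thread.
function link_bare_domains($text) {
    $pattern = '~(?<![@A-Za-z0-9.-])'              // not preceded by @ or by another host character
             . '[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*' // one or more host labels
             . '\.(?:com|edu|org|gov|net|biz)'     // short list of common TLDs
             . '(?![@A-Za-z0-9-])~i';              // not followed by @ or by more host characters
    return preg_replace($pattern, '<a href="http://$0" target="_blank">$0</a>', $text);
}
```

With this pattern the email address is left alone entirely, so a separate pass can turn it into a mailto: link without the two passes colliding.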

any ideas on what the proper solution is and how to do it?

thanks!

Zipper

2:47 am on Oct 27, 2009 (gmt 0)

10+ Year Member



I'm not exactly sure what you mean by:
"if i go through beforehand and match and modify the emails based on a separate regular expression, the original function still matches the url regular expression and gets hyperlinked again"

If your regex finds an e-mail, can't you just flag it and stop checking for URLs?

It would be easier if you could post your function(s).

mikelai

5:14 am on Oct 27, 2009 (gmt 0)

10+ Year Member



okay here's the code, although it might be a little hard to understand:

function hyperlink($text) {

// match http:// or https:// + domain + any url permitted character (assumes encoded properly)
$full_url_reg_ex = "(http(s)?:\/\/)[A-Za-z0-9-.]+([\/A-Za-z0-9-_.~\?&=%#@;:,\+])*";

// match domain ending in most common TLDs + optional (/ + any url permitted character)
$common_tld_reg_ex = "[A-Za-z0-9-.]+(\.com|\.edu|\.org|\.gov|\.net|\.biz)([\/]([\/A-Za-z0-9-_.~\?&=%#@;:,\+])*)?";

// match either of the above cases (http://www.google.com OR google.com) -- use "e" modifier to utilize add_http function
$text = preg_replace("/(".$full_url_reg_ex.")|(".$common_tld_reg_ex.")|(^".$common_tld_reg_ex.")/ie", "'<a href=\"'.add_http('\\0').'\" target=\"_blank\">\\0</a>'", $text);

return $text;
}

so basically the function hyperlinks anything that looks like a url. the problem is that it hyperlinks the domain part of an email address as well. i actually want to hyperlink email addresses separately (using the mailto: syntax), so is there any way i can add something to the function that will treat email addresses separately? and once the email addresses are hyperlinked, how do i prevent the resulting code from being hyperlinked again by the function i pasted above?
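
A side note on the "e" modifier used in the function: it was deprecated in PHP 5.5 and removed in PHP 7.0, so on current PHP the same replacement has to go through preg_replace_callback. The sketch below is a rough port, not a fix for the email problem: it still links the domain part of an address. The add_http() body is an assumption, since the thread never shows it, and hyperlink_urls is a made-up name. The "!" delimiter is used because the character class contains "/" and "~".

```php
<?php
// Hypothetical stand-in for the add_http() helper the post calls but does not show.
function add_http($url) {
    return preg_match('~^https?://~i', $url) ? $url : 'http://' . $url;
}

// Rough port of the posted function to preg_replace_callback (no "e" modifier).
function hyperlink_urls($text) {
    // http:// or https:// + domain + any url permitted character
    $full_url = 'https?://[A-Za-z0-9.-]+[/A-Za-z0-9_.~?&=%#@;:,+-]*';
    // domain ending in a common TLD + optional path
    $common_tld = '[A-Za-z0-9.-]+\.(?:com|edu|org|gov|net|biz)'
                . '(?:/[/A-Za-z0-9_.~?&=%#@;:,+-]*)?';

    return preg_replace_callback(
        '!' . $full_url . '|' . $common_tld . '!i',
        function ($m) {
            // $m[0] is the whole matched url, like \\0 in the original
            return '<a href="' . add_http($m[0]) . '" target="_blank">' . $m[0] . '</a>';
        },
        $text
    );
}
```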

Zipper

6:05 am on Oct 27, 2009 (gmt 0)

10+ Year Member



Ok, so this code only checks for hyperlinks. To check for e-mails you can do something like:

$email_reg_ex = '/^([a-z0-9])(([-a-z0-9._])*([a-z0-9]))*\@([a-z0-9])(([a-z0-9-])*([a-z0-9]))+' . '(\.([a-z0-9])([-a-z0-9_-])?([a-z0-9])+)+$/i';
if(preg_match($email_reg_ex, $text)){
// it's an e-mail..
} else {
// check for valid url using your code
}

You might also want to check the filter_var [php.net] function to validate an e-mail address more easily.
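
For reference, a minimal sketch of the filter_var approach. The wrapper name is_email is made up for illustration. Note that FILTER_VALIDATE_EMAIL validates a whole string; it does not find addresses inside a longer post.

```php
<?php
// Returns true only when the entire string is a syntactically valid email address.
// is_email() is a hypothetical helper name.
function is_email($candidate) {
    return filter_var($candidate, FILTER_VALIDATE_EMAIL) !== false;
}
```

So it suits checking a single token (for example, one word split out of the post), not scanning the full text.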

mikelai

6:53 am on Oct 27, 2009 (gmt 0)

10+ Year Member



the problem is that $text might have multiple urls and emails in it. if i use an if / else statement, i will only be able to check for one or the other. is there any way i can run a function to convert both urls AND emails in the post?

i could run two different functions in sequence, but then i run into the problem i mentioned before -- that the domain part of the email address gets hyperlinked twice, resulting in garbled html code.
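
One way around the two-pass problem is a single pass: put the email pattern and the url pattern in one regular expression, email first, and let a callback decide which kind of link to build. Each stretch of text is then consumed exactly once, so nothing gets linked twice. The sketch below makes several assumptions: hyperlink_all is a made-up name, the email pattern is a deliberate simplification and not a full validator, and the url patterns are tidied versions of the ones posted above.

```php
<?php
// Sketch only: link emails and urls in one pass so neither is processed twice.
function hyperlink_all($text) {
    // simplified email pattern (assumption, not a full validator)
    $email = '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}';
    // http:// or https:// + domain + any url permitted character
    $full_url = 'https?://[A-Za-z0-9.-]+[/A-Za-z0-9_.~?&=%#@;:,+-]*';
    // bare domain ending in a common TLD + optional path
    $bare = '[A-Za-z0-9.-]+\.(?:com|edu|org|gov|net|biz)'
          . '(?:/[/A-Za-z0-9_.~?&=%#@;:,+-]*)?';

    // email comes first so that it wins wherever both could start a match
    $pattern = '!(?P<email>' . $email . ')|(?P<url>' . $full_url . '|' . $bare . ')!i';

    return preg_replace_callback($pattern, function ($m) {
        if ($m['email'] !== '') {
            return '<a href="mailto:' . $m['email'] . '">' . $m['email'] . '</a>';
        }
        // prepend http:// to bare domains
        $href = preg_match('!^https?://!i', $m['url']) ? $m['url'] : 'http://' . $m['url'];
        return '<a href="' . $href . '" target="_blank">' . $m['url'] . '</a>';
    }, $text);
}
```

Called on a post containing both "google.com" and "jonathan@something.com", this produces an http:// link for the first and a mailto: link for the second, with the domain inside the address left untouched.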