<?php
$content = 'Some text with an email@none.com and maybe another address@somewhere.org.';
echo findEmails($content);
function convertAsciiEmail($email)
{
$obfuscatedEmail='';//declare variable
$length = strlen($email);
for ($i = 0; $i < $length; $i++)
{
$obfuscatedEmail .= "&#" . ord($email[$i]); // creates ASCII HTML entity
}
$return = '<a href="mailto:' . $obfuscatedEmail . '">'.$obfuscatedEmail.'</a>';
return $return;
}
function findEmails($string)
{
$pattern = "[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})";
//$pattern = "/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/";
//$pattern = '/^[^\W][a-zA-Z0-9_]+(\.[a-zA-Z0-9_]+)*\@[a-zA-Z0-9_]+(\.[a-zA-Z0-9_]+)*\.[a-zA-Z]{2,4}$/';
//$pattern = '[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})';
//$pattern = '^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$';
preg_match_all($pattern, $string, $split);
foreach ($split[0] as $value)
{
$email_to_find = $value[0];
$string = eregi_replace($email_to_find,convertAsciiEmail($value[0]),$string);
}
return $string;
}
?>
You can see above the various regex variations I have tried in order to locate the email addresses that may be in my $content. At this point I am not even sure whether the problem is in the regex string I am using or somewhere else entirely. THANKS!
The current error message is:
Warning: preg_match_all() [function.preg-match-all]: Unknown modifier '+' in /htdocs/stuff/email-test.php on line 26
Warning: Invalid argument supplied for foreach() in /htdocs/stuff/email-test.php on line 27
<?php
function get_emails ($str)
{
$emails = array();
preg_match_all("/\b\w+\@\w+[\.\w+]+\b/", $str, $output);
foreach($output[0] as $email) array_push ($emails, strtolower($email));
if (count ($emails) >= 1) return $emails;
else return false;
}
$str = 'Some text with an email@none.com and maybe another address@somewhere.org.';
$emails = get_emails ($str);
function convertAsciiEmail($email)
{
$obfuscatedEmail='';//declare variable
$length = strlen($email);
for ($i = 0; $i < $length; $i++)
{
$obfuscatedEmail .= "&#" . ord($email[$i]); // creates ASCII HTML entity
}
$return = '<a href="mailto:' . $obfuscatedEmail . '">'.$obfuscatedEmail.'</a>';
return $return;
}
//print_r ($emails);
foreach($emails as $no => $email)
{
echo convertAsciiEmail($email)."<br />";
}
?>
function findEmails($string)
{
$pattern = "[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})";
preg_match_all($pattern, $string, $split);
foreach ($split[0] as $value)
{
$email_to_find = $value[0];
$string = eregi_replace($email_to_find,convertAsciiEmail($value[0]),$string);
}
return $string;
}
I think the problem is with your regex.
I reworked your code just picking it apart until I could get values to start showing up.
Try this and I bet it will point you in the right way.
#==========================================================
<?php
$content = 'Some text with an email@none.com and maybe another address@somewhere.org.';
$content .= 'Some text with an email@none.com and maybe another address2@somewhere2.org.';
echo '<br><b>output from findEmails: </b>'.findEmails($content);
echo '<hr>';
function findEmails($string)
{
echo '<br><b>$string = </b>'.$string;
$pattern = "|(.+?)@|";
#$pattern = "[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})";
//$pattern = "/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/";
//$pattern = '/^[^\W][a-zA-Z0-9_]+(\.[a-zA-Z0-9_]+)*\@[a-zA-Z0-9_]+(\.[a-zA-Z0-9_]+)*\.[a-zA-Z]{2,4}$/';
//$pattern = '[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})';
//$pattern = '^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$';
echo '<br><b>$pattern = </b>'.$pattern.' STOP';
preg_match_all($pattern, $string, $split, PREG_PATTERN_ORDER);
echo '<br><b>$split[0] = </b>'.print_r($split[0], true); // true makes print_r return the text instead of printing it
foreach ($split[0] as $value)
{
echo '<hr><b>$value = </b>'.$value.' STOP';
$email_to_find = $value[0];
$string = eregi_replace($email_to_find,convertAsciiEmail($value[0]),$string);
}
return $string;
}
function convertAsciiEmail($email)
{
echo $email;
$obfuscatedEmail='';//declare variable
$length = strlen($email);
for ($i = 0; $i < $length; $i++)
{
$obfuscatedEmail .= "&#" . ord($email[$i]); // creates ASCII HTML entity
}
$return = '<a href="mailto:' . $obfuscatedEmail . '">'.$obfuscatedEmail.'</a>';
return $return;
}
#==========================================================
I wasn't getting anything to show up as a $value until I made the $pattern just very simple.
Then it started printing those out.
My suggestion is to just start from that and then add to your regex until it does what you want.
Then just take out all that junky debug stuff I added in there. :)
Hope that helps a little.
-Jeff
Wrap the pattern in delimiters when you call preg_match_all:
preg_match_all("/$pattern/", $string, $split);
and you should be OK. Explanation: preg_match_all takes the first character of your pattern to be the delimiter. In the original code the first character is [, so it assumes the next ] ends your pattern, treats everything after that (starting with the +) as pattern modifiers, and chokes on the +.
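To see the delimiter behaviour described above in isolation, here is a minimal sketch. The variable names ($bare, $bad, $count) are made up for the illustration, and the @ operator is only there to silence the warning so the return value can be inspected.

```php
<?php
// Sketch only: the same address pattern, once without and once with delimiters.
$bare = "[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})";
$text = 'Some text with an email@none.com and maybe another address@somewhere.org.';

// Undelimited: "[" opens the pattern, the first "]" closes it, and the "+"
// that follows is read as a modifier. The call fails and returns false.
$bad = @preg_match_all($bare, $text, $ignored);

// Delimited with "/": the whole expression is treated as the pattern.
$count = preg_match_all("/$bare/", $text, $matches);

var_dump($bad);        // failure value of the undelimited call
var_dump($count);      // number of addresses found
print_r($matches[0]);  // the full matches
```

Any non-alphanumeric, non-backslash character can serve as the delimiter, so pick one that does not occur inside the pattern itself.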
BTW, you can replace all the _a-z0-9 by \w.
BTW2, you don't need to call eregi_replace to make the substitutions. You can call str_replace, which is faster and safer since nothing in your string can be treated as a special character.
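A small sketch of why the literal replacement is the safer choice. The address first.last+news@example.com and the placeholder [address] are invented for the example. Note also that eregi_replace was removed in PHP 7.0, so on current PHP the choice is between str_replace and the preg_ functions anyway.

```php
<?php
// Hypothetical input: the address contains "." and "+", both regex metacharacters.
$text  = 'Write to first.last+news@example.com today.';
$email = 'first.last+news@example.com';

// str_replace searches for the string literally, so "." and "+" are plain characters.
$literal = str_replace($email, '[address]', $text);

// Used as a regex without escaping, "t+" means "one or more t", so the
// address no longer matches itself.
$unquoted = preg_match('/' . $email . '/', $text);

// If a regex replace is really needed, preg_quote() escapes the metacharacters first.
$quoted = preg_replace('/' . preg_quote($email, '/') . '/', '[address]', $text);

echo $literal, "\n";
var_dump($unquoted);
echo $quoted, "\n";
```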
I have one additional question/problem that I would love some help with. It will not be a problem for me with my current CMS, but I could see it being a problem for other people. If the incoming string already contains one or more mailto links (labeled either with custom text or with an email address), this script messes it all up. I have included an example below. I will simply not allow mailto links on the CMS side of my script, but I thought that if anyone wants to tackle this, it could be very helpful to the community (and to me). Any thoughts on stripping them automatically before processing with these functions?
<?php
$content = 'Some text with an email@none.com and other what@ever.org text. Add a <a href="mailto:mailto@something.com">CLICK HERE</a> link. Start@something.net with an email address. Or end with an email@address.com.';
echo findEmails($content);
function findEmails($string)
{
$pattern = "[_a-zA-Z0-9-]+(\.[_a-zA-Z0-9-]+)*@[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*(\.[a-zA-Z]{2,4})";
preg_match_all("/$pattern/", $string, $matches);
foreach ($matches[0] as $value)
{
$MatchedEmail = $value;
//echo 'FOUND: '.$MatchedEmail.'<br />';//uncomment for debugging
$string = str_replace($MatchedEmail,convertAsciiEmail($MatchedEmail),$string);
}
return $string;
}
function convertAsciiEmail($email)
{
$obfuscatedEmail='';//declare variable
$length = strlen($email);
for ($i = 0; $i < $length; $i++)
{
$obfuscatedEmail .= "&#" . ord($email[$i]); // creates ASCII HTML entity
}
$return = '<a href="mailto:' . $obfuscatedEmail . '">'.$obfuscatedEmail.'</a>';
return $return;
}
?>
<?php
$content = 'Some text with an email@none.com and other what@ever.org text. Add a <a href="mailto:mailto@link.com">mailto@link.com</a> link. Start@something.net with an email address. Or end with an email@address.com.';
// Example of use
//plain ASCII conversion of email addresses
$Output = findEmails($content, 0, 0);
echo $Output.'<br /><br /><br />'."\n\n";
//Convert ASCII and add mailto links
$Output = findEmails($content, 1, 0);
echo $Output.'<br /><br /><br />'."\n\n";
//Convert ASCII and add mailto links using JavaScript (default)
$Output = findEmails($content, 1, 1);
echo $Output.'<br /><br /><br />'."\n\n";
function findEmails($string, $MakeLink=1, $UseJavaScript=1)
{
//NOTE: if UseJavaScript is turned on (1), all email addresses are forced as links (regardless of "MakeLink" value)
$pattern = "[_a-zA-Z0-9-]+(\.[_a-zA-Z0-9-]+)*@[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*(\.[a-zA-Z]{2,4})";
preg_match_all("/$pattern/", $string, $matches);
foreach ($matches[0] as $value)
{
$MatchedEmail = $value;
//echo 'FOUND: '.$MatchedEmail.'<br />';//uncomment for debugging
$AsciiEmail = convertToAscii($MatchedEmail);
if($UseJavaScript==1)
{
$NewEmail = jsEmail($MatchedEmail);
}
elseif($MakeLink==1)
{
$NewEmail = '<a href="mailto:' . $AsciiEmail . '">'.$AsciiEmail.'</a>';
}
else
{
$NewEmail = $AsciiEmail;
}
$string = str_replace($MatchedEmail,$NewEmail,$string);
}
return $string;
}
function jsEmail($RawEmail)
{
// We split username and domain into separate strings - otherwise the bot will have no trouble finding the email address
// Split the email into user name and domain
list($user, $domain) = explode('@', $RawEmail);
// Form the href attribute
$mailtouser = "mailto:$user";
$user = convertToAscii($user);//plain user
$domain = convertToAscii($domain);//plain domain
$mailtouser = convertToAscii($mailtouser);//href user (mailto added)
// Generate output
$output = <<<EOT
<script>
document.write('<a href="$mailtouser' + '@');
document.write('$domain' + '"');
document.write('>$user' + '@');
document.write('$domain</a>');
</script>
EOT;
return $output;
}
function convertToAscii($text){
$output = '';
for($i = 0; $i < strlen($text); $i++)
{
$output .= '&#' . ord($text[$i]) . ';';
}
return $output;
}
?>
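One thing to watch in the loop above: str_replace runs once per match against the whole string, so when one address is the tail of a longer one (say a@b.com and xa@b.com), the first pass also rewrites part of the second. A possible alternative, offered only as a sketch, is to match and replace in a single pass with preg_replace_callback. The function name encodeEmails is hypothetical and not from the thread. The pattern is the thread's own with \w substituted, and convertToAscii is copied from the script above.

```php
<?php
// Copied from the script above: turn every character into a numeric entity.
function convertToAscii($text)
{
    $output = '';
    for ($i = 0; $i < strlen($text); $i++) {
        $output .= '&#' . ord($text[$i]) . ';';
    }
    return $output;
}

// Hypothetical helper: each match is replaced exactly once, in place,
// so overlapping addresses cannot interfere with each other.
function encodeEmails($string)
{
    $pattern = '/[\w-]+(\.[\w-]+)*@[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*\.[a-zA-Z]{2,4}/';
    return preg_replace_callback($pattern, function ($m) {
        return convertToAscii($m[0]); // $m[0] is the full matched address
    }, $string);
}

echo encodeEmails('Contact a@b.com or xa@b.com.');
```

The link and JavaScript variants could be produced the same way by changing what the callback returns.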
[edited by: coopster at 1:09 pm (utc) on June 26, 2009]
[edit reason] no urls please TOS [webmasterworld.com] [/edit]
I wrote the following regex:
(<a href="mailto:)([\w-\.]+@([\w-]+\.)+[\w-]{2,4})(">)[\._a-zA-Z0-9@-]*(</a>)
I tested it against the following string:
<a href="mailto:test@none.com">test@none.com</a> and it was found successfully.
I tried to insert this regex into the following function, but it bombs out:
function stripMailtoLink($string)
{
$pattern = '(<a href="mailto:)([\w-\.]+@([\w-]+\.)+[\w-]{2,4})(">)[\._a-zA-Z0-9@-]*(</a>)';
preg_match_all("/$pattern/", $string, $matches);
foreach ($matches[0] as $value)
{
$MatchedLinkString = $value[0];
$MatchedEmail = $value[2]; //i think this should be correct
echo 'FOUND: '.$MatchedLinkString.'<br />';//debugging
}
$output = 'done';//debugging
return $output;
}
HELP! :-) Thanks!
Done. The PHP script below takes an incoming string of text, finds all email addresses, converts them to ASCII character codes (HTML numeric entities), and writes them into the document using JavaScript. Feel free to use this in your own projects. Good luck.
<?php
$content = 'Some text with an email@none.com and other what@ever.org text. Add a <a href="mailto:test@none.com">TEST</a> link. Include another <a href="mailto:link@example.net">link@example.net</a> address. Start@something.net with an email address. Or end with an email@address.com.';
// Example of use
//plain ASCII conversion of email addresses
$Output = findEmails($content, 0, 0);
echo $Output.'<br /><br /><br />'."\n\n";
//Convert ASCII and add mailto links
$Output = findEmails($content, 1, 0);
echo $Output.'<br /><br /><br />'."\n\n";
//Convert ASCII and add mailto links using JavaScript (default)
$Output = findEmails($content, 1, 1);
echo $Output.'<br /><br /><br />'."\n\n";
function findEmails($string, $MakeLink=1, $UseJavaScript=1)
{
//NOTE: if UseJavaScript is turned on (1), all email addresses are forced as links (regardless of "MakeLink" value)
//start by removing any existing mailto links so we can work with plain text
$string = stripMailtoLink($string);
$pattern = "[_a-zA-Z0-9-]+(\.[_a-zA-Z0-9-]+)*@[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*(\.[a-zA-Z]{2,4})";
preg_match_all("/$pattern/", $string, $matches);
foreach ($matches[0] as $value)
{
$MatchedEmail = $value;
//echo 'FOUND: '.$MatchedEmail.'<br />';//uncomment for debugging
$AsciiEmail = convertToAscii($MatchedEmail);
if($UseJavaScript==1)
{
$NewEmail = jsEmail($MatchedEmail);
}
elseif($MakeLink==1)
{
$NewEmail = '<a href="mailto:' . $AsciiEmail . '">'.$AsciiEmail.'</a>';
}
else
{
$NewEmail = $AsciiEmail;
}
$string = str_replace($MatchedEmail,$NewEmail,$string);
}
return $string;
}
function jsEmail($RawEmail)
{
// We split username and domain into separate strings - otherwise the bot will have no trouble finding the email address
// Split the email into user name and domain
list($user, $domain) = explode('@', $RawEmail);
// Form the href attribute
$mailtouser = "mailto:$user";
$user = convertToAscii($user);//plain user
$domain = convertToAscii($domain);//plain domain
$mailtouser = convertToAscii($mailtouser);//href user (mailto added)
// Generate output
$output = <<<EOT
<script>
document.write('<a href="$mailtouser' + '@');
document.write('$domain' + '"');
document.write('>$user' + '@');
document.write('$domain</a>');
</script>
EOT;
return $output;
}
function convertToAscii($text){
$output = '';
for($i = 0; $i < strlen($text); $i++)
{
$output .= '&#' . ord($text[$i]) . ';';
}
return $output;
}
function stripMailtoLink($string)
{
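// "|" is the delimiter because the pattern itself contains "/" (in "</a>"); the hyphen sits last in [\w.-] so it is read literally
$pattern = '|(<a href="mailto:)([\w.-]+@([\w-]+\.)+[\w-]{2,4})(">)[\._a-zA-Z0-9@-]*(</a>)|';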
preg_match_all($pattern, $string, $matches, PREG_SET_ORDER);
foreach ($matches as $value)
{
$MatchedLinkString = $value[0];
$MatchedEmail = $value[2];
$string = str_replace($MatchedLinkString,$MatchedEmail,$string);
}
return $string;
}
?>
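For anyone comparing this stripMailtoLink with the earlier attempt that failed, the sketch below isolates the two differences as I understand them: the delimiter and the match ordering. The # delimiter and the variable names are my own choices for the illustration. I have also written the first character class as [\w.-], since some newer PCRE builds reject a hyphen placed between \w and another character.

```php
<?php
$html = 'Add a <a href="mailto:test@none.com">TEST</a> link.';
$body = '(<a href="mailto:)([\w.-]+@([\w-]+\.)+[\w-]{2,4})(">)[\._a-zA-Z0-9@-]*(</a>)';

// With "/" as delimiter the pattern ends at the "/" inside "</a>", and the
// leftover "a>)" is read as modifiers, so the call fails and returns false.
$bad = @preg_match_all('/' . $body . '/', $html, $unused);

// Default PREG_PATTERN_ORDER: $byGroup[0] is a list of full-match strings,
// so $byGroup[0][0][2] is just the third character of the first match.
preg_match_all('#' . $body . '#', $html, $byGroup);

// PREG_SET_ORDER: one array per match. Index 0 is the whole link,
// index 2 is the address captured by the second group.
preg_match_all('#' . $body . '#', $html, $bySet, PREG_SET_ORDER);

var_dump($bad);
echo $byGroup[2][0], "\n"; // the address in pattern order
echo $bySet[0][0], "\n";   // the full link in set order
echo $bySet[0][2], "\n";   // the address in set order
```

Note that the link text class [\._a-zA-Z0-9@-]* does not allow spaces, so a link labeled with several words would not be matched by this pattern.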