I want to sanitize email addresses before inserting them into the database. Some spammers send illegal characters, and I want to strip those from the addresses first. I'd like to use preg_replace, but I can't work out the right regex for this.
The goal is to remove spaces and all other special characters except underscore "_" and dot ".".
Some of the emails I got from spammers look like this:
user name@domain.com (space in user name)
User:name";@domain.com (illegal characters etc.)
Please help me write the correct regex to use with preg_replace for this.
Thanks in advance.
$bad_chars = array("!","#","$","%","^","&","*","(",")","{","}",":",";","'","\","/",">","<","~","`","¦"," ");
$email = str_replace($bad_chars, "", $email);
eelix
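For the preg_replace version the original post asked for, a whitelist tends to be shorter than a list of bad characters. This is only a minimal sketch: it assumes the allowed set is letters, digits, "@", "_" and ".", and the function name purify_email is made up for illustration. Note that this set also strips hyphens, which are legal in domain names, so it may need widening.

```php
<?php
// Hypothetical helper: keep only letters, digits, "@", "_" and ".".
// The allowed set is an assumption based on the question; widen it
// (for example with "-") if real addresses need more characters.
function purify_email(string $email): string
{
    return preg_replace('/[^A-Za-z0-9@_.]/', '', $email);
}

echo purify_email('user name@domain.com'), "\n";    // username@domain.com
echo purify_email('User:name";@domain.com'), "\n";  // Username@domain.com
```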
Or, even better, from our library:
[webmasterworld.com...]
I also wanted to mention that changing user data isn't the preferred method. You should go back to the user if any character is out of range:
you take the email,
test it against a pattern,
and if there is no match, send it back to the user for correction.
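The take, test, send-back flow could look roughly like this. It is a sketch, not the library code linked above: it swaps in PHP's built-in filter_var with FILTER_VALIDATE_EMAIL as the pattern test, and the helper name is_acceptable_email is hypothetical.

```php
<?php
// Hypothetical helper: true when PHP's built-in validator accepts the address.
function is_acceptable_email(string $email): bool
{
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

$email = 'user name@domain.com';
if (is_acceptable_email($email)) {
    echo "ok, store it\n";
} else {
    echo "rejected, send back to the user for correction\n";
}
```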
In fact, I don't want to return to the user.
I have written PHP/XML code for an existing ColdFusion module, which hits my code 25000 times an hour. My script runs in a loop to check the emails: we strip the illegal characters from the addresses and put them in a special array, and after the loop ends I pass that array to the ColdFusion script. That script has a handle to the email senders, so it replies asking them to check their email syntax. I'm not very familiar with what ColdFusion does once it gets these emails, but I know my job is to strip the illegal symbols, collect the addresses in an array, and pass it to the ColdFusion script.
Anyway, thanks for the links on regex.
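The loop described above might be sketched as follows. Everything here is an assumption for illustration: the function name collect_cleaned, the array layout, and the allowed character set (letters, digits, "@", "_" and "."). How the result reaches the ColdFusion script is left out because the post does not say.

```php
<?php
// Hypothetical helper: returns one entry per address that had to be
// cleaned, keeping both forms so another script can notify the sender.
function collect_cleaned(array $emails): array
{
    $flagged = [];
    foreach ($emails as $email) {
        // Assumed whitelist: letters, digits, "@", "_" and ".".
        $clean = preg_replace('/[^A-Za-z0-9@_.]/', '', $email);
        if ($clean !== $email) {
            $flagged[] = ['original' => $email, 'cleaned' => $clean];
        }
    }
    return $flagged;
}
```

Addresses that were already clean are skipped, so the array handed on only contains the ones whose senders need a reply.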
After skimming through some light reading [ietf.org], it seemed that the only characters not allowed in the name portion of an email address are the at sign "@", colon ":", comma ",", and space " ". (In reality, the username rules of the mail systems themselves probably impose varying additional restrictions.)
I recently had a customer with an ampersand in their email address (think john&jane@smith.com), and nearly every application and script in our workflow had to be changed (each in a different way) to accommodate an unusual but allowed character.
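A small sketch of why that hurts: stripping an allowed character does not raise an error, it quietly produces a different address. The variable names are illustrative, and filter_var is used here only as one available validity check.

```php
<?php
$original = 'john&jane@smith.com';

// PHP's built-in validator accepts "&" in the name portion...
$valid = filter_var($original, FILTER_VALIDATE_EMAIL) !== false;

// ...while a blacklist that contains "&" rewrites the address.
$stripped = str_replace('&', '', $original);

var_dump($valid);
echo $stripped, "\n";  // johnjane@smith.com
```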
So I think it would be a bad idea to accommodate very rare cases, like people using these characters in their email addresses, and take a real gamble on getting hurt.
$bad_chars = array("!","#","$","%","^","&","*","(",")","+","=","[","]","{","}","¦",":","<",">","?","/","\\","~","`");
This should work. I hadn't escaped the '\' in my earlier version.
eelix