Trying to detect non-alphanumeric characters in a string

using preg_match

         

MrSpeed

3:05 am on Aug 2, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have a column in a database that I want to display only if it contains letters, numbers and dashes. I don't want to display the name if it contains characters like "©", ">", accented letters, umlauts, etc.

I have tried a few variants of the following but nothing seems to work.


if(!preg_match('[^A-Za-z0-9]+', $line[name])){
print $line[name];
}

Any ideas what's wrong?

Thanks,

dcrombie

5:03 am on Aug 2, 2004 (gmt 0)



I think you're being too negative ;)

if(preg_match('/[a-z0-9]+/i', $row)){  
print $row;
}

coopster

1:51 pm on Aug 2, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



LOL. Actually, although you are being too negative, that's not what caused your issue. Being "two negative" (negating the match with ! and negating the character class with ^) returns the correct results, but only when the expression is enclosed in pattern delimiters [php.net], such as the forward slash (/) that dcrombie has shown in the example.
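
To illustrate the point about delimiters, here is a minimal sketch of the double-negative check with the pattern wrapped in forward slashes. The function name is_plain_name is hypothetical, and the allowed set (ASCII letters, digits and dashes) is taken from the original question.

```php
<?php
// Hypothetical helper: true when $name contains no character outside A-Z, a-z, 0-9 and "-".
function is_plain_name(string $name): bool
{
    // The pattern is enclosed in / delimiters. The class is negated with ^,
    // so any match means a disallowed character was found.
    // The hyphen is placed last in the class so it is taken literally.
    return preg_match('/[^A-Za-z0-9-]/', $name) === 0;
}
```

Note that an empty string passes this check, since it contains no disallowed characters.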

MrSpeed

2:09 pm on Aug 2, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yeah but don't two wrongs make a right? :)

I think I've got it now. My strings had characters like "/" and "-", so my regex looks like


if(preg_match("/[^a-zA-z0-9 -\/]/i", $line[category])){
print "MATCH3".$line[category];
}
else{
print "NOT";
}

Thanks

coopster

2:28 pm on Aug 2, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Remember, as per the documentation in the link given, any character can be used as a delimiter as long as it's not alphanumeric or a backslash (\). If the delimiter character has to be used in the expression itself, it needs to be escaped with a backslash (as you have already demonstrated).

Don't forget that PHP offers us preg_quote [php.net] as well.
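
As a rough sketch of how preg_quote can help here: passing the delimiter as the second argument makes it escape that character along with the usual regex metacharacters. The helper name contains_literal is made up for this example.

```php
<?php
// Hypothetical helper: checks whether $literal occurs in $haystack,
// treating $literal as plain text rather than as a regex.
function contains_literal(string $haystack, string $literal): bool
{
    // The second argument tells preg_quote to escape the "/" delimiter too.
    $pattern = '/' . preg_quote($literal, '/') . '/';
    return preg_match($pattern, $haystack) === 1;
}
```

For a plain substring test strpos would be simpler; preg_quote is mainly useful when the literal text is only one part of a larger pattern.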

dcrombie

2:41 pm on Aug 2, 2004 (gmt 0)



MrSpeed you might have a problem with that regex (above). Within the [] if you want a - to be treated as an actual - it has to be at the start of the pattern (or after the ^ in this case).

MrSpeed

4:57 pm on Aug 2, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>MrSpeed you might have a problem with that regex (above). Within the [] if you want a - to be treated as an actual - it has to be at the start of the pattern (or after the ^ in this case).

You may be right. At first it seemed to work fine but using an online tester it won't match if the string is
"/d/jo*ne-doe"

I'll have to play around a bit with your suggestion. In a nutshell, I want to display a field only if it contains nothing but letters, numbers, dashes and maybe spaces.
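
One way to express that requirement is a positive, anchored whitelist instead of a negated class. This is only a sketch: the function name is_displayable is hypothetical, and it assumes that an empty field should not be displayed.

```php
<?php
// Hypothetical helper: true only when every character of $value is
// an ASCII letter, a digit, a dash or a space (and $value is not empty).
function is_displayable(string $value): bool
{
    // \A and \z anchor the match to the whole string.
    // The hyphen sits last in the class, so it cannot form a range.
    return preg_match('/\A[A-Za-z0-9 -]+\z/', $value) === 1;
}
```

Without the anchors, the pattern would match any string that contains at least one allowed character, which is not the same test.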

coopster

5:11 pm on Aug 2, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



>>You may be right.

Very much right. The hyphen indicates character range, just like in the earlier part of your pattern, a-z. So, in your current pattern, you are telling it to match any letter from a to z, any number from 0 to 9, and any character between space and the slash. Have a look at the ascii table [asciitable.com] and see that you will allow plenty of other characters within that range, including !"#$%&'()*+,-. My tests have shown that the hyphen doesn't necessarily have to be at the beginning of your pattern, but it sure does keep things tidier for you and helps reduce the possibility of you incorporating the hyphen in a range. Both work:

"/[^a-z0-9- \/]/i"
or
"/[^-a-z0-9 \/]/i"

Update:

Found this in the manual...

If a minus character is required in a class, it must be escaped with a backslash or appear in a position where it cannot be interpreted as indicating a range, typically as the first or last character in the class.
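
The following sketch compares the accidental range with the hyphen placements the manual describes, using the test string from earlier in the thread. The variable names are only illustrative.

```php
<?php
$input = '/d/jo*ne-doe';

// Accidental range: " -\/" spans space (0x20) through slash (0x2F),
// so "*" counts as allowed and the negated class finds nothing.
$ranged = '/[^a-z0-9 -\/]/i';

// Hyphen first, last, or escaped: it cannot be read as a range.
$first   = '/[^-a-z0-9 \/]/i';
$last    = '/[^a-z0-9 \/-]/i';
$escaped = '/[^a-z0-9\- \/]/i';

foreach ([$ranged, $first, $last, $escaped] as $pattern) {
    // preg_match returns 1 when a disallowed character is found, 0 otherwise.
    echo $pattern, ' => ', preg_match($pattern, $input), "\n";
}
```

Only the first pattern reports 0 for this input; the other three flag the "*" and report 1.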