Forum Moderators: coopster

Missing something in a basic REGEX

Problem with slashes

         

henry0

2:34 pm on Jun 13, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I am trying to create a very basic filter for general text.
I suppose that if I disallow the basics (as done below) I should be pretty much OK.
However, I would also like to add / and \, but they are special characters: if I escape a \ with \
I get an error. Is there any workaround?
<?
$str="<a]a<aa>";
if (eregi("\]¦\[¦\{¦\}¦\<¦\>",$str) )
{
echo"NO";
}
else{echo"ok";}
?>
as we know pipe in WebmasterWorld shows as ¦
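
For readers on current PHP, where eregi() no longer exists (it was removed in PHP 7), here is a minimal sketch of the same check using preg_match() with / and \ added. The function name hasForbiddenChar is hypothetical and not from the thread. It assumes a single-quoted pattern, in which a literal backslash has to be written as four backslashes: PHP turns them into two, and the regex engine reads those two as one.

```php
<?php
// Hypothetical helper, not part of the original post.
function hasForbiddenChar(string $str): bool
{
    // '\\\\' in a single-quoted PHP string reaches the regex engine as \\,
    // which matches one literal backslash. Using ~ as the delimiter means
    // the forward slash needs no escaping.
    return preg_match('~[\]\[{}<>/\\\\]~', $str) === 1;
}

echo hasForbiddenChar('<a]a<aa>') ? "NO" : "ok";
```

A character class is used here instead of alternation with pipes, so each forbidden character is listed once inside the brackets.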

henry0

3:17 pm on Jun 13, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, I do not get it.
My text editor adds <span> etc.,
but I am not disallowing < and >.
Regardless, I cannot insert the text,
so I tried adding the following on top of the first regex,
and it works: it allows the editor included in my CMS to add span
and still catches what it should catch.

if(!eregi("<¦>"))
if (eregi("\]¦\[¦\{¦\}",$dir_txt_main) )

What's happening here?

grandpa

3:18 pm on Jun 13, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



comment deleted.

<backing up to punt> :)

jdMorgan

3:30 pm on Jun 13, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Do group alternatives in [] work?

Something like:
if (eregi("[\][()<>/\\]",$str) )

(Not sure I escaped all the chars that require it)

Jim

henry0

4:40 pm on Jun 13, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks, jdMorgan,
but even with the missing special characters escaped,
it lets the disallowed characters pass through.

I just realized that, without meaning to, I used an unfinished regex (the top one) that does the job :) even though it is never given the string to test:
if(!eregi("<¦>"))

I would like to understand what is going on.

jdMorgan

5:47 pm on Jun 13, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I can't claim expertise in PHP regex, although it's very close to Perl and mod_rewrite. But one thing to bear in mind when designing 'security-related' code is that you should design it to *allow* only what you want to allow, rather than to disallow what you want to block.

The reason for this is that it is far less likely that an error will remain unnoticed if you specifically *allow* only certain characters. If you forget something while doing this, the consequences are far less likely to create a security vulnerability or go unnoticed.

In other words, start with accepting only "[\w\-]" and add to that if necessary.

Jim
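
The allow-only approach Jim describes might look like the sketch below. The function name isAllowed is hypothetical, and preg_match() is assumed in place of eregi(). The pattern is anchored at both ends so that every character in the string, not just one of them, has to belong to the class.

```php
<?php
// Hypothetical helper illustrating an allow-list check.
function isAllowed(string $str): bool
{
    // \A and \z anchor the match to the whole string; [\w\-] accepts
    // letters, digits, underscore and hyphen, and nothing else.
    return preg_match('/\A[\w\-]+\z/', $str) === 1;
}

var_dump(isAllowed('my-file_name01'));
var_dump(isAllowed('name<script>'));
```

Extending the allowed set then means adding characters to the class, and anything forgotten is rejected rather than silently accepted.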

henry0

6:01 pm on Jun 13, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks, I am no regex expert (understatement), but your concept is a good rule of thumb.

henry0

9:47 pm on Jun 13, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Mea culpa!
Error display was turned off in php.ini (production server side),
which is why I did not get any errors.
I sorted it out by checking on my test bed.

So I still have the question: how do I combine
if(!eregi("<¦>",$dir_txt_main))
and
if (eregi("\]¦\[¦\{¦\}",$dir_txt_main) )

coopster

1:50 am on Jun 18, 2006 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



For clarification, can you explain the character list that you DO or DO NOT want to allow?

henry0

5:50 pm on Jun 18, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Sorry, I did not get the auto-notification.
Allowed:
<, >, @, /, [A-Za-z0-9], and any text punctuation: , . ; "
math characters: + - = :
and only these special characters: $, %, !, ?
Any of the above, in any order, are OK.

Specifically disallowed: {} [] and ¦ (pipe) and #, *, ^, ~

jdMorgan makes a lot of sense, even if it would be easier (at least for me!) to simply disallow {} [] and ¦ (pipe) and #, *, ^, ~
I was not able to build it the way it is supposed to be.

Thanks
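
For comparison only, the "simply disallow" version of that list could be sketched as follows. The function name hasDisallowedChar is hypothetical, preg_match() is assumed rather than eregi(), and the pipe is written as a real | since ¦ is only how the forum displays it. As noted earlier in the thread, a deny-list like this silently accepts any character that was left out of it.

```php
<?php
// Hypothetical deny-list helper covering { } [ ] | # * ^ ~
function hasDisallowedChar(string $str): bool
{
    // Inside a character class the brackets are escaped, and the caret is
    // kept away from the first position so it does not negate the class.
    return preg_match('/[{}\[\]|#*~^]/', $str) === 1;
}

var_dump(hasDisallowedChar('price: $5 + 10% <b>ok</b>?'));
var_dump(hasDisallowedChar('x^2'));
```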

coopster

3:58 pm on Jun 19, 2006 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Building on what jd suggested, you can simply check for the good characters by negating the character class. If you get a hit, you know you have a bad character. The very first character of a character class must be the caret (^) if you want to negate the entire class (see the preg_match() expression below for an example).
$string = 'This is a good string, including these chars, <>@/,. ;+-=:$%!?';
//$string = 'This is a bad string, including these chars, {}[]¦#*^~';
// Escape regex metacharacters (and the / delimiter) in the allowed list
$allowedChars = preg_quote('<>@/,. ;+-=:$%!?', '/');
// The leading ^ negates the class, so a match means a disallowed character
if (preg_match("/[^A-Za-z0-9$allowedChars]/", $string)) {
    print 'string is NOT OK!';
} else {
    print 'string is OK!';
}
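
As a possible extension of the same idea, the negated class can also report which characters were rejected, which helps when telling a user what to fix. This is only a sketch: the function name findBadChars is hypothetical, and it assumes single-byte input because the pattern is used without the u modifier.

```php
<?php
// Same allowed list as above, escaped for use inside a character class.
$quoted  = preg_quote('<>@/,. ;+-=:$%!?', '/');
$pattern = '/[^A-Za-z0-9' . $quoted . ']/';

// Hypothetical helper: returns each rejected character once, in order of
// first appearance.
function findBadChars(string $str, string $pattern): array
{
    preg_match_all($pattern, $str, $matches);
    return array_values(array_unique($matches[0]));
}

print_r(findBadChars('bad {string} #1', $pattern));
```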

henry0

8:20 pm on Jun 19, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That I can understand :)

I like the $allowedchars.
And the use of preg_quote() that I never used before

Thanks again