PHP Peer Code Review

Forum Moderators: coopster

Message Too Old, No Replies

PHP Peer Code Review

security on my mind

jatar_k

5:56 pm on Jun 16, 2005 (gmt 0)

so we start with a library post of mine, I don't get offended, I know there are problems with this and things that I shouldn't have recommended but, hey we all get lazy.

I will not allow anyone to recommend anyone else's code for review but you are welcome to recommend any of mine or your own. Once we analyze a given thread or code snippet we can move onto another piece of posted code.

I put this one up first
[webmasterworld.com...] - specifically msg 3

who wants to take a stab at the things I did wrong here and possibly suggest what I could have done differently.

Reference for this thread
PHP Security [webmasterworld.com]

Timotheos

7:50 pm on Jun 16, 2005 (gmt 0)

First of all I just wanted to say how amazing you are to put your code up for review and criticism. Thanks, and this first one is a classic. Putting user input straight into your sql statement without first verifying it.

$sql = "select * from usertable where username='" . $_POST['username'] . "'";
$result = mysql_query($sql);

I can see this better now that I understand SQL injection attacks
[webmasterworld.com...]

Something to prevent it
[webmasterworld.com...]

bennymack

7:52 pm on Jun 16, 2005 (gmt 0)

I happen to have quite an interest in managing and protecting user login information and for one, I am all for encrypting passwords before they are even sent over the network. While it's a little outside the scope of the code you have posted, it can easily be done with freely available javascripts.

The thought of encrypting a user's password


$userpass = md5($_POST['userpass']);

give me shudders. I know we're assuming HTTPS but it's always best to have several layers of protection. Having $_POST['userpass'] be a client side MD5 hash that was sent over HTTPS, now that's something I can live with. That way, you don't receive a copy of their password at any point.

I have a password management site based on this scheme and it works quite well.

I think what you're looking for though is checking the input of those POST form variables. I supposed, although I'm not positive because I develop primarily in mod_perl, someone could inject SQL statements into them since they do not escape single or double quotes.

For instance, if there were a way to botch the query for the user login, so that


mysql_num_rows($result) == 1

They could conceivably be logged in without a correct user/pass.

jatar_k

8:30 pm on Jun 16, 2005 (gmt 0)

I thought people might be scared of offending me ;)

so there we go, simple process, I am lazy. The extra time to post functions and extra calls for data cleaning left me with a massive security hole filled tutorial.

so the lines mentioned above are issues

$sql = "select * from usertable where username='" . $_POST['username'] . "'";

using tainted (untrusted) data in a query, bad bad

$userpass = md5($_POST['userpass']);

interesting, client side md5 so the pass is never catchable on our side, an option for sure.

also these lines

$username = $_POST['username'];
$userpass = md5($_POST['userpass']);
$sql = "insert into usertable values('$username', '$userpass')";

also this

$username = $_POST['username'];
$userpass = md5($_POST['userpass']);
$sql = "select * from usertable where username='$username' and password='$userpass'";

again unfiltered data in queries

$_SESSION['username'] = "$username";

I have now moved tainted data to a normally trusted source, sessions are located on the server, the only thing that can manipulate them are my scripts on that server. Now I have stuffed unfiltered user submitted data in there, htey are now tainted.

then there is this
$newip = $_SERVER['REMOTE_ADDR'];

now, we found in the linked to presentation that SERVER can not be trusted. I choose to trust the REMOTE_ADDR, it can be spoofed but I choose that to be acceptable risk. I could have at least checked to see if it was actual patterned like an ip.

so, first I will say that this came from various login scripts that I have written and they do clean all data. I will now flush out some suggestions for what should be done. These are not all that should be done as this is a general tutorial and each system is different. Patterns, logic, and level of protection desired are all things that I can't tell you. They are things that need to be decided by you.

So...

1. usernames have accepted characters, use them. When a user signs up a username we decide on the allowable chars ie alphanum, whatever your set might be. Enforce these same rules everytime they have to enter their password, if the username entered on login isn't only alnum then show them an error.

2. Filter all data, there are classes available for this, you can write your own but it is not a beginner, or maybe even intermediate task, the classes and libraries are out there, use them. I must also say don't blindly use anything, take a look inside, see what it's doing, then decide if it is safe or not. You may actually be better at this than the person who wrote it and if not you will definitely learn something.

>> Something to prevent it

a thought on this Timotheos, that suggestion actually changes user submitted data, bangs a square peg into a round hole as it were. I am not saying this is wrong, I am just saying that if we make the choice to change the submitted data, then we can never get it back. Mostly I would say look at, inspect, analyze the data, if it is not of the expected pattern/type then shoot it back to the user, let them enter something else. This is again a decision we make, if we change data I would also suggest logging the raw data somewhere as well.

I think this covers the majors for that post, being a glutton for punishment I have a second

Timotheos and bennymack, well done on the first one, any comments on that are welcome, if there is nothing else about that first thread I will post a second in a little while.

coopster

8:32 pm on Jun 16, 2005 (gmt 0)

Even if you were to allow most anything for a username where you might not need to *scrub* the username first, the next obvious step would be to escape any delimiters the user may have injected into the username:

$sql = "select * from usertable where username='" . addslashes [php.net]($_POST['username']) . "'";

Note that certain databases use different delimiters and addslashes may not always be the recognized solution. For example, MySQL uses the slash and has it's own function for this purpose, mysql_real_escape_string() [php.net]. However, IBM DB2 requires a single quote within a delimited expression to be delimited with two single quotation marks, so you might have to do something like ...

$sql = "select * from usertable where username='" . str_replace [php.net]("'", "''", $_POST['username']) . "'";

Timotheos

9:54 pm on Jun 16, 2005 (gmt 0)

>> Something to prevent it
a thought on this Timotheos, that suggestion actually changes user submitted data, bangs a square peg into a round hole as it were. I am not saying this is wrong, I am just saying that if we make the choice to change the submitted data, then we can never get it back. Mostly I would say look at, inspect, analyze the data, if it is not of the expected pattern/type then shoot it back to the user, let them enter something else. This is again a decision we make, if we change data

In this case, a search box, I would say it's OK but on some other input like username it does need to go back to the user.

I would also suggest logging the raw data somewhere as well.

I first saw something like this on the Drupal CMS and thought it was a great idea.

jatar_k

10:44 pm on Jun 16, 2005 (gmt 0)

hmm, I was thinking of this one next
[webmasterworld.com...] msg2

if anyone has any other of my code posts they want ripped to shreds feel free to sticky me, I'll post them one at a time and we can go through them

Birdman

11:00 pm on Jun 16, 2005 (gmt 0)

That one looks good to me. I would say this line is not needed:

$goodstuff = "";

Birdman

jatar_k

11:11 pm on Jun 16, 2005 (gmt 0)

>> looks good

i was thinking that too, though then I thought a little differently

I guess the thing to really understand with this one is something outside of the code there. Is the data/file coming from a trusted source?

If we created the file ourself and it is a one time conversion then we can probably consider it a trusted source. If this same chunk of code was used as a csv parser for client uploads then a lot more validation must be used to ensure that we are going to receive what we want.

>> $goodstuff = "";

just me making absolutely sure that var is empty everytime ;)

Birdman

11:19 pm on Jun 16, 2005 (gmt 0)

If anyone's interested, I have one that can be shredded :)

www.webmasterworld.com/forum88/334-2-15.htm#msg29

PS. Cool thread!

[edited by: Birdman at 11:22 pm (utc) on June 16, 2005]

jatar_k

11:21 pm on Jun 16, 2005 (gmt 0)

funny, i removed a thought about uploads before I posted that last comment ;)

Birdman

11:41 pm on Jun 16, 2005 (gmt 0)

I guess the thing to really understand with this one is something outside of the code there. Is the data/file coming from a trusted source?

Yah, that looked like a one time deal.

just me making absolutely sure that var is empty everytime

Does it need to be emptied? It gets overwritten every cycle, right.

jatar_k

11:46 pm on Jun 16, 2005 (gmt 0)

you're right, habit though

when I write them there usually is no loop in the beginning so my initialize line is there

when I add the loop
if appending initialize comes outside
if re using stays in the loop

I don't always leave it in, I write some massive parsing loops and using variable vars I have gotten seriously messed up so it stuck with me, re initialize the whole shootin match at the beginning of every iteration.

Birdman

11:48 pm on Jun 16, 2005 (gmt 0)

On my hack, here's the first thing that's terrible:

for($q=0;$q<$tot;$q++){

if ($_FILES['userfile']['name'][$q] == "") continue;
$num = $num + 1; // $num++

// this whole part is moot if I would
//have made the db table auto-increment
$sql = "SELECT pic_id FROM pics ORDER BY pic_id DESC LIMIT 1";
$result = mysql_query($sql);

while($i = mysql_fetch_array($result)){
$new_id = $i['pic_id'] + 1;
$new_pic = "$new_id.jpg";
}
// end moot

}

jatar_k

1:04 am on Jun 17, 2005 (gmt 0)

you don't check to see if it is actually a jpg

I have this image in my head of people overwhelmed with the code and thinking "I have no idea what they are talking about, I am way too new to answer in that thread"

don't be shy, questions, anything, this is a group learning experience :)

Timotheos

7:42 am on Jun 17, 2005 (gmt 0)

you don't check to see if it is actually a jpg

Hmmm... that's interesting. Checking out the manual reveals that it could be a security threat. Some points I can summarize on:

1. Do not use $_FILES['userfile']['type'] to verify it is a jpg. This can be spoofed. Use exif_imagetype [us2.php.net] instead.

2. Do not upload untrusted files into your web tree. This is so that once they've uploaded the file they can't execute it.

jatar_k

3:39 pm on Jun 17, 2005 (gmt 0)

I do actually check the extension and type but only because if it is wrong then I don't need to go any farther.

Funny thing, that will probably trap more well intentioned users than those with malicious intent. Always assume an attacker will have evrything right.

coopster

3:49 pm on Jun 17, 2005 (gmt 0)

I think what Timotheos is getting at is something a bit different there, jk.

$_FILES['userfile']['type'] returns the mime type of the file, if the browser provided this information. An example would be "image/gif".

Whereas exif_imagetype() reads the first bytes of an image and checks its signature.

So you could potentially be checking three things out (which will ensure that you are indeed allowing a file of type image to be uploaded:

The filename's extension
The MIME-type
image meta data

jatar_k

4:02 pm on Jun 17, 2005 (gmt 0)

actually you just said it better than I did, that's what i meant, hehe

3 layer checking like you mentioned, thankfully someone knows how to boil down what all of us are blathering about ;)

henry0

12:56 pm on Jun 18, 2005 (gmt 0)

jatar_k
Could you please, post an example of filtered query?
For some reason it does not ring a bell in my mind or
I am not up yet!
It�s got to be something I am/I should be familiar with but I do not make the association in between securing script and "filtered".

Thanks, great topic, lots to learn from.

BTW
A great tip is to move above WWW all DB_conn scripts.
Also why never did I think about HTTPS and login :)

jatar_k

5:57 pm on Jun 19, 2005 (gmt 0)

>> why never did I think about HTTPS and login

because unless you need it, it is not something that people want to sink money into, good certs aren't that cheap

>> Could you please, post an example of filtered query?

well I see there being a couple steps involved.

1. user submits form
2. we take that submitted data and inspect it for a few things
-a. we only take fields that we are expecting
-b. we look at each piece of raw data to see if it is in the form that we are expecting
-c. we return to the user if a field or fields are not acceptable
-d. we move acceptable data into a local trusted variable
3. we use functions such as mysql_real_escape_string on our fields when constructing our query
4. we execute our query

filtering would only truly be step 1 and 2, 3 and 4 are actualy escaping output as our database is considered an external destination.

filtering may also be referred to as
validating
washing
scrubbing
cleaning
among others

make more sense?

mcibor

10:20 pm on Jun 19, 2005 (gmt 0)

What about such function for user/pass filter:

if(!function_exists('my_strip')) { function my_strip($var) {
return str_replace(array("\"", "'", "\\"), "", $var); }}

I use it for all $_POST data (I have free text, but it's not English, so I can disregards apostrophe)

Feel free to comment
Michal

jatar_k

8:29 pm on Jun 20, 2005 (gmt 0)

if your rule is I take everything except " and you have made the decision to alter user submitted data then that would be fine

You could apply more validation to it though.

What are your accepted characters for a username?

henry0

7:05 pm on Jun 21, 2005 (gmt 0)

jatar_k;
Thank you, yes it makes sense
(sorry did not acknowledge earlier, was away for 2 days)

coopster

10:31 am on Jun 22, 2005 (gmt 0)

What are your accepted characters for a username?

What if I were to tell you ... any character?

henry0

11:09 am on Jun 22, 2005 (gmt 0)

This validates email
so it should by esssence be a filter method- Correct?
asides validating domain, the same idea could be applied to filter an username to include only alpha charac.
As is used to validate emails, how does it look? Do I miss an important detail?
<<<
[edit]Forgot to say that I first check if email exists then call the following function[/edit]
<?

function validate_email($email)
{
// Create the syntax validation regular expression
$regexp = "^([_a-z0-9-]+)(\.[_a-z0-9-]+)*@([a-z0-9-]+)(\.[a-z0-9-]+)*(\.[a-z]{2,4})$";

// Presume that the email is invalid
$valid = 0;

// Validate the syntax
if (eregi($regexp, $email))
{
list($username,$domaintld) = split("@",$email);
// Validate the domain
if (getmxrr($domaintld,$mxrecords))
$valid = 1;
} else {
$valid = 0;
}

return $valid;

}

henry0

11:42 am on Jun 22, 2005 (gmt 0)

Back on the certificate topic
I think most of us do have a certificate
Some will use the very affordable Go D and other similar one or go with VeriS if money is not an issue and name recognition is important.
However most hosts do provide a shared certificate so I think that it is really an option to consider and offer HTTPS

jatar_k

4:16 pm on Jun 22, 2005 (gmt 0)

>> What if I were to tell you ... any character?

hehe, then you'd be nuts

more commonly a-z A-Z 0-9 _- also with some type of maximum chars

that would be fine

I would hate to see

' " / \ # & ; %

or combinations of the above, maybe this could be my username

%2527%252Esystem(chr(112)%252Echr(101)%252Echr(114)%252E chr(108)%252Echr(32)%252Echr(45)%252Echr(101)%252Echr(32) %252Echr(34)%252Echr(112)%252Echr(114)%252Echr(105) %252Echr(110)%252Echr(116)%252Echr(32)%252Echr(113) %252Echr(40)%252Echr(41)%252Echr(34))%252E%2527

I added some spaces so it would wrap but that is actually this

',system(perl -e "print q()"),'

that might cause a few problems with some additions

>> so it should by esssence be a filter method

yes

______________________________________

continued here
Peer Code Review - Part 2 [webmasterworld.com]

[edited by: jatar_k at 4:55 am (utc) on July 27, 2005]