Forum Moderators: coopster
I will not allow anyone to recommend anyone else's code for review but you are welcome to recommend any of mine or your own. Once we analyze a given thread or code snippet we can move onto another piece of posted code.
I put this one up first
[webmasterworld.com...] - specifically msg 3
who wants to take a stab at the things I did wrong here and possibly suggest what I could have done differently.
Reference for this thread
PHP Security [webmasterworld.com]
$sql = "select * from usertable where username='" . $_POST['username'] . "'";
$result = mysql_query($sql);
I can see this better now that I understand SQL injection attacks
[webmasterworld.com...]
Something to prevent it
[webmasterworld.com...]
The thought of encrypting a user's password
$userpass = md5($_POST['userpass']);
give me shudders. I know we're assuming HTTPS but it's always best to have several layers of protection. Having $_POST['userpass'] be a client side MD5 hash that was sent over HTTPS, now that's something I can live with. That way, you don't receive a copy of their password at any point.
I have a password management site based on this scheme and it works quite well.
I think what you're looking for though is checking the input of those POST form variables. I supposed, although I'm not positive because I develop primarily in mod_perl, someone could inject SQL statements into them since they do not escape single or double quotes.
For instance, if there were a way to botch the query for the user login, so that
mysql_num_rows($result) == 1
They could conceivably be logged in without a correct user/pass.
so there we go, simple process, I am lazy. The extra time to post functions and extra calls for data cleaning left me with a massive security hole filled tutorial.
so the lines mentioned above are issues
$sql = "select * from usertable where username='" . $_POST['username'] . "'";
using tainted (untrusted) data in a query, bad bad
$userpass = md5($_POST['userpass']);
interesting, client side md5 so the pass is never catchable on our side, an option for sure.
also these lines
$username = $_POST['username'];
$userpass = md5($_POST['userpass']);
$sql = "insert into usertable values('$username', '$userpass')";
also this
$username = $_POST['username'];
$userpass = md5($_POST['userpass']);
$sql = "select * from usertable where username='$username' and password='$userpass'";
again unfiltered data in queries
$_SESSION['username'] = "$username";
I have now moved tainted data to a normally trusted source, sessions are located on the server, the only thing that can manipulate them are my scripts on that server. Now I have stuffed unfiltered user submitted data in there, htey are now tainted.
then there is this
$newip = $_SERVER['REMOTE_ADDR'];
now, we found in the linked to presentation that SERVER can not be trusted. I choose to trust the REMOTE_ADDR, it can be spoofed but I choose that to be acceptable risk. I could have at least checked to see if it was actual patterned like an ip.
so, first I will say that this came from various login scripts that I have written and they do clean all data. I will now flush out some suggestions for what should be done. These are not all that should be done as this is a general tutorial and each system is different. Patterns, logic, and level of protection desired are all things that I can't tell you. They are things that need to be decided by you.
So...
1. usernames have accepted characters, use them. When a user signs up a username we decide on the allowable chars ie alphanum, whatever your set might be. Enforce these same rules everytime they have to enter their password, if the username entered on login isn't only alnum then show them an error.
2. Filter all data, there are classes available for this, you can write your own but it is not a beginner, or maybe even intermediate task, the classes and libraries are out there, use them. I must also say don't blindly use anything, take a look inside, see what it's doing, then decide if it is safe or not. You may actually be better at this than the person who wrote it and if not you will definitely learn something.
>> Something to prevent it
a thought on this Timotheos, that suggestion actually changes user submitted data, bangs a square peg into a round hole as it were. I am not saying this is wrong, I am just saying that if we make the choice to change the submitted data, then we can never get it back. Mostly I would say look at, inspect, analyze the data, if it is not of the expected pattern/type then shoot it back to the user, let them enter something else. This is again a decision we make, if we change data I would also suggest logging the raw data somewhere as well.
I think this covers the majors for that post, being a glutton for punishment I have a second
Timotheos and bennymack, well done on the first one, any comments on that are welcome, if there is nothing else about that first thread I will post a second in a little while.
$sql = "select * from usertable where username='" . addslashes [php.net]($_POST['username']) . "'";
Note that certain databases use different delimiters and addslashes may not always be the recognized solution. For example, MySQL uses the slash and has it's own function for this purpose, mysql_real_escape_string() [php.net]. However, IBM DB2 requires a single quote within a delimited expression to be delimited with two single quotation marks, so you might have to do something like ...
$sql = "select * from usertable where username='" . str_replace [php.net]("'", "''", $_POST['username']) . "'";
>> Something to prevent ita thought on this Timotheos, that suggestion actually changes user submitted data, bangs a square peg into a round hole as it were. I am not saying this is wrong, I am just saying that if we make the choice to change the submitted data, then we can never get it back. Mostly I would say look at, inspect, analyze the data, if it is not of the expected pattern/type then shoot it back to the user, let them enter something else. This is again a decision we make, if we change data
In this case, a search box, I would say it's OK but on some other input like username it does need to go back to the user.
I would also suggest logging the raw data somewhere as well.
I first saw something like this on the Drupal CMS and thought it was a great idea.
if anyone has any other of my code posts they want ripped to shreds feel free to sticky me, I'll post them one at a time and we can go through them
i was thinking that too, though then I thought a little differently
I guess the thing to really understand with this one is something outside of the code there. Is the data/file coming from a trusted source?
If we created the file ourself and it is a one time conversion then we can probably consider it a trusted source. If this same chunk of code was used as a csv parser for client uploads then a lot more validation must be used to ensure that we are going to receive what we want.
>> $goodstuff = "";
just me making absolutely sure that var is empty everytime ;)
I guess the thing to really understand with this one is something outside of the code there. Is the data/file coming from a trusted source?
Yah, that looked like a one time deal.
just me making absolutely sure that var is empty everytime
Does it need to be emptied? It gets overwritten every cycle, right.
when I write them there usually is no loop in the beginning so my initialize line is there
when I add the loop
if appending initialize comes outside
if re using stays in the loop
I don't always leave it in, I write some massive parsing loops and using variable vars I have gotten seriously messed up so it stuck with me, re initialize the whole shootin match at the beginning of every iteration.
for($q=0;$q<$tot;$q++){
if ($_FILES['userfile']['name'][$q] == "") continue;
$num = $num + 1; // $num++
// this whole part is moot if I would
//have made the db table auto-increment
$sql = "SELECT pic_id FROM pics ORDER BY pic_id DESC LIMIT 1";
$result = mysql_query($sql);
while($i = mysql_fetch_array($result)){
$new_id = $i['pic_id'] + 1;
$new_pic = "$new_id.jpg";
}
// end moot
}
you don't check to see if it is actually a jpg
Hmmm... that's interesting. Checking out the manual reveals that it could be a security threat. Some points I can summarize on:
1. Do not use $_FILES['userfile']['type'] to verify it is a jpg. This can be spoofed. Use exif_imagetype [us2.php.net] instead.
2. Do not upload untrusted files into your web tree. This is so that once they've uploaded the file they can't execute it.
$_FILES['userfile']['type'] returns the mime type of the file, if the browser provided this information. An example would be "image/gif".
Whereas exif_imagetype() reads the first bytes of an image and checks its signature.
So you could potentially be checking three things out (which will ensure that you are indeed allowing a file of type image to be uploaded:
Thanks, great topic, lots to learn from.
BTW
A great tip is to move above WWW all DB_conn scripts.
Also why never did I think about HTTPS and login :)
because unless you need it, it is not something that people want to sink money into, good certs aren't that cheap
>> Could you please, post an example of filtered query?
well I see there being a couple steps involved.
1. user submits form
2. we take that submitted data and inspect it for a few things
-a. we only take fields that we are expecting
-b. we look at each piece of raw data to see if it is in the form that we are expecting
-c. we return to the user if a field or fields are not acceptable
-d. we move acceptable data into a local trusted variable
3. we use functions such as mysql_real_escape_string on our fields when constructing our query
4. we execute our query
filtering would only truly be step 1 and 2, 3 and 4 are actualy escaping output as our database is considered an external destination.
filtering may also be referred to as
validating
washing
scrubbing
cleaning
among others
make more sense?
function validate_email($email)
{
// Create the syntax validation regular expression
$regexp = "^([_a-z0-9-]+)(\.[_a-z0-9-]+)*@([a-z0-9-]+)(\.[a-z0-9-]+)*(\.[a-z]{2,4})$";
// Presume that the email is invalid
$valid = 0;
// Validate the syntax
if (eregi($regexp, $email))
{
list($username,$domaintld) = split("@",$email);
// Validate the domain
if (getmxrr($domaintld,$mxrecords))
$valid = 1;
} else {
$valid = 0;
}
return $valid;
}
?>
hehe, then you'd be nuts
more commonly a-z A-Z 0-9 _- also with some type of maximum chars
that would be fine
I would hate to see
' " / \ # & ; %
or combinations of the above, maybe this could be my username
%2527%252Esystem(chr(112)%252Echr(101)%252Echr(114)%252E chr(108)%252Echr(32)%252Echr(45)%252Echr(101)%252Echr(32) %252Echr(34)%252Echr(112)%252Echr(114)%252Echr(105) %252Echr(110)%252Echr(116)%252Echr(32)%252Echr(113) %252Echr(40)%252Echr(41)%252Echr(34))%252E%2527
I added some spaces so it would wrap but that is actually this
',system(perl -e "print q()"),'
that might cause a few problems with some additions
>> so it should by esssence be a filter method
yes
______________________________________
continued here
Peer Code Review - Part 2 [webmasterworld.com]
[edited by: jatar_k at 4:55 am (utc) on July 27, 2005]