Msg#: 4481161 posted 5:48 am on Aug 2, 2012 (gmt 0)
As the saying goes, one size never fits all. I'm hardly an expert, but this is a start:
You want to ensure that there's no malicious data in either user form submissions or $_GET data. How much checking and encoding you do depends a bit on what your script does with the data. For example, if the data is to be stored in a database, you'll want to use a function like mysql_real_escape_string(). addslashes() is probably sufficient if all you're going to do is send the data somewhere by email.
Further, you should use relatively strong validation on any such data. Email addresses should be in a valid form with no extraneous characters like commas, \n, or \r. Numeric fields should only contain numbers, and both text and numeric data fields should be of a reasonable length for the purpose of each field.
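To make the validation advice above a bit more concrete, here is a minimal sketch. The function names and the length limits are my own assumptions for illustration, not anything prescribed; adjust them to what each field is actually for.

```php
<?php
// Hypothetical helpers: the names and limits are illustrative assumptions.

function is_valid_email(string $email): bool
{
    // Reject CR, LF and commas outright: they can be used to inject extra
    // mail headers or additional recipients.
    if (preg_match('/[\r\n,]/', $email) === 1) {
        return false;
    }
    // Keep the length reasonable and let PHP check the overall form.
    return strlen($email) <= 254
        && filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

function is_valid_quantity(string $value, int $maxDigits = 5): bool
{
    // Digits only, from start to end of the string, and not too long.
    return strlen($value) <= $maxDigits
        && preg_match('/\A[0-9]+\z/', $value) === 1;
}
```

Something like is_valid_email($_POST['email'] ?? '') would then be checked before the value goes anywhere near mail() or a query.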
I'm sure others will have more specific suggestions for you.
Msg#: 4481161 posted 11:58 am on Aug 2, 2012 (gmt 0)
There's much more to it.
htmlentities is a start, but you need to be careful to use it properly: with the right character encoding, and escaping the right quotes for the place in the markup where you're outputting the data.
e.g. the HTML: <a href="AAA" rel="BBB">CCC</a>
CCC: it doesn't need quotes to be escaped, and if you escape them anyway you only make the source code harder to read
BBB: it needs double quotes to be escaped or your data might start adding more attributes
AAA: it should be urlencoded, not just htmlencoded e.g. a space should be encoded as "%20" or as "+".
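A rough sketch of the three cases in PHP. The function name render_link and the example.com URL are made up for illustration; the point is only which escaping goes with which spot.

```php
<?php
// Hypothetical helper showing context-dependent escaping for
// <a href="AAA" rel="BBB">CCC</a>. Names and URL are assumptions.
function render_link(string $query, string $rel, string $label): string
{
    // AAA: URL-encode the untrusted part first, then HTML-escape the
    // complete URL because it sits inside a double-quoted attribute.
    $href = 'https://example.com/search?q=' . urlencode($query);
    $aaa  = htmlspecialchars($href, ENT_COMPAT, 'UTF-8');

    // BBB: double-quoted attribute, so double quotes must be escaped.
    $bbb = htmlspecialchars($rel, ENT_COMPAT, 'UTF-8');

    // CCC: element text, so only & < > need escaping; quotes stay as-is.
    $ccc = htmlspecialchars($label, ENT_NOQUOTES, 'UTF-8');

    return '<a href="' . $aaa . '" rel="' . $bbb . '">' . $ccc . '</a>';
}

echo render_link('red & blue', 'no"follow', 'Tom & "Jerry"'), "\n";
```

Note that urlencode() writes a space as "+", which suits a query string; rawurlencode() would give "%20" instead.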
Moreover, if you're going to output things like XHTML5 you can't use htmlentities, because only the five predefined XML entities (the ones for & < > " and ') are allowed; all other characters must be written directly in UTF-8 (I did not check whether other character encodings are still allowed; I only use UTF-8 these days).
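As a small illustration of the difference (a sketch, with a helper name I made up): htmlspecialchars() only touches & < > and the quotes, while htmlentities() also turns characters like é into named HTML entities that an XML parser won't know.

```php
<?php
// Hypothetical helper: escape text for an XML-flavoured document.
// htmlspecialchars() never emits named entities such as &eacute;,
// and ENT_XML1 tells PHP to treat the output as an XML 1 document.
function escape_for_xhtml(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

echo escape_for_xhtml('café <b>'), "\n";            // é is left as UTF-8
echo htmlentities('café', ENT_QUOTES, 'UTF-8'), "\n"; // é becomes a named entity
```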
Escaping stuff for your database:
I prefer not to do this at all.
How? First: Use mysqli calls instead of the (should be) obsolete mysql ones. And then use prepared statements whenever it's possible instead of trying to fix things with escaped characters. That way the database knows where user input is and it will not confuse data for statements.
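For what it's worth, a prepared statement with mysqli looks roughly like this. The table name, column names and function name are hypothetical, and the sketch assumes you already have a connected mysqli object.

```php
<?php
// Hypothetical lookup: `users`, `id`, `name` and `email` are assumed names.
function find_user_by_email(mysqli $db, string $email): ?array
{
    // The "?" marks where user input goes; the SQL text itself never
    // contains the input, so it cannot be mistaken for a statement.
    $stmt = $db->prepare('SELECT id, name FROM users WHERE email = ?');
    if ($stmt === false) {
        return null; // prepare failed (e.g. bad SQL or lost connection)
    }

    $stmt->bind_param('s', $email); // 's' = bind as string
    $stmt->execute();

    // Bind result columns to variables and fetch the first row, if any.
    $stmt->bind_result($id, $name);
    $found = $stmt->fetch();
    $stmt->close();

    return $found === true ? ['id' => $id, 'name' => $name] : null;
}

// Usage (connection details are placeholders):
// $db   = new mysqli('localhost', 'db_user', 'db_password', 'db_name');
// $user = find_user_by_email($db, $_POST['email'] ?? '');
```

No escaping call appears anywhere; the input travels separately from the query.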
This is the really big one: consider all user input (i.e. everything coming from the browser) to be TAINTED, even if it was filtered on the client side. You need to clean it to a point where you *know* it is valid before you use it for anything.
There are 2 tactics:
whitelist: you make sure that everything in the input is known good (so all characters are ok, the combinations are ok, the range (length, min, max) is ok, etc.). This is the hard approach but the most secure. It is relatively easy if you are expecting e.g. a response from a drop down list: you know the possible values very well. It is especially hard if you have to do this on e.g. a review of a hotel: the possible inputs are much wider and freer.
blacklist: you make sure everything that can harm you is removed. The tricky bit is that you need to know what can harm you. E.g. a "." in a string that's going to be used as a filename (e.g. logo.png) will not be considered harmful. But that same dot is quite a different story when used as "../../../../etc/passwd". Still, in many cases blacklisting is all we can achieve, and that makes it a very tricky thing that lets many "fatal" mistakes slip into production environments.
In practice one often combines the two tactics (and my whitelisting definition isn't 100% pure already due to this).
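A minimal sketch of the whitelist tactic for the two examples above (the drop down list and the filename). The allowed values, the extensions and the function names are assumptions for illustration only.

```php
<?php
// Whitelist for a drop down: accept the value only if it is one of the
// options we put in the list ourselves. The options are hypothetical.
function pick_room_type(string $input): ?string
{
    $allowed = ['single', 'double', 'suite'];
    // Third argument true = strict comparison, no type juggling.
    return in_array($input, $allowed, true) ? $input : null;
}

// Whitelist for a filename: a short run of letters, digits, "_" or "-",
// exactly one dot, and a known extension. Anything else is rejected,
// so "../" sequences never need to be blacklisted explicitly.
function is_safe_image_name(string $name): bool
{
    return preg_match('/\A[A-Za-z0-9_-]{1,64}\.(png|jpe?g|gif)\z/', $name) === 1;
}
```

With this, "logo.png" passes and "../../../../etc/passwd" fails simply because it doesn't match the known-good pattern.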
The top 10 of mistakes (learn from what others failed to do right): https://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project
If you're looking for help so that you don't have to implement your own libraries: https://www.owasp.org/index.php/Category:OWASP_Enterprise_Security_API It's available for PHP (although I've not used it myself (yet)):