I have a form used to submit content to a database. I don't want any HTML code submitted to the database. Style formatting will be handled through BBCode-style tags (e.g. [b] instead of <b>). The form is handled via PHP.
Based on your experience, which of these would you consider the better option (from the POV of both a coder and an end-user):
1) Convert all HTML special characters (<, >, &, quotes) to harmless entities. This means if the user types HTML into the form and submits it, the content will go into the database, and when it's retrieved the HTML will get displayed as harmless text (i.e. it will not get rendered by the browser).
2) Check for the presence of HTML code before the content is submitted to the database. If HTML code is found, throw an error and force the user to clean up the content (i.e. remove HTML) before allowing a Submit to the database. If the user wants to include HTML, force the user to wrap the desired block in [code] tags.
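For what it's worth, option 1 can be sketched in a few lines. This is only a rough illustration: render_post() is a made-up name, it handles just the [b] tag, and it assumes UTF-8 text that is escaped when it is displayed.

```php
<?php
// Hypothetical helper (not from the original post): escape user text,
// then translate the single BBCode tag this sketch knows about.
function render_post(string $raw): string
{
    // Turn &, <, > and both quote styles into entities so any HTML
    // the user typed is shown as text instead of being rendered.
    $safe = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

    // Only after escaping, convert [b]...[/b] into real <b> tags.
    return preg_replace('/\[b\](.*?)\[\/b\]/is', '<b>$1</b>', $safe);
}

echo render_post('[b]Hi[/b] <script>alert(1)</script>');
// prints: <b>Hi</b> &lt;script&gt;alert(1)&lt;/script&gt;
```

The order matters here: escaping first means the only real tags in the output are the ones the BBCode step adds.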
I am leaning toward option #1 because it seems easier. Basically, if a user tries to use HTML in their submission, it's accepted, but converted into harmless characters. The downside to this is that the raw HTML will appear as text on the screen when the content is pulled from the database. Option #2 is more to my liking, but the task of checking all content for any HTML tags seems daunting.
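The check in option 2 may be less daunting than it sounds if a rough test is acceptable. The sketch below is an assumption on my part, not a complete HTML detector: contains_html() is a hypothetical name, and the pattern only looks for "<" followed directly by a letter, "/" plus a letter, or "!", which is roughly what a browser treats as the start of a tag. Text inside [code] blocks is ignored first, as described in option 2.

```php
<?php
// Hypothetical check (a heuristic, not a parser): true when the text
// outside [code]...[/code] contains something that looks like a tag.
function contains_html(string $text): bool
{
    // Drop [code] blocks, where HTML is allowed under option 2.
    $outside = preg_replace('/\[code\].*?\[\/code\]/is', '', $text);

    // "<" then an optional "/" then a letter or "!", e.g. <b>, </div>, <!--
    return preg_match('/<\/?[a-z!]/i', $outside) === 1;
}

var_dump(contains_html('Hello [b]world[/b]'));        // bool(false)
var_dump(contains_html('Hello <b>world</b>'));        // bool(true)
var_dump(contains_html('1 < 2'));                     // bool(false)
var_dump(contains_html('[code]<b>bold</b>[/code]'));  // bool(false)
```

Something like "x<y" with no spaces would still be flagged, so the error message shown to the user would need to explain what to fix.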
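Also, take a look at PHP's strip_tags() [php.net] function. It works well when the HTML is valid, with properly closed tags. It does not validate the markup, though, so partial or broken tags can cause more text to be removed than expected.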
Just a thought.
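A quick illustration of strip_tags() on well-formed input (the sample strings and variable names are made up for the example):

```php
<?php
// Well-formed markup: the tags are removed and the text between them is kept.
$clean = strip_tags('Hello <b>world</b>, see <a href="/faq">the FAQ</a>.');
echo $clean, "\n"; // prints: Hello world, see the FAQ.

// The tags are removed silently, so the user never learns their HTML was dropped.
$changed = ($clean !== 'Hello <b>world</b>, see <a href="/faq">the FAQ</a>.');
var_dump($changed); // bool(true)
```

Comparing the stripped result with the original, as in the last two lines, is one possible way to detect that HTML was submitted, with the same caveat about broken tags.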