I'm using preg_replace (not preg_match) to highlight the above characters via span pairs.
While I guess this is a good start, would anyone be willing to share a comprehensive pattern that would catch most of the "bad stuff" thrown at HTML forms?
Thanks to all in advance.
PS: Some of the "bad" test strings I'm using contain URLs and links surrounded by square brackets.
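For anyone following along, here is a minimal sketch of that kind of highlighting. The function name, the `bad` CSS class, and the character class (square brackets and braces) are stand-ins chosen for illustration, not the actual pattern from the post above.

```php
<?php
// Hypothetical helper: wraps each suspicious character in a span pair.
function highlight_suspicious(string $input): string
{
    // Escape first so the only markup in the result is our own spans.
    $safe = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
    // Stand-in character class: square brackets and curly braces.
    return preg_replace('/[\[\]{}]/', '<span class="bad">$0</span>', $safe);
}
```

Because the input is escaped before the spans are added, angle brackets end up as entities and would need their own pattern if you want them highlighted too.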
A longer but simple place to start: accept only what you want and throw everything else away.
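As a rough sketch of "accept only what you want", something like this strips every character that is not on an allow-list. The function name and the allowed set (letters, digits, whitespace, and a little punctuation) are assumptions; adjust them to whatever your form actually needs.

```php
<?php
// Hypothetical allow-list filter: keep only explicitly permitted characters.
function keep_allowed(string $input): string
{
    // Assumed allow-list: letters, digits, whitespace, and basic punctuation.
    $clean = preg_replace('/[^A-Za-z0-9\s.,!?\'@_-]/', '', $input);
    // preg_replace returns null on error; treat that as "nothing usable".
    return $clean === null ? '' : $clean;
}
```

Square brackets, angle brackets, slashes, and colons are not on the list, so bracketed links and tags lose their syntax without needing a pattern for each one.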
Here's one of many [webmasterworld.com] discussions on the topic that will help. (The approach there: die when a bad pattern is found.) The array mentioned there is an easy way to filter input down to what you decide you need to keep.
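I can't reproduce the linked code here, but the general "die on patterns found" idea can be sketched like this. The array name, the function name, and the three patterns are illustrative assumptions, not the ones from that thread.

```php
<?php
// Hypothetical reject list; these three patterns are only examples.
$bad_patterns = [
    '/\[\/?url/i',     // BBCode-style link tags such as [url] or [/url]
    '#https?://#i',    // bare http or https URLs
    '/<[^>]*>/',       // anything that looks like an HTML tag
];

// Returns the first pattern that matches $input, or null if none do.
function find_bad_pattern(string $input, array $patterns): ?string
{
    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $input) === 1) {
            return $pattern;
        }
    }
    return null;
}
```

In a form handler you would stop processing (with die() or your own error routine) whenever the result is not null.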
A better one [webmasterworld.com] (second to last post) that I use regularly, which includes some cool bits on email address validation and, more importantly, logging the input data. This is more useful than you can imagine: it reveals "what they are up to." To use this you'll have to understand functions, how to pass parameters to them, and how to evaluate the result. It also refers to other functions you'll need to write (exit_prog_error([message]), for example).
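To give a feel for the two pieces mentioned (email validation and logging), here is a minimal sketch. It is not the code from the linked post; the function names, the log line format, and the log file path are all assumptions. It relies on PHP's built-in filter_var() with FILTER_VALIDATE_EMAIL for the address check.

```php
<?php
// Hypothetical logger: appends one line per submitted field to a file you choose.
function log_input(string $field, string $value, string $logFile): bool
{
    // json_encode keeps newlines and control characters visible on one line.
    $line = date('c') . "\t" . $field . "\t" . json_encode($value) . "\n";
    return file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX) !== false;
}

// Hypothetical validator: true only for a syntactically valid address.
function is_valid_email(string $email): bool
{
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}
```

A handler would typically log the field first, then call is_valid_email() and hand off to your own error routine (the exit_prog_error() mentioned above, which you still have to write) when the check fails.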