Cleaning up a string, then continuing to use it

Eliminate everything from a string except letters, #s, _, and spaces

sawatkins

5:29 pm on Mar 30, 2007 (gmt 0)

10+ Year Member



I've been coming to this site for a while and there's almost always an answer to my question. This is my first post, though. This one seems really simple.

I want to take a string of text passed in via $_POST or $_GET and remove EVERYTHING except letters, numbers, underscore, and space, and then continue to process it.

Something like this:

$string = $_POST['string'];
$string = eliminateallbadstuff($string);

This is to prevent any XSS-style attacks. I know there are plenty of methods out there to eliminate bad tags and then display the information anyway, but I just want to eliminate all disallowed characters and then keep using the "clean" string (to pass into a MySQL database).

The other key is that it must be efficient. I'm passing multiple (10-15) pieces of information per page, and I should probably scrub them all.

I wonder if the reason I can't find anything on the web is that it's embarrassingly easy.

Gus

jatar_k

5:32 pm on Mar 30, 2007 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Welcome to WebmasterWorld sawatkins,

>> I wonder if the reason I can't find anything on the web is that it's embarrassingly easy.

or that it isn't really recommended.

are these GET strings you are passing to yourself?
how will you handle values that don't match any case you expect?
why not just send them to a dead end if they feed you bad chars?
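
For reference, the "dead end" idea could look roughly like the sketch below: test the value against the allowed set and stop if anything else shows up, rather than repairing it. The helper name is_clean() and the field name 'name' are hypothetical, and the 400 response is just one possible way to bail out.

```php
<?php
// Hypothetical helper: true only if $value contains nothing but
// letters, digits, underscores, and spaces.
function is_clean(string $value): bool
{
    // \A and \z anchor the whole string; * lets the empty string pass too.
    return preg_match('/\A[A-Za-z0-9_ ]*\z/', $value) === 1;
}

// Hypothetical field name 'name'; a missing or non-string value is treated as empty.
$name = isset($_GET['name']) && is_string($_GET['name']) ? $_GET['name'] : '';

if (!is_clean($name)) {
    // The "dead end": stop here instead of trying to repair the input.
    http_response_code(400);
    exit('Invalid input.');
}
```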

sawatkins

5:38 pm on Mar 30, 2007 (gmt 0)

10+ Year Member



Yes, I'm passing most of the strings to myself.

Sometimes I want to stop the process, other times I want to continue.

Sometimes I want to continue because:

1) it's a private site and everything is already secured behind a login
2) the private site is for a client, and it's doubtful they're going to try to hack a site they paid good money for
3) I've already got a bunch of error handling for cases that don't exist, and for other conditions as well; this is just one more piece of the puzzle

jatar_k

5:43 pm on Mar 30, 2007 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



fair enough

to be honest, all I can think of is walking the string: for each char, if it isn't in your set, just drop it and carry on. might not be fast though
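
A minimal sketch of that walk-the-string idea, assuming a hypothetical helper named strip_by_walking(). It checks one byte at a time, so multibyte characters such as £ are dropped entirely because none of their bytes are in the allowed set.

```php
<?php
// Hypothetical helper: keep only letters, digits, underscore, and space
// by checking one byte at a time.
function strip_by_walking(string $input): string
{
    $allowed = 'abcdefghijklmnopqrstuvwxyz'
             . 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
             . '0123456789_ ';
    $clean = '';
    $length = strlen($input);
    for ($i = 0; $i < $length; $i++) {
        $char = $input[$i];
        // strpos() returns false when the character is not in the allowed set.
        if (strpos($allowed, $char) !== false) {
            $clean .= $char;
        }
    }
    return $clean;
}

echo strip_by_walking('a<b>c_1 2!'), "\n"; // prints "abc_1 2"
```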

I am lazy when it comes to regex though so I would think there is a better way

sawatkins

5:49 pm on Mar 30, 2007 (gmt 0)

10+ Year Member



How should I do it the lazy way?

jatar_k

5:59 pm on Mar 30, 2007 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



here's a good reason why I am not very good at regex: I always find something when I need it. ;)

my search was "removing unwanted characters from a string php"

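<?php
// Sample input containing both allowed and disallowed characters.
$string = 'This is some text and numbers 1_2_3_4_5 and symbols!£$%^&';
echo '<p>before: ', $string;
// Drop everything except letters, digits, space, and underscore.
// Originally ereg_replace("[^A-Za-z0-9 _]", "", $string); the ereg functions
// were removed in PHP 7, and preg_replace() is the replacement.
$new_string = preg_replace('/[^A-Za-z0-9 _]/', '', $string);
echo '<p>after: ', $new_string;
?>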

here is something interesting as well
[php.net...]

I know it's PECL but I had never noticed it
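
Since the question mentions 10-15 values per page, the same character class could be applied to a whole list of fields in one pass. This is only a sketch: scrub_fields() and the field names are hypothetical, and it uses preg_replace() because the ereg functions are gone as of PHP 7.

```php
<?php
// Hypothetical helper: strip disallowed characters from a list of fields.
// $source would typically be $_POST or $_GET; $fields lists the keys to read.
function scrub_fields(array $source, array $fields): array
{
    $clean = [];
    foreach ($fields as $field) {
        // Missing or non-string values (e.g. arrays) become empty strings.
        $value = isset($source[$field]) && is_string($source[$field])
            ? $source[$field]
            : '';
        // Keep only letters, digits, underscore, and space.
        $clean[$field] = preg_replace('/[^A-Za-z0-9_ ]/', '', $value);
    }
    return $clean;
}

// Hypothetical field names, for illustration only.
$clean = scrub_fields($_POST, ['first_name', 'city', 'notes']);
```

One regex call per short field should be cheap enough for a dozen or so values, though that is an expectation rather than something measured here.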

sawatkins

6:17 pm on Mar 30, 2007 (gmt 0)

10+ Year Member



Aha! ereg_replace()! I knew it was embarrassingly simple! Thanks!