Bad Words Filter Enhanced

How to not filter valid words

   
3:10 am on Dec 2, 2009 (gmt 0)




Hello,

I am trying to enhance my bad words filter. Currently I filter posts using a badwords array,

i.e. $badwords = array("bad", "words");

and use str_replace to replace the words with * characters.

This works until a post contains a word like "bypass" or "grass": if I am filtering the word "ass", I get byp*** or gr***.
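
For anyone who wants to reproduce the problem, here is a minimal sketch. The sample sentence and the $post and $filtered names are made up for illustration; only str_replace and the "ass" example come from the post.

```php
<?php
// str_replace matches plain substrings, so it has no notion of word boundaries.
$badwords = array("ass");
$post = "Use the bypass near the grass."; // hypothetical sample text

$filtered = str_replace($badwords, "***", $post);
echo $filtered, "\n"; // Use the byp*** near the gr***.
```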

I have tried preg_replace, but I am not familiar enough with regular expressions to write the proper pattern for this. Any help or thoughts would be much appreciated.

Thanks.

4:04 am on Dec 2, 2009 (gmt 0)

themadscientist



All you really need to do differently with preg_replace for something like this is set a delimiter. Most people use /, but I've started using # because I rarely need to match a literal # in a pattern, so I have to escape it far less often than /. Then you can use \b, which matches a word boundary on either side of the word you are searching for...

So, the patterns below match 'WordBoundary'word'WordBoundary'. (Run the test and you'll get it if you don't already.) It's not 'perfect', but it will be much better, and with a bit of adjusting it can be made fairly accurate:

$string='bad words badwords words-bad. words-good wordsgood goodbad bad';
$badwords = array("#\bbad\b#i","#\bwords\b#i");
$cleanString = array("b**","w****");

$cleanedString = preg_replace($badwords,$cleanString,$string);
echo $cleanedString;

It needs a bit of adjusting so it doesn't change the case of the replaced word(s): right now every replacement starts with a lowercase letter, so "Bad" becomes "b**". You'll probably need to put () around the first character of each pattern and then use either \\1 or $1 in the replacement in place of the literal first letter. $1 is preferred, but \\1 is sometimes easier to work with. I'll let you play around with it a bit and see if you can get it working to your liking and specific situation.
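
One way the capture-group adjustment might look is sketched below. The sample string and the $patterns, $replacements, and $cleaned names are assumptions for illustration, not part of the test above.

```php
<?php
// Hypothetical sample text with mixed case.
$string = 'Bad words, BAD Words, badwords';

// The first letter is captured with () so the replacement can reuse it via $1.
$patterns = array('#\b(b)ad\b#i', '#\b(w)ords\b#i');
$replacements = array('$1**', '$1****');

$cleaned = preg_replace($patterns, $replacements, $string);
echo $cleaned, "\n"; // B** w****, B** W****, badwords
```

Single quotes keep PHP from trying to interpolate $1 as a variable before preg_replace sees it.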

5:37 am on Dec 2, 2009 (gmt 0)




TheMadScientist's approach is fine.

I have used this expression to delete certain words:

$city = trim(preg_replace("/&|\bVicinity\b|\bCounty\b|\bCounties\b|\bArea\b|\bCity\b|\bThe\b/i", '', $city));

You can use:

$words = array();
$words[] = 'very';
$words[] = 'bad';
$words[] = 'words';

$words = '\b' . implode('\b|\b', $words) . '\b';

$string = trim(preg_replace("/$words/i", '*', $string));
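
A variation on the single-pattern idea, sketched under a few assumptions: mask_words is a hypothetical helper name, closures need PHP 5.3 or later, and strlen counts bytes, so multibyte words would need mb_strlen instead. It joins the words with | inside one group and replaces each match with one * per character.

```php
<?php
// Hypothetical helper: masks each listed word with one asterisk per character.
function mask_words($text, array $words)
{
    // Escape regex metacharacters in each word, including the # delimiter.
    $quoted = array_map(function ($w) {
        return preg_quote($w, '#');
    }, $words);

    // Builds one pattern such as #\b(?:very|bad|words)\b#i
    $pattern = '#\b(?:' . implode('|', $quoted) . ')\b#i';

    // Replace every match with asterisks of the same length.
    return preg_replace_callback($pattern, function ($m) {
        return str_repeat('*', strlen($m[0]));
    }, $text);
}

echo mask_words('Very bad words, but badwords stay.', array('very', 'bad', 'words')), "\n";
// **** *** *****, but badwords stay.
```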

1:51 am on Dec 3, 2009 (gmt 0)




Thank you all for the input! MadScientist, I used your idea. Here's how I did it:

$words = array("bad","words","here");

$badwords = array();

foreach ($words as $badword) {
    // Wrap each word in # delimiters and \b word boundaries, case-insensitive.
    $badwords[] = "#\b$badword\b#i";
}

$message = preg_replace($badwords, '*censor*', $message);

Note: I used a loop to build the patterns because the list is too long to write out by hand.
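
If any word in the list might contain regex metacharacters (or a #), the same loop could be written with array_map and preg_quote. This is only a sketch: the sample message is made up, and closures assume PHP 5.3 or later.

```php
<?php
$words = array("bad", "words", "here");

// Build one delimited, case-insensitive pattern per word; preg_quote escapes
// regex metacharacters and the # delimiter if a word happens to contain them.
$badwords = array_map(function ($word) {
    return '#\b' . preg_quote($word, '#') . '\b#i';
}, $words);

$message = 'Bad words go here, not in badwords.'; // hypothetical sample text
$message = preg_replace($badwords, '*censor*', $message);
echo $message, "\n"; // *censor* *censor* go *censor*, not in badwords.
```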

Thanks, and if anyone has any better ways, please share!

2:11 am on Dec 3, 2009 (gmt 0)

themadscientist



Implode, Explode maybe?

$words = array("bad","words","here");
$words = '#\b' . implode('\b#i||#\b', $words) . '\b#i';
$words = explode('||', $words);

print_r($words);

 
