I am trying to set up an array of words and check a chunk of text against it. I always struggle with what to use where. Here's what I have so far:
$notallowed = array("dog", "cat");
using
if (in_array("$message", $notallowed))
{
$result = "Bad word found";
}
else
{
$result = "Looks OK";
}
echo "$result";
If I enter only dog OR cat it finds the word perfectly, but if I enter both it says it's OK. I'm not sure what to use to search for a word within text, because entering "the cat jumped over the wall" is accepted as OK.
I think I need to do something on this line but don't know what:
if (in_array("$message", $notallowed))
Thanks for looking.
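in_array() only reports a match when the whole of $message is identical to one of the array elements, which is why "dog" on its own is caught but a sentence containing it is not. A minimal sketch of a substring check, assuming a helper named containsBadWord() (a hypothetical name, not from the post above), could loop over the list with stripos():

```php
<?php
// Hypothetical helper (name not from the thread): returns true when any
// listed word occurs anywhere inside $message, case-insensitively.
function containsBadWord($message, array $notallowed)
{
    foreach ($notallowed as $word) {
        // stripos() returns false when nothing is found; 0 is a valid
        // position, so the comparison must be strict.
        if (stripos($message, $word) !== false) {
            return true;
        }
    }
    return false;
}

$notallowed = array("dog", "cat");
$message = "the cat jumped over the wall";
$result = containsBadWord($message, $notallowed) ? "Bad word found" : "Looks OK";
echo $result;
```

This matches substrings, so it would also flag a longer word that happens to contain "dog" or "cat".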
// class
<?php
class CheckText
{
    // BAD WORDS FILTER: regex-escaped bad word => censor string of the same length
    private $badWords = array();

    function __construct()
    {
        $badWords = array(
            'aaaa',
            'sssssss',
            'ddddddr'
        );
        $censors = array('$', '@', '#', '*', '£', '^', '!', '%', '&');
        foreach ($badWords as $badWord) {
            // Build the replacement from the raw word so the lengths match
            $replaceStr = '';
            $size = strlen($badWord);
            for ($i = 0; $i < $size; $i++) {
                // Pick one censor character at random for each letter
                $replaceStr .= $censors[array_rand($censors)];
            }
            // Escape regex metacharacters, including the "/" delimiter
            $this->badWords[preg_quote($badWord, '/')] = $replaceStr;
        }
    } // end constructor
    function filter($text)
    {
        foreach ($this->badWords as $badWord => $replaceStr) {
            $text = preg_replace('/' . $badWord . '/i', $replaceStr, $text);
        }
        return $text;
    } // end filter

    ////////////////////////////
    // Clean-up of the message body, kept apart from filter() so that
    // filter() only changes the text when a bad word is present.
    // The method name cleanBody is illustrative.
    function cleanBody($myths_body)
    {
        // Convert 2-byte UTF-8 sequences to numeric HTML entities
        $myths_body = preg_replace_callback('/[\xc0-\xdf]./s', function ($m) {
            return '&#' . ((ord($m[0][0]) - 192) * 64 + (ord($m[0][1]) - 128)) . ';';
        }, $myths_body);
        // Convert 3-byte UTF-8 sequences to numeric HTML entities
        $myths_body = preg_replace_callback('/[\xe0-\xef]../s', function ($m) {
            return '&#' . ((ord($m[0][0]) - 224) * 4096 + (ord($m[0][1]) - 128) * 64 + (ord($m[0][2]) - 128)) . ';';
        }, $myths_body);
        // we del all web related code
        $patterns = array('/www\./', '/http:/', '/\.com/', '/mailto/');
        $replace = array('Un-Authorized link', 'Un-Authorized HTML', 'Un-Authorized extension', 'Un-Authorized email');
        $myths_body = preg_replace($patterns, $replace, $myths_body);
        return $this->fixSlashes($myths_body);
    }

    // Collapse any run of forward slashes and backslashes into a single "/"
    function fixSlashes($input)
    {
        return preg_replace('#[/\\\\]+#', '/', $input);
    }
    /////////////////////////////////
} // end class CheckText
?>
// USAGE on any landing page
<?php
$wordFilter=new CheckText();
$bw = $txt; // keep the original text for comparison
$txt = $wordFilter->filter($txt);
if ($txt != $bw)
{
echo "<h3> The content<p>$txt <p>Contains vulgar words.<br>
Those are replaced by random signs!</h3>
<a href=\"../#*$!#*$!xx.php\"><b>Please, Retry and cccccccccccc</b></a><p>";
exit();
}
?>
$notallowed = array('/dog/', '/cat/');
$message = preg_replace($notallowed, "ERROR", $message);
if (strstr($message, 'ERROR'))
{
$result = "Bad word found";
}
else
{
$result = "Looks OK";
}
echo "$result";
It seems to work. Is this OK?
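One caveat with the snippet above: a message that already contains the text ERROR is reported as bad even when no listed word is present, and $message itself gets overwritten. A possible alternative, sketched here with a hypothetical helper matchesAny() that is not part of the original post, tests each pattern with preg_match() and leaves the message untouched:

```php
<?php
// Hypothetical helper (name not from the thread): reports whether any
// pattern matches, without modifying $message.
function matchesAny($message, array $patterns)
{
    foreach ($patterns as $pattern) {
        // preg_match() returns 1 on a match, 0 on no match
        if (preg_match($pattern, $message) === 1) {
            return true;
        }
    }
    return false;
}

// The "i" modifier is an addition here to make the check case-insensitive
$notallowed = array('/dog/i', '/cat/i');
$message = "An ERROR occurred, nothing else";
$result = matchesAny($message, $notallowed) ? "Bad word found" : "Looks OK";
echo $result; // prints "Looks OK"
```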
The real problem with such a system (including mine) is that it should only consider whole words, not parts of words.
What I mean:
Add one or more letters before a bad word, after it, or both, for example dogaa, and the system still flags dogaa as a bad word.
Nevertheless, dogaa could be a perfectly decent word.
Another example and you'll get it right away: "passenger" is a correct word, but it includes three letters that will trigger the bad-word script.
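One common way to restrict matching to whole words is the \b word-boundary assertion in PCRE. The following is only a sketch under that assumption; the helper name containsWholeBadWord() is hypothetical and not from the thread:

```php
<?php
// Hypothetical helper (name not from the thread): true only when a listed
// word appears as a whole word, not as part of a longer one.
function containsWholeBadWord($message, array $words)
{
    foreach ($words as $word) {
        // \b matches at a word boundary; preg_quote() escapes regex
        // metacharacters and the "/" delimiter inside the word.
        $pattern = '/\b' . preg_quote($word, '/') . '\b/i';
        if (preg_match($pattern, $message) === 1) {
            return true;
        }
    }
    return false;
}

$words = array("dog", "cat");
var_dump(containsWholeBadWord("dogaa could be a decent word", $words)); // bool(false)
var_dump(containsWholeBadWord("the cat jumped over the wall", $words)); // bool(true)
```

Note that \b treats digits and underscores as word characters, so "dog1" or "dog_" would not be flagged either.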