Forum Moderators: coopster & phranque

Ban malicious visitors with this Perl Script

Automatically adds banned IPs to .htaccess file

         

Key_Master

5:24 am on Jun 6, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ok, here's a script I wrote for the Public Domain. It took me 15 minutes to write so it's not the best one ever written but it will do the job.

#!/usr/local/bin/perl

# Name this script trap.pl, upload it in ASCII mode to your cgi-bin and set the file permissions to CHMOD 755.

# This is the only variable that needs to be modified. Replace it with the absolute path to your root directory.
$rootdir = "/home/www/your_root_directory";

# Grab the IP of the bad bot
$visitor_ip = $ENV{'REMOTE_ADDR'};
$visitor_ip =~ s/\./\\\./gi;

# Open .htaccess file
open(HTACCESS,"".$rootdir."/\.htaccess") || die $!;
@htaccess = <HTACCESS>;
close(HTACCESS);

# Write banned IP to .htaccess file
open(HTACCESS,">".$rootdir."/\.htaccess") || die $!;
print HTACCESS "SetEnvIf Remote_Addr \^".$visitor_ip."\$ ban\n";
foreach $deny_ip (@htaccess) {
print HTACCESS $deny_ip;
}
close(HTACCESS);

# Close
print "Content-type: text/html\n\n";
print "<html>\n";
print "<head>\n";
print "<title>Access Denied!</title>\n";
print "<meta name=\"robots\" content=\"noindex,nofollow\">\n";
print "</head>\n";
print "<body>\n";
print "<p><b>Access Denied!</b></p>\n";
print "</body>\n";
print "</html>\n";
exit;

Replace (or create) your .htaccess file with the following text. For the sake of simplicity and usability, this .htaccess file doesn't use the deny from 0.0.0.0 type of entries. All banning should be done using the SetEnvIf directive. Make sure the SetEnvIf lines appear at the top of the .htaccess file, before the following text.

<Files ~ "^.*$">
order allow,deny
allow from all
deny from env=ban
</Files>

[big]Here are a couple of simple ways to auto ban an IP.[/big]

1. As a simple hyperlink:

http://www.yourdomain.com/cgi-bin/trap.pl

2. Create a blank page (can be any name) and use SSI to engage the script. Simply place the following line in the HTML of the blank page you create.

<!--#include virtual="/cgi-bin/trap.pl" -->

[big]How it works.[/big]

Every time a user hits trap.pl their IP is written to the top of the .htaccess file. Your .htaccess file will begin to look something like this:

SetEnvIf Remote_Addr ^65\.29\.81\.103$ ban
<Files ~ "^.*$">
order allow,deny
allow from all
deny from env=ban
</Files>

This script will not overwrite previously banned IPs. In other words, the IP ban is permanent. The <Files></Files> portion of the .htaccess file lets your server know that a banned IP may not visit any portion of your site.

Warning, be prepared! You must be able to access your .htaccess file in the very likely event you ban yourself.

Enjoy!
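
A possible refinement, not part of the original script: the sketch below skips addresses on a hypothetical never-ban list, avoids writing the same IP twice, and takes an exclusive lock so two simultaneous hits cannot clobber each other's write. The names add_ban, ban_line and %never_ban are made up for illustration, and it assumes the .htaccess file already exists.

```perl
use strict;
use warnings;
use Fcntl qw(:flock :seek);

# Hypothetical list of addresses that must never be banned (e.g. your own).
my %never_ban = map { $_ => 1 } ('127.0.0.1');

# Build the same kind of SetEnvIf line the original script writes.
sub ban_line {
    my ($ip) = @_;
    return 'SetEnvIf Remote_Addr ^' . quotemeta($ip) . '$ ban' . "\n";
}

# Prepend a ban for $ip to $file unless it is on the never-ban list
# or already present. Returns 1 if a line was added, 0 otherwise.
sub add_ban {
    my ($file, $ip) = @_;
    return 0 if $never_ban{$ip};
    return 0 unless $ip =~ /\A[0-9a-fA-F.:]+\z/;    # refuse anything not address-like

    open(my $fh, '+<', $file) or die "Cannot open $file: $!";
    flock($fh, LOCK_EX)       or die "Cannot lock $file: $!";

    my @lines = <$fh>;
    my $line  = ban_line($ip);
    if (grep { $_ eq $line } @lines) {
        close($fh);                                 # closing also releases the lock
        return 0;
    }

    seek($fh, 0, SEEK_SET)    or die "Cannot seek in $file: $!";
    truncate($fh, 0)          or die "Cannot truncate $file: $!";
    print {$fh} $line, @lines;                      # new ban first, old content after
    close($fh)                or die "Cannot close $file: $!";
    return 1;
}
```

In trap.pl this could be called as add_ban("$rootdir/.htaccess", $ENV{'REMOTE_ADDR'}) before the HTML is printed.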

incywincy

7:56 am on Jun 6, 2002 (gmt 0)

10+ Year Member



hi km,

i presume that the only way to deploy this mechanism is to build it into a page that isn't accessible from your website (ie an orphaned page) and include it in your robots.txt to try and trap bad spiders.

otherwise you'll be banning each and every visitor (including good bots) that ever visit that page won't you?

Bluestreak

12:54 pm on Jun 6, 2002 (gmt 0)

10+ Year Member



Thanks for the help Key_Master!

I gave it a try, but unfortunately I got a 500 error when I tried to run the script. Don't know why. I have my permissions set correctly. I tried chmodding the .htaccess file to 777, thinking the script might not be able to write to it, but that's a no-go either. Wonder what it could be?

Key_Master

5:18 pm on Jun 6, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's probably the way this forum formats the pipe character (|): it gets displayed as ¦. Replace every ¦ in the script with the proper | symbol.

Also strip the trailing spaces from each line and, if you can, save the file in UNIX (not DOS) format before you upload it to the server.

DrDoc

1:07 am on Jun 7, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



.. and make sure the path to Perl is correct ;)

Bluestreak

3:13 am on Jun 7, 2002 (gmt 0)

10+ Year Member



Ok, finally got home from work :D I followed your instructions by replacing the ¦ in my editor, got rid of the spaces, uploaded and it was finally functional. Weeeeeee!!!

BTW, Key, I noticed you went to my site to access trap.pl, but it's not in my cgi-bin directory. :D

Here's what I'm going to try. I'm going to rename trap.pl to ooooooohhh, say FormMail.pl, and place it in my root directory, while making it expressly forbidden for access in my robots.txt file. From what I could tell, it only forbids access to the particular directory the file is found in, not to the whole site. So I'm assuming that in order to prevent access to the whole site, the best thing would be to place it in root.

I'll let you know if I have problems, but so far I really like what you've done. Thanks for the help!

Bluestreak

3:01 pm on Jun 7, 2002 (gmt 0)

10+ Year Member



After playing around with the script, I noticed the following...

Apparently it doesn't ban the offending bot from the overall site, only from the directory trap.pl is in. So I tried to place it in the root directory, and while I can see the "Access Denied" when attempting access, it doesn't seem to write the offending IP address to the .htaccess file in the root, so I'm still free to surf anywhere in the site. I tried chmodding the .htaccess to 666 and 777, but it still doesn't write the IP address to the file. The weird thing is it writes to the .htaccess just fine in a subdirectory. I'm probably doing something wrong (actually I'm ALWAYS doing something wrong) :D

mdharrold

2:10 am on Jun 8, 2002 (gmt 0)

10+ Year Member



Bluestreak, check to be sure you set the $rootdir correctly. It sounds like you have an additional sub-directory in there.

Bluestreak

2:42 am on Jun 8, 2002 (gmt 0)

10+ Year Member



Yep you're right, I did have an additional subdirectory in there. Man I am such a blithering idiot. :-O

mdharrold

2:49 am on Jun 8, 2002 (gmt 0)

10+ Year Member



If you haven't already done so, you could copy the .htaccess contents of the sub-directory you were pointing to into your rootdir .htaccess to continue to ban the ones you already trapped.

Key_Master

3:21 am on Jun 8, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Bluestreak,

Glad to see this script is working out for you.

To answer incywincy's question, you can hide the link in a 1x1 pixel transparent graphic. See example below. Note the onclick="return false" portion of the HTML. This prevents JavaScript enabled browsers from following the link. I've never had an accidental ban using it.

<a href="/cgi-bin/trap.pl" onclick="return false;" title="Spider Trap"><img src="/transparent.gif" width="1" height="1" alt="Spider Trap" border="0"></a>

You can also use Redirects to ban malicious visitors. I'll use formmail.pl as an example. 301 Redirects should be placed after the <Files></Files> portion of the .htaccess file.

Redirect 301 /cgi-bin/formmail.pl http://www.yourdomain.com/cgi-bin/trap.pl

Here's another example using a fake directory prohibited in robots.txt

Redirect 301 /this_directory_does_not_exist http://www.yourdomain.com/cgi-bin/trap.pl

It's always best to keep your .htaccess file lean and mean though. Don't get carried away with banning different file/directory requests or it will get bloated and cause your site to slow down.

Bluestreak

5:47 am on Jun 8, 2002 (gmt 0)

10+ Year Member



Thanks Key and mdharrold,

I do have a few more questions though. In order to write to .htaccess I've had to set the permissions to 666. Isn't this a security risk? I always have those files set to 644. Leaving my .htaccess set to the evil mark of the devil doesn't put me at ease :-O

Also, with my .htaccess file, I've already got a considerable ban list based on user_agents. How many lines in the .htaccess file would be acceptable before it becomes too bloated? I currently have about 15 lines so far... I'd really like to know, so if I need to I can cut out some lines to slim it down.

mdharrold

6:04 am on Jun 8, 2002 (gmt 0)

10+ Year Member



I have over 100 lines in my .htaccess and have not had it slow down, since I rearranged it, once.

Set at 666 means that anyone can write to it, but only if they were to create a script to do so.

If you want, add the two chmod lines to the script.

Following this line:
open(HTACCESS,"".$rootdir."/\.htaccess") || die $!;
add:
chmod (0666, "$rootdir/.htaccess");

Following these lines:
print HTACCESS $deny_ip;
}
add:
chmod (0644, "$rootdir/.htaccess");

Should take care of any security concerns.

Bluestreak

6:57 am on Jun 8, 2002 (gmt 0)

10+ Year Member



I gave it a whirl, but it didn't work. I could be wrong, but I don't think a script can set permissions unless you have server privileges.....

Key_Master

4:10 pm on Jun 8, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ok, my advice is to leave the file permissions alone- it won't make a difference.

The following line will keep prying eyes away from your .htaccess and .htpasswd files. Place it before the <Files></Files> portion of the .htaccess file.

# Ban .htaccess & .htpasswd requests
SetEnvIfNoCase Request_URI \.ht(access|passwd)$ ban
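
Putting the pieces from this thread together, the finished .htaccess might be laid out roughly like this. It is only a sketch of the ordering described above; the banned IP and the redirect target are the example values used earlier in the thread, so substitute your own.

```apache
# Banned IPs (trap.pl writes each new one at the top of the file)
SetEnvIf Remote_Addr ^65\.29\.81\.103$ ban

# Ban .htaccess & .htpasswd requests
SetEnvIfNoCase Request_URI \.ht(access|passwd)$ ban

<Files ~ "^.*$">
order allow,deny
allow from all
deny from env=ban
</Files>

# Redirects go after the <Files></Files> portion
Redirect 301 /this_directory_does_not_exist http://www.yourdomain.com/cgi-bin/trap.pl
```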

Bluestreak

5:26 pm on Jun 8, 2002 (gmt 0)

10+ Year Member



Ok, so leave the permissions at 666? I was always under the impression those permissions provided a significant security risk the way some paranoid geekazoids talk about it :D
Thanks for all the help. I look forward to your work on BanBots!

Edge

11:34 am on Jun 14, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I would sure love to use this script! I can't, for the sake of my sanity, get this script to write to my root directory .htaccess. How do I determine what element I'm missing in my root path? What standard syntax should be considered normal?

Thanks for sharing this script!

incywincy

11:43 am on Jun 14, 2002 (gmt 0)

10+ Year Member



is it risky using single pixel links? are they considered hidden links by google?

also would you have to worry about spiders getting themselves banned when they crawl your site?

Edge

11:57 am on Jun 14, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've been using a file called /denied.html to track bad robots. The file is only accessible through a graphic link similar to the one described earlier in this thread. I have the file excluded from crawling in my robots.txt and have not had a legitimate robot access /denied.html. This includes Google, incy, etc.

Bluestreak

12:55 pm on Jun 14, 2002 (gmt 0)

10+ Year Member



Make sure you have the .htaccess file permissions set properly so the script can write to it. The standard directory should look like "/home/youraccount/public_html", which would lead to your root directory. That's all you should need. Make sure the script is also in your root directory.

As for the single pixel, what I do is set the link color a shade off the actual text color, so people won't think it's a link unless they look very carefully. That way I won't run the risk of a search engine thinking I have hidden links and penalizing me for it, even though I think it's more hype than reality...

Make sure the script is disallowed from access in your robots.txt file, and you won't have to worry about legitimate bots accidentally getting banned. Only those bots that disobey robots.txt will risk getting banned, which is the whole purpose of the script :D
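
For reference, a robots.txt entry along these lines would tell compliant crawlers to stay away from the trap. The path assumes the script sits at /cgi-bin/trap.pl as in the original post; adjust it if yours lives elsewhere.

```text
User-agent: *
Disallow: /cgi-bin/trap.pl
```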

Edge

2:16 pm on Jun 14, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



When you say that the script should be in my root directory, do you mean the /cgi-bin directory? My Perl scripts will not run outside of the /cgi-bin directory.

Bluestreak

2:39 pm on Jun 14, 2002 (gmt 0)

10+ Year Member



Whoops, my mistake, I'm always assuming everyone can run a script outside their cgi-bin.

That's a good question. I have it running in my root, but I don't think I've been able to get it to work when the script is not in the same directory as the .htaccess file. If that's true, the best you can do is ban the cgi-bin, but that's pointless because you would want the ban to cover your whole site.

Key_Master, do you have an answer for him? Key would know since he's the author.

Key_Master

5:39 pm on Jun 14, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Edge, I'm a little busy now so I'll help you out later on this evening. One thing I noticed (about the site in your profile) is your .htaccess file is a little screwy. For example, I'm able to read it (Edge knows what I'm referring to). It's returning Content-type: text/html when it should be text/plain. I'll help you get that file in order too.

Also, (applies to everyone) the .htaccess file doesn't need to have the file permissions changed to work. Leave it as is.

Frank_Rizzo

10:08 am on Jun 17, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I was going to do something similar to automatically ban password cracking attempts.

re: golden eye, wwwwebwhack etc. These are brute force password cracking tools which flood your login page to try and guess valid login names and passwords.

What the above tools use is proxy lists, sometimes running into the hundreds. So if you check your log files you will see tens of thousands of 401 errors from hundreds of different IPs.

It is not easy banning the lists manually so I was going to do a script that does it automatically. My only concern is the length of the list in .htaccess.

Anyway, the script is fired up from an error trap in the .htaccess:

ErrorDocument 401 /cgi-bin/banbaddie.pl

the banbaddie script logs all 401 errors. If the same IP address is recognised 5 times in 10 minutes then it is added to the banned list. If the IP address has not caused a 401 for at least an hour then it is removed from the list.

This method keeps the length of the .htaccess down and only grows under an attack.
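
Since the script isn't finished, here is only a rough sketch of the decision logic described above: ban after 5 hits inside 10 minutes, release after an hour without a 401. The function names and the in-memory %state layout are hypothetical; a real CGI would have to persist this between requests and then rewrite the ban list in .htaccess.

```perl
use strict;
use warnings;

# Thresholds taken from the description above.
my $MAX_HITS    = 5;          # 401s inside the window that trigger a ban
my $WINDOW_SECS = 10 * 60;    # ten minutes
my $EXPIRE_SECS = 60 * 60;    # one quiet hour lifts the ban

# $state maps IP => { hits => [timestamps], last => time, banned => 0|1 }.
# Records one 401 for $ip at time $now; returns 1 if the IP is banned.
sub record_401 {
    my ($state, $ip, $now) = @_;
    my $entry = $state->{$ip} ||= { hits => [], last => $now, banned => 0 };
    $entry->{last} = $now;

    # Forget hits that have fallen out of the ten-minute window.
    @{ $entry->{hits} } = grep { $now - $_ < $WINDOW_SECS } @{ $entry->{hits} };
    push @{ $entry->{hits} }, $now;

    $entry->{banned} = 1 if @{ $entry->{hits} } >= $MAX_HITS;
    return $entry->{banned};
}

# Drops every IP that has been quiet for an hour or more and
# returns the ones that were banned, so they can be un-banned.
sub expire_bans {
    my ($state, $now) = @_;
    my @released;
    for my $ip (sort keys %$state) {
        next if $now - $state->{$ip}{last} < $EXPIRE_SECS;
        push @released, $ip if $state->{$ip}{banned};
        delete $state->{$ip};
    }
    return @released;
}
```

Each run of the ErrorDocument script would call record_401 with the visitor's IP and time(), then expire_bans, and rewrite the banned lines only when either result changes.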

Other ideas I have are to detect multiple 401 errors within a certain time frame and then go into hokey cokey mode.

This is where a script is fired up which renames the .htaccess in the members area every 10 seconds. The reason is that you want the password cracker to think he has found valid passwords, and basically you pi$$ him off.

What happens is you remove the password protection from the members area for 10 seconds, and then apply it for 10 seconds, then remove it for 10 seconds.... This does NOT inconvenience genuine members as once they have authenticated they are not asked to login again. But it does screw up chummy because his cracking tool will fill up with thousands of what he thinks are valid passwords. And if he does get in whilst the shields are down then he only has 10 seconds to get what he wants which is basically not enough time.

I have not completed these scripts yet but what do you reckon so far?

Edge

12:39 pm on Jun 17, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I got the script to work by using the following for my root directory:

$rootdir = "$ENV{DOCUMENT_ROOT}/";

in place of

$rootdir = "/home/www/your_root_directory";
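
A small sketch of the same idea with a fallback, for servers where DOCUMENT_ROOT is not set. The helper name htaccess_path is hypothetical, and the fallback is just the placeholder path from the original script.

```perl
use strict;
use warnings;

# Return the path of the .htaccess file in the document root.
# Uses $ENV{DOCUMENT_ROOT} when the server provides it, else $fallback.
sub htaccess_path {
    my ($fallback) = @_;
    my $root = $ENV{DOCUMENT_ROOT};
    $root = $fallback unless defined $root && length $root;
    $root =~ s{/+\z}{};    # drop trailing slashes so we never build "//.htaccess"
    return "$root/.htaccess";
}

# Report the path and whether this process can write to it, which helps
# when the script runs but the ban never shows up in the file.
my $path = htaccess_path('/home/www/your_root_directory');
print STDERR "Using $path (", (-w $path ? 'writable' : 'not writable'), ")\n";
```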

This script is sooooo cool!

Thanks Key_Master