Forum Moderators: open
Same thing today: four straight hits to the index from four different pages where my link is, but 20 minutes later the same IP came back and ripped through all the main pages, taking up to 10 pages a second. I don't know what they're playing at, if it really is them. And I wish I could figure out that spider trap thingy......
# Block bad-bots using lines written by bad_bot.pl script above
SetEnvIf Request_URI "^(/403.*\.html|/robots\.txt)$" allowsome
<Files *>
order deny,allow
deny from env=getout
allow from env=allowsome
</Files>
My present htaccess file begins like this:
RewriteEngine On
Options -Indexes
Options +FollowSymlinks
Then follows a large number of redirects and RewriteCond's. The final lines of the document are:
RewriteCond %{REMOTE_ADDR} ^65\.102\.17\.(3[2-9]|[4-6][0-9]|7[0-1]|8[89]|9[0-5]|10[4-9]|11[01])$
RewriteRule ^.* - [F,L]
ErrorDocument 404 [mysite.com...]
I am wondering where exactly I should put the spider trap lines, what effect (if any) this will have on the existing entries, and whether it will all 'go together'.
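As an aside, the last-octet alternation in that RewriteCond can be sanity-checked outside Apache. A quick sketch (Python used purely for the demonstration, with the pattern copied from the rule above; the intended ranges 32-71, 88-95 and 104-111 are my reading of the alternation):

```python
import re

# The RewriteCond pattern rebuilt as a Python regex so the covered
# last-octet ranges can be verified outside Apache.
pattern = re.compile(
    r"^65\.102\.17\."
    r"(3[2-9]|[4-6][0-9]|7[0-1]|8[89]|9[0-5]|10[4-9]|11[01])$"
)

# Octets the alternation appears intended to cover.
wanted = set(range(32, 72)) | set(range(88, 96)) | set(range(104, 112))

# Try every possible last octet and collect the ones that match.
matched = {n for n in range(256) if pattern.match(f"65.102.17.{n}")}
print(matched == wanted)  # True
```

Running a check like this before uploading is a cheap way to make sure an IP-range ban neither over- nor under-matches.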
> I am wondering where exactly I should put the spider trap lines, what effect (if any) this will have on the existing entries, and whether it will all 'go together'.
Just put the new code you quoted at the beginning of your .htaccess file. It can actually go anywhere, as long as it isn't inserted between a RewriteCond and its following RewriteRule; it's just neater at the top. I do note a problem with the order of your Options and RewriteEngine directives, so here's the whole thing in order, with the two Options lines combined as well.
# Block bad-bots using lines written by bad_bot.pl script above
SetEnvIf Request_URI "^(/403.*\.html|/robots\.txt)$" allowsome
<Files *>
order deny,allow
deny from env=getout
allow from env=allowsome
</Files>
Options -Indexes +FollowSymlinks
RewriteEngine on
# Fri Mar 28 04:59:03 2003 Opera/5.02 (Windows 98; U) [en]
SetEnvIf Remote_Addr ^203\.152\.30\.38$ getout
#
If you wish, you can temporarily change the deny line to
deny from env=nevermind
Since nothing ever sets a 'nevermind' variable, the deny never fires, and you can test without banning yourself.
Keep a backup copy of your original .htaccess; if you ban yourself while testing, or something else goes wrong, just re-upload the backup .htaccess by FTP, and that will get you running again.
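The deny/allow interplay behind that advice can be modelled in a few lines. This is a toy sketch (Python, not Apache's actual code) of how "Order Deny,Allow" resolves the env-conditional Deny and Allow lines, and why keying the deny to a never-set variable disables the ban during testing:

```python
# Toy model of Apache's "Order Deny,Allow": Deny is evaluated first,
# Allow is evaluated last and can override it, and the default when
# neither matches is to allow the request.
def allowed(env, deny_env, allow_env="allowsome"):
    denied = deny_env in env        # matches "deny from env=..."
    allow_hit = allow_env in env    # matches "allow from env=..."
    return allow_hit or not denied

# A banned bot (getout set) is blocked...
print(allowed({"getout"}, deny_env="getout"))               # False
# ...unless it requests robots.txt or the 403 page (allowsome set).
print(allowed({"getout", "allowsome"}, deny_env="getout"))  # True
# Test mode: deny keyed to "nevermind", which nothing sets, bans no one.
print(allowed({"getout"}, deny_env="nevermind"))            # True
```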
However, after correct installation and set-up, this thing works and works very well.
HTH,
Jim
Using the 'nevermind' setting to test things: clicking the link to the disallowed file brings up my custom 404 error page, and no entry is written to .htaccess.
What I have done is create an unobtrusive link to /about.cgi?id=13.
In .htaccess is this line: RedirectPermanent /about.cgi?id=13 [mysite.com...]
I am using trap.cgi as the filename because when I use trap.pl, the icon in the upload program doesn't look right, as if it doesn't recognise the extension. Despite this I tried a few times with trap.pl as the name, but got the same result.
I'm wondering particularly about this line in trap.pl
# Form full pathname to .htaccess file
$htapath = "$htadir"."$htafile";
Is there anything I need to change here, given that key_master's version specifies a full path like this:
# This is the only variable that needs to be modified. Replace it with the absolute path to your root directory.
$rootdir = "/home/www/your_root_directory";
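One thing worth checking with that concatenation: `"$htadir"."$htafile"` only produces a valid path if $htadir ends with a slash (or $htafile starts with one). A quick illustration of the pitfall (Python used just for the demonstration; the directory value is borrowed from key_master's example and is hypothetical):

```python
import os.path

# Hypothetical values mirroring the trap.pl variables.
htadir = "/home/www/your_root_directory"  # note: no trailing slash
htafile = ".htaccess"

# Plain concatenation, like $htapath = "$htadir"."$htafile" in Perl,
# silently produces the wrong path when the slash is missing:
print(htadir + htafile)               # /home/www/your_root_directory.htaccess
# A join that inserts the separator avoids the problem:
print(os.path.join(htadir, htafile))  # /home/www/your_root_directory/.htaccess
```

So if the script's $htadir is set without a trailing slash, either add one or join the parts with an explicit separator.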
That will generate an external 301 redirect, which is not what you want. In fact, mod_alias directives such as RedirectPermanent match against the URL path only, never the query string, so that line will never fire at all. Use mod_rewrite to do a "silent" internal rewrite instead, testing the query string with a RewriteCond:
RewriteCond %{QUERY_STRING} ^id=13$
RewriteRule ^about\.cgi /cgi-bin/trap.cgi [L]
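To see why the mod_alias version can never match, here is a toy illustration (Python, not Apache's actual matching code) of the key difference: mod_alias compares its argument against the request path with the query string already stripped off, which is why the query test has to move into a RewriteCond:

```python
from urllib.parse import urlsplit

# Simplified model of mod_alias matching: the directive's URL-path
# argument is compared against the request *path only*; the query
# string is not part of the comparison.
def alias_matches(directive_path, request_uri):
    request_path = urlsplit(request_uri).path  # "?id=13" is stripped here
    return request_path == directive_path

# "RedirectPermanent /about.cgi?id=13 ..." can never fire:
print(alias_matches("/about.cgi?id=13", "/about.cgi?id=13"))  # False
# whereas a path-only pattern does see the request:
print(alias_matches("/about.cgi", "/about.cgi?id=13"))        # True
```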
HTH,
Jim