New installation of the Bad Bot script

Working a little differently than others specific to a site

         

T_Rex

2:50 am on Sep 21, 2004 (gmt 0)

10+ Year Member



I considered installing my first bad bot script, but had some reservations, discussed here: [webmasterworld.com...]
The two responses from Jim and py9jmas are appreciated.
I reconsidered and now have it going. Got the first one trapped this afternoon, from Connecticut. To me it's exciting; I'm an engineer, not in IT, so I'm slow, ponderous, and dangerous. :-)
Our market is the US and a little of Canada only, because we won't ship our live companion animals. Take this into consideration when you look at these wide-sweeping bans.
The primary motivation for the script was not reducing bandwidth but stopping marauding on our site by jealous competition. It seems they engage in gossip and rumor about our private site on these public "groups" and link to various pages on our site.
So it is basically a four-prong approach:
1) a bad bot script,
2) rewriting referrals from badgossip\.improperly-moderated\.com straight into the trap,
3) rewriting hacker signatures and worms to worms.txt, and
4) blocking with mod_access the entire continents (Asia-Pacific and RIPE) where we don't need to be paying for the bandwidth to feed hungry spiders and customers that could never be.
It's a little of this and that. Before I show you the .htaccess: the env variable is not getout; I changed it to ban. The script is a .cgi modified by Jim in this thread: [webmasterworld.com...]
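To give an idea of what the trap does on a hit, here is a minimal sketch in Python. It is not the actual .cgi; it only assumes the script appends one line per trapped IP in the same shape as the `SetEnvIf Remote_Addr ^0\.0\.0\.0$ ban` placeholder in the .htaccess. The function name `ban_line` and the sample address are made up for illustration.

```python
def ban_line(ip: str, env: str = "ban") -> str:
    """Build a SetEnvIf line that flags one IPv4 address with `env`.

    Hypothetical helper: the real trap script may format its line differently.
    """
    octets = ip.split(".")
    if len(octets) != 4 or not all(o.isdigit() and int(o) <= 255 for o in octets):
        raise ValueError(f"not an IPv4 address: {ip!r}")
    # Escape the dots and anchor both ends so the regex matches this address only.
    pattern = r"\.".join(octets)
    return f"SetEnvIf Remote_Addr ^{pattern}$ {env}"


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address, used here as a stand-in.
    print(ban_line("192.0.2.10"))
```

Anchoring with `^` and `$` matters: without them a pattern for 1.2.3.4 would also match 11.2.3.40.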
I don't want e-mails for each ban. The bait is a .gif in a .htm page, hyperlinked to another .htm that gets the rewrite:
SetEnvIf Referer forums ban
SetEnvIf Referer forum ban
SetEnvIf Referer thread ban
SetEnvIf Referer p3p ban
SetEnvIf Referer web-log ban
SetEnvIf Remote_Addr ^0\.0\.0\.0$ ban
#MORE LIKE ABOVE
# Block bad-bots with cgi script
SetEnvIf Request_URI "^(/403.*\.html|/403.*\.htm|/robots\.txt)$" allowsome
<Files *>
order deny,allow
deny from env=ban
allow from env=allowsome
# Japan
deny from 43.0.0.0/8
# Japan and others
deny from 133.0.0.0/8
# LAC
deny from 200.0.0.0/7
# NORSK
deny from 32.0.0.0/8
# RIPE europe
deny from 62.0.0.0/8
deny from 80.0.0.0/5
deny from 88.0.0.0/8
deny from 193.0.0.0/7
deny from 195.0.0.0/8
deny from 212.0.0.0/7
deny from 217.0.0.0/8
# Asia-Pacific
deny from 58.0.0.0/6
deny from 196.192.0.0/13
deny from 202.0.0.0/7
deny from 210.0.0.0/7
deny from 218.0.0.0/6
deny from 222.0.0.0/8
deny from 169.208.0.0/12
# Misc unwanted in ARIN
deny from blah-blah
deny from blah-blah
</Files>

# Block image inclusion outside our domain except Google, AltaVista, Gigablast, and Comet Systems translators and caches
Options +FollowSymlinks
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain\.org [NC]
RewriteCond %{HTTP_REFERER} !^http://216\.239\.(3[2-9]|[45][0-9]|6[0-3]).*(www\.)?mydomain\.org [NC]
RewriteCond %{HTTP_REFERER} !^http://babel\.altavista\.com/.*(www\.)?myotherdomain\.com [NC]
RewriteCond %{HTTP_REFERER} !^http://216\.243\.113\.1/cgi/
RewriteCond %{HTTP_REFERER} !^http://search.*\.cometsystems\.com/search.*(www\.)?mydomain\.org [NC]
RewriteRule \.(gif|jpg|jpeg?)$ - [NC,F]

# Forbid if blank referer *and* blank UA
RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{HTTP_USER_AGENT} ^$
RewriteRule .* - [F]
# Forbid if *faked* blank referer *or* UA
RewriteCond %{HTTP_REFERER} ^-$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^-$
RewriteRule .* - [F]
# Send em internally to the trap
RewriteCond %{HTTP_REFERER} badgossip\.improperly-moderated\.com/ [NC]
RewriteRule !^cgi\/trap\.cgi$ /cgi/trap.cgi [L]
# Internally re-direct html trap-bait-directory and contents to trap
# Also the templates directory and content
RewriteCond %{REQUEST_URI} directory\-name
RewriteRule ^directory\-name\/.* /cgi/trap.cgi [L]
RewriteCond %{REQUEST_URI} templates
RewriteRule ^templates\/.* /cgi/trap.cgi [L]
# worm and exploit #*$!:
RewriteCond %{REQUEST_URI} \_vti\_ [NC,OR]
RewriteCond %{REQUEST_URI} (/c\+dir|CAPREQ|owssvr|cltreq|script\>|\[drive\-letter\]|\[server\-name\]|NULL) [NC,OR]
RewriteCond %{REQUEST_URI} (nobody|form|mail|cmd|root|autoexec|shell)(2|to)?\.(bat|asp|cgi|exe|php|pl|pm) [NC,OR]
RewriteCond %{REQUEST_URI} (\.\.|\*|'|\(.*\)|\+) [OR]
RewriteCond %{REQUEST_URI} \.ida [NC,OR]
# web-ripping tools:
RewriteCond %{HTTP_USER_AGENT} (curl|Dart.?Communications|Enfish|htdig|Java|larbin) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (FrontPage|Indy.?Library|RPT\-HTTPClient) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (libwww|lwp|PHP|Python|www\.thatrobotsite\.com|webbandit|Wget|Zeus) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Microsoft|MFC).(Data|Internet|URL|WebDAV|Foundation).(Access|Explorer|Control|MiniRedir|Class) [NC,OR]
# Image-grabbers
RewriteCond %{HTTP_USER_AGENT} (AcoiRobot|FlickBot|webcollage) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Express|Mister|Web).?(Web|Pix|Image).?(Pictures|Collector)? [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Image.?(fetch|Stripper|Sucker) [NC,OR]
# higher bandwidth users
RewriteCond %{HTTP_REFERER} iaea\.org [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (LinkWalker|ia_archiver|NPBot) [NC]
RewriteRule .* /worms.txt [L]

<Files .htaccess>
order deny,allow
deny from all
</Files>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>
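
A side note on the "deny from" ranges in the listing: several of them do not start on a boundary for their prefix length (193.0.0.0/7, 58.0.0.0/6, 218.0.0.0/6). The Python sketch below shows what those blocks cover if the host bits are simply masked off. It assumes Apache normalizes an unaligned network the same way Python's `ipaddress` module does with `strict=False`; that is an assumption worth checking against your own server. `DENY_RANGES`, `effective_network`, and `is_denied` are illustrative names, not part of any Apache tool.

```python
import ipaddress

# A few "deny from" ranges copied from the listing above.
DENY_RANGES = ["193.0.0.0/7", "58.0.0.0/6", "218.0.0.0/6", "80.0.0.0/5"]


def effective_network(cidr: str) -> ipaddress.IPv4Network:
    # strict=False masks off the host bits instead of raising ValueError,
    # so an unaligned block is widened to the enclosing aligned network.
    return ipaddress.ip_network(cidr, strict=False)


def is_denied(ip: str, ranges=DENY_RANGES) -> bool:
    """True if `ip` falls inside any of the (normalized) deny ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in effective_network(r) for r in ranges)


if __name__ == "__main__":
    for cidr in DENY_RANGES:
        net = effective_network(cidr)
        # net[0] and net[-1] are the first and last addresses in the block.
        print(f"{cidr:>14} -> {net}  ({net[0]} - {net[-1]})")
```

Under that assumption, 193.0.0.0/7 becomes 192.0.0.0/7 and 218.0.0.0/6 becomes 216.0.0.0/6, so addresses such as 216.239.32.1 would be denied as well. If only the odd-numbered or upper part of a block is wanted, listing narrower prefixes (for example 193.0.0.0/8) avoids the overlap.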

jdMorgan

3:01 am on Sep 21, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



T_Rex,

I'd suggest a modification to the blank-UA blocking code, in order to avoid blocking AOL users:


# Forbid if blank referer *and* blank UA, except for HEAD requests
RewriteCond %{REQUEST_METHOD} !^HEAD$
RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{HTTP_USER_AGENT} ^$
RewriteRule .* - [F]

AOL's proxy cache does HEAD requests with blank referrer and blank UA in order to see if the page has changed. If not, AOL serves the request from their cache. If the page has changed, they reload their cache. Other ISPs may do the same thing, but AOL is the biggest and therefore of most concern.
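
To make the logic of those four lines easy to check, here is a small Python model of the decision. It is only a sketch of the conditions as written above (all three RewriteCond lines must match for the [F] rule to fire); the function name `is_forbidden` is made up for illustration and is not anything Apache provides.

```python
def is_forbidden(method: str, referer: str, user_agent: str) -> bool:
    """Return True when a request would hit the [F] rule.

    Mirrors the three conditions, which are ANDed together:
      REQUEST_METHOD  !^HEAD$
      HTTP_REFERER    ^$
      HTTP_USER_AGENT ^$
    """
    if method == "HEAD":  # HEAD requests are exempt, e.g. a proxy cache check
        return False
    return referer == "" and user_agent == ""


if __name__ == "__main__":
    samples = [
        ("HEAD", "", ""),            # cache validation: allowed
        ("GET", "", ""),             # blank referer and blank UA: forbidden
        ("GET", "", "Mozilla/4.0"),  # blank referer only: allowed
    ]
    for method, referer, ua in samples:
        print(method, repr(referer), repr(ua), "->", is_forbidden(method, referer, ua))
```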

Jim

T_Rex

4:29 pm on Sep 21, 2004 (gmt 0)

10+ Year Member



Thanks a million, Jim. I'll get right on it; I don't want to lose any AOL traffic. So far, the newly banned IP in Connecticut (banned via the .htaccess ban env) came back today for another download attempt and got a 403.
Then I banned the IP of my own computer, which sits behind a different, static ISP at another location, "B", and as desired I can't get in through that ISP to the domain with the trap. With location "B" temporarily banned and test-verified, I could still get around the ban by going first to "Free Private Surfing" at three different web proxy service sites that offer free trial demonstrations. But when I hit the trap through them, I was banned again through them too. Too much!
Ha, this is fun.