I'm trying to ban sites by domain name, since there have been lots of referrer spammers recently.
I have, for example, the rule:
RewriteCond %{HTTP_REFERER} ^http://(www\.)?.*stuff.*\.com/.*$ [NC]
RewriteRule ^.*$ - [F,L]
which should ban any site whose domain contains the word "stuff":
www.stuff.com
www.whatkindofstuff.com
www.some-other-stuff.com
and so on.
However, it is not working, so I am sure I did not set up a proper pattern-match rule. Anyone care to advise?
[edited by: jatar_k at 5:06 am (utc) on May 20, 2003]
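For reference, a minimal sketch of a looser version of the rule above. It assumes the rule lives in a .htaccess file with mod_rewrite available, and that any referrer whose host name contains "stuff" should be refused, whatever the scheme or TLD. One possible reason the original pattern misses is that it insists on http://, a .com ending and a slash right after it; this sketch drops those requirements.

```apache
RewriteEngine On
# Match "stuff" anywhere in the host part of the referrer
# (everything between the scheme and the first slash).
RewriteCond %{HTTP_REFERER} ^https?://[^/]*stuff [NC]
# Refuse the request with 403 Forbidden; [F] ends processing on its own.
RewriteRule .* - [F]
```

Because the host part is matched with [^/]*, a "stuff" that only appears later in the referrer's path does not trigger the block.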
Here is what I tried, that does not work:
Options +FollowSymLinks
RewriteEngine On
# This is the first Rewrite Cond and Rule
RewriteCond %{HTTP_USER_AGENT} ^User_Agent_to_ban$ [NC]
RewriteRule .* /includes/banned.html [L]
# other conditions follow, not covering this one
# Here is part of the main RewriteRule for other conditions:
RewriteRule !^(includes/403\.html|/includes/banned\.html) - [F]
I have allowed /includes/banned\.html in my universal RewriteRule and in my allowit rule.
TIA, Wiz
I have tried browsing my website with this code in place, and not even the 403 page is available to it. As Judge Judy would say: Perfect!
An alternative is to use a script as your 403 error document. That way you can set up even more sophisticated rules, e.g. using databases.
/claus
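A rough sketch of what the script-as-error-document idea could look like. The script path /cgi-bin/forbidden.pl is hypothetical, and the sketch assumes the server is already set up to execute scripts from that location:

```apache
# Hypothetical script path; any handler the server can execute would do.
ErrorDocument 403 /cgi-bin/forbidden.pl

RewriteEngine On
# Let the error script through untouched, otherwise the blocking
# rules further down would forbid the 403 handler as well.
RewriteRule ^cgi-bin/forbidden\.pl$ - [L]
# ... blocking conditions and rules follow here ...
```

The script then sees the usual request variables (REMOTE_ADDR, HTTP_USER_AGENT and so on) and can apply whatever extra logic it likes.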
Total(Gigs) 925.31
This is my total GB transfer so far this month (yes, in 11 days), which is costing me a bloody fortune. I've posted the .htaccess that I added today after reading through this thread (and the prior one), but since I don't understand mod_rewrite or regular expressions very well, I'm hoping that some of the more senior members here can help me refine it.
What I am hoping to accomplish with my .htaccess:
block downloaders/site rippers
block people that try to directly link to the content
block people that have no referrer value
block EVERYONE that has the word "forum" in their referrer
What hasn't been accomplished that I would still like to do:
block referrers from specific TLDs (e.g. .jp .ch .nl)
exploit scan blocking (something similar to this thread, but I can't find anything more specific or more explanatory [webmasterworld.com...])
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^-?$ [NC]
RewriteCond %{HTTP_USER_AGENT} ^-?$ [NC]
RewriteCond %{HTTP_USER_AGENT} ^Web.?(Auto|Cop|dup|Fetch|Filter|Gather|Go|Leach|Mine|Mirror|Pix|QL|RACE|Sauger) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Web.?(site.?(eXtractor|Quester)|Snake|ster|Strip|Suck|vac|walk|Whacker|ZIP) [NC,OR]
#RewriteCond %{HTTP_USER_AGENT} ^(Microsoft|MFC).(Data|URL|WebDAV|Foundation).(Access|Control|MiniRedir|Class) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(BlackWidow|Crescent|Disco.?|ExtractorPro|HTML.?Works|Franklin.?Locator|HLoader|http.?generic|Industry.?Program|IUPUI.?Research.?Bot|Mac.?Finder|NetZIP|NICErsPRO|NPBot|PlantyNet_WebRobot|Production.?Bot|Program.?Shareware|Teleport.?Pro|TurnitinBot|TE|VoidEYE|WebBandit|WebCopier|WEP.?Search|Wget|Zeus) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} cherry.?picker|e?mail.?(collector|extractor|magnet|reaper|siphon|sweeper|harvest|collect|wolf) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Educate.?Search|Full.?Web.?Bot|Indy.?Library|IUFW.?Web [NC,OR]
RewriteCond %{HTTP_USER_AGENT} httrack|larbin|NaverRobot|Siphon|SURF [NC,OR]
RewriteCond %{HTTP_USER_AGENT} efp@gmx\.net [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft.?URL.?Control [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Miss.*g.*.?Locat.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4\.06\ \(Win95;\ I\) [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4\.0\ \(compatible\ ;\ MSIE.? [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4\.0\ \(compatible;\ MSIE\ 5\.00;\ Windows\ 98$ [NC,OR]
# The next lines block NPBot by IP
RewriteCond %{REMOTE_ADDR} ^12\.148\.196\.(12[8-9]|1[3-9][0-9]|2[0-4][0-9]|25[0-5])$ [OR]
RewriteCond %{REMOTE_ADDR} ^12\.148\.209\.(19[2-9]|2[0-4][0-9]|25[0-5])$ [OR]
RewriteCond %{REMOTE_ADDR} ^12\.175\.0\.(3[2-9]|4[0-7])$ [OR]
RewriteCond %{REMOTE_ADDR} ^(203\.186\.145\.225|218\.6\.10\.113|68\.59\.94\.40|66\.75\.128\.202)$ [OR]
RewriteCond %{REMOTE_ADDR} ^210\.192\.(9[6-9]|1[0-1][0-9]|12[0-7])\. [OR]
RewriteCond %{REMOTE_ADDR} ^211\.(1[0-1][4-9])\. [OR]
RewriteCond %{REMOTE_ADDR} ^218\.([0-2][0-9]|[3][0-1])\. [OR]
RewriteCond %{REMOTE_ADDR} ^218\.(5[6-9]|[6-9][0-9])\. [OR]
# Start Cyveillance blocks
RewriteCond %{REMOTE_ADDR} ^63\.148\.99\.2(2[4-9]|[3-4][0-9]|5[0-5])$ [OR]
RewriteCond %{REMOTE_ADDR} ^65\.118\.41\.(19[2-9]|2[0-1][0-9]|22[0-3])$ [OR]
# End Cyveillance blocks
RewriteCond %{HTTP_REFERER} q=guestbook [NC,OR]
RewriteCond %{HTTP_REFERER} iaea\.org [NC]
RewriteRule ^.*$ [badplace.com...]
# Forbid requests for exploits & annoyances
#
# Bad requests
RewriteCond %{REQUEST_METHOD} !^(GET|HEAD|POST) [NC,OR]
# CodeRed
RewriteCond %{REQUEST_URI} ^/default\.(ida|idq) [NC,OR]
RewriteCond %{REQUEST_URI} ^/.*\.printer$ [NC,OR]
# Email
RewriteCond %{REQUEST_URI} (mail.?form|form|form.?mail|mail|mailto)\.(cgi|exe|pl)$ [NC,OR]
# MSOffice
RewriteCond %{REQUEST_URI} ^/(MSOffice|_vti) [NC,OR]
# Nimda
RewriteCond %{REQUEST_URI} /(admin|cmd|httpodbc|nsiislog|root|shell)\.(dll|exe) [NC,OR]
# Various
RewriteCond %{REQUEST_URI} ^/(bin/|cgi/|cgi\-local/|sumthin) [NC,OR]
RewriteCond %{THE_REQUEST} ^GET\ http [NC,OR]
RewriteCond %{REQUEST_URI} /sensepost\.exe [NC]
# Forbid if UA is a single word - case-insensitive, A-Z only
RewriteCond %{HTTP_USER_AGENT} ^[a-z]+$ [NC]
# Some exemptions though...
RewriteCond %{HTTP_USER_AGENT} !^ColdFusion$ [OR]
RewriteCond %{HTTP_USER_AGENT} !^DeepIndex$ [OR]
RewriteCond %{HTTP_USER_AGENT} !^FavOrg$ [OR]
RewriteCond %{HTTP_USER_AGENT} !^MantraAgent$ [OR]
RewriteCond %{HTTP_USER_AGENT} !^MARTINI$ [OR]
# Address harvesters
RewriteCond %{HTTP_USER_AGENT} ^(autoemailspider|ExtractorPro) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^E?Mail.?(Collect|Harvest|Magnet|Reaper|Siphon|Sweeper|Wolf) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (DTS.?Agent|Email.?Extrac) [NC,OR]
RewriteCond %{HTTP_REFERER} iaea\.org [NC,OR]
# Download managers
RewriteCond %{HTTP_USER_AGENT} ^(Alligator|DA.?[0-9]|DC\-Sakura|Download.?(Demon|Express|Master|Wonder)|FileHound) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(Flash|Leech)Get [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(Fresh|Lightning|Mass|Real|Smart|Speed|Star).?Download(er)? [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(Gamespy|Go!Zilla|iGetter|JetCar|Net(Ants|Pumper)|SiteSnagger|Teleport.?Pro|WebReaper) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(My)?GetRight [NC,OR]
# Image-grabbers
RewriteCond %{HTTP_USER_AGENT} ^(AcoiRobot|FlickBot|webcollage) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(Express|Mister|Web).?(Web|Pix|Image).?(Pictures|Collector)? [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Image.?(fetch|Stripper|Sucker) [NC,OR]
# "Gray-hats"
RewriteCond %{HTTP_USER_AGENT} ^(Atomz|BlackWidow|BlogBot|EasyDL|Marketwave|Sqworm|SurveyBot|Webclipping\.com) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (girafa\.com|gossamer\-threads\.com|grub\-client|Netcraft|Nutch) [NC,OR]
# Site-grabbers
RewriteCond %{HTTP_USER_AGENT} ^(eCatch|(Get|Super)Bot|Kapere|HTTrack|JOC|Offline|UtilMind|Xaldon) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Web.?(Auto|Cop|dup|Fetch|Filter|Gather|Go|Leach|Mine|Mirror|Pix|QL|RACE|Sauger) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Web.?(site.?(eXtractor|Quester)|Snake|ster|Strip|Suck|vac|walk|Whacker|ZIP) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WebCapture [NC,OR]
# Tools
RewriteCond %{HTTP_USER_AGENT} ^(curl|Dart.?Communications|Enfish|htdig|Java|larbin) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (FrontPage|Indy.?Library|RPT\-HTTPClient) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(libwww|lwp|PHP|Python|www\.thatrobotsite\.com|webbandit|Wget|Zeus) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(Microsoft|MFC).(Data|Internet|URL|WebDAV|Foundation).(Access|Explorer|Control|MiniRedir|Class) [NC,OR]
# Unknown
RewriteCond %{HTTP_USER_AGENT} ^(Crawl_Application|Lachesis|Nutscrape) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^[CDEFPRS](Browse|Eval|Surf) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(Demo|Full.?Web|Lite|Production|Franklin|Missauga|Missigua).?(Bot|Locat) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (efp@gmx\.net|hhjhj@yahoo\.com|lerly\.net|mapfeatures\.net|metacarta\.com) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(Industry|Internet|IUFW|Lincoln|Missouri|Program).?(Program|Explore|Web|State|College|Shareware) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(Mac|Ram|Educate|WEP).?(Finder|Search) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(Moz+illa|MSIE).?[0-9]?.?[0-9]?[0-9]?$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/[0-9]\.[0-9][0-9]?.\(compatible[\)\ ] [NC,OR]
RewriteCond %{HTTP_USER_AGENT} NaverRobot [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?.*forum*.*\.com [NC]
RewriteRule ^.*$ [aaa.sitethatdoesstuffworsethenmine.com...]
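A note on the single-word user-agent exemptions in the rules above. As far as I know, mod_rewrite ANDs consecutive RewriteCond lines unless one carries [OR], and a run of negated conditions joined by [OR] is true for every user-agent, since no UA can equal all of the exempted names at once. A minimal sketch of one way to express the exemption, assuming it is kept as its own rule and that a plain 403 is the wanted response:

```apache
RewriteEngine On
# UA is a single word, letters only (case-insensitive) ...
RewriteCond %{HTTP_USER_AGENT} ^[a-z]+$ [NC]
# ... AND it is not one of the exempted names (no [OR], so this is ANDed).
RewriteCond %{HTTP_USER_AGENT} !^(ColdFusion|DeepIndex|FavOrg|MantraAgent|MARTINI)$
RewriteRule .* - [F]
```

Keeping this in a separate rule also avoids mixing its AND logic into the long [OR] chain, where a single missing or extra [OR] changes which requests get blocked.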
I'm hoping that some of the more senior members here can help me refine it
This board says I'm a Junior Member, but since no one else has posted yet, I figured I might as well.
Okay, I guess I'll just go down the list:
block downloaders/site rippers
I'm not quite sure about who you want to block, so I'll ignore this for now.
block people that try to directly link to the content
Generally, you'll use something like this:
RewriteCond %{HTTP_REFERER} !(www\.)?mysite\.com/ [NC]
That would check to see if the referrer is not from your site.
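Put together as a complete rule, it might look something like the sketch below. The domain mysite.com is the same placeholder as above, the list of file extensions is only an assumption about what "the content" is, and the first condition assumes you want to let blank referrers through (drop it if you want those blocked as well):

```apache
RewriteEngine On
# Skip requests that send no referrer at all.
RewriteCond %{HTTP_REFERER} !^$
# The referrer is something other than this site (placeholder domain).
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?mysite\.com(/|$) [NC]
# Only protect the content files; the extension list is an example.
RewriteRule \.(gif|jpe?g|png)$ - [F,NC]
```

Anchoring the pattern at the start with ^https?:// keeps a foreign URL that merely contains "mysite.com/" somewhere in its path from slipping through.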
block people that have no referrer value
Your code:
RewriteCond %{HTTP_REFERER} ^-?$ [NC]
block EVERYONE that has the word "forum" in their referrer
Your code:
RewriteCond %{HTTP_REFERER} ^http://(www\.)?.*forum*.*\.com [NC]
Since you want to block any referrer that contains forum, this should work for your purposes:
RewriteCond %{HTTP_REFERER} forum [NC]
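For what it's worth, the original pattern's "forum*" means "foru" followed by zero or more "m", and the rest of it only matches http:// referrers on .com domains. A short sketch of the simpler condition in a full rule, assuming a 403 is the wanted response and using mysite.com as a placeholder so that your own pages with "forum" in the URL are not caught:

```apache
RewriteEngine On
# "forum" anywhere in the referrer, any case.
RewriteCond %{HTTP_REFERER} forum [NC]
# ... but not when the referrer is this site itself (placeholder domain).
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?mysite\.com(/|$) [NC]
RewriteRule .* - [F]
```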
block referrers from specific TLDs (e.g. .jp .ch .nl)
I'll let you start this one first. It's not that hard.
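If you want something to compare your attempt against, here is one possible sketch. It assumes the three TLDs from the list above and a plain 403 response; the pattern only looks at the host part of the referrer:

```apache
RewriteEngine On
# Host name ends in .jp, .ch or .nl, optionally followed by a port,
# then either a slash or the end of the referrer string.
RewriteCond %{HTTP_REFERER} ^https?://[^/]+\.(jp|ch|nl)(:[0-9]+)?(/|$) [NC]
RewriteRule .* - [F]
```

Because [^/]+ cannot cross a slash, a referrer such as http://example.com/page.jp is not affected.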
exploit scan blocking (something similar to this thread, but I can't find anything more specific or more explanatory [webmasterworld.com...])
Not quite sure what you mean, so I'll ignore it for now. I'm not a preferred member, so I don't have access to the thread you quoted.
In case you're stuck, here are some references I'd recommend:
mod_rewrite [httpd.apache.org]
Regular expressions [etext.lib.virginia.edu]