Forum Moderators: phranque

Message Too Old, No Replies

Combining commands in htaccess

How do I combine these two things?

         

mikemwe

8:35 pm on Apr 24, 2007 (gmt 0)

10+ Year Member



I'm new to htaccess and want to stop hijackers using the ban list everyone seems to be using:

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} httrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.* - [F,L]

Also I want to add this to prevent from linking to images:

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?example.org/.*$ [NC]
RewriteRule \.(gif¦jpg¦js¦css)$ - [F]

How do I combine them so it works? I don't see anywhere how to keep adding things to an htaccess file. Thanks!

[edited by: jdMorgan at 1:32 pm (utc) on April 25, 2007]
[edit reason] Examplified, formatting tweaked. [/edit]

jdMorgan

1:30 pm on Apr 25, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Putting it into the file exactly as you posted it above is fine.

Be aware of these things:

  • The [L] flag used with [F] is redundant. Your first rule will work fine with just [F] at the end.
  • You, or the person who posted this code originally, have start-anchored all the user-agent names. The result will be that only user-agents which start with the specified string will be blocked. In fact, several or many of those user-agent strings don't start with the specified string, they only contain that string. For those user-agents, you'll need to remove the start anchor "^" character.
  • Many of those user-agents are moribund. I wouldn't waste CPU time on trying to block a user-agent that never abuses your site. Remember that this entire list will be processed for each and every HTTP request to t your server -- That is, for every page, every image, CSS file, etc. Review the server logs for the past year, and include only those user-agents that you actually find in the logs in your list of blocked user-agents.
  • The regular-expressions pattern ".*$" is a waste of CPU time. Behaviour will be identical if you simply omit it entirely. Similarly, "^.*" can be replaced by the equivalent ".*" with no change to function.
  • Posting on this forum modifies the code and makes it invalid. Specifically, a space is required between "!" and any preceding characters, and the broken pipe character "¦" must be changed to a solid pipe character before use.
  • When adding user-agents to the blocking list, be sure that all user-agent RewriteConds have an [OR] flag except for the last one.

    For more information, see the documents cited in our forum charter [webmasterworld.com] (see the cited regular-expressions tutorial) and the tutorials in the Apache forum section of the WebmasterWorld library [webmasterworld.com].

    Jim

  • mikemwe

    2:59 pm on Apr 25, 2007 (gmt 0)

    10+ Year Member



    Thanks jdMorgan. All of that really helps.

    wilderness

    8:18 pm on Apr 25, 2007 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]

    You may also replace all these with a single line:

    RewriteCond %{HTTP_USER_AGENT} ^Web [OR]

    The same reductions apply to Get, Email, Imag, Offline and Super. (Reduce them all to a single line by using "begins with".

    g1smd

    12:54 am on Apr 28, 2007 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    You could contract the list, but will that block anything that you actually wanted to let in?

    Check carefully.