SetEnvIf User-Agent

How to Ban Bots with this Tool

         

frontpage

12:02 am on Mar 25, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't have Mod_Rewrite on my Apache server.

However, I too would like to Ban Bad Bots.

Does anyone have an example .htaccess that they put together that uses the SetEnvIf User-Agent method of banning bots?

Please post it if you got it!

littleman

12:19 am on Mar 25, 2002 (gmt 0)



setenvif User-Agent ^BadBot getout
Order Allow,Deny
Allow from=all
Deny from env=getout

frontpage

12:25 am on Mar 25, 2002 (gmt 0)




Thanks for the response. Is your example comprehensive and as it appears in your .htaccess file?

What I mean is that you posted:

setenvif User-Agent ^BadBot getout
Order Allow,Deny
Allow from=all
Deny from env=getout

Where is ^BadBot defined? Do you have a list? How do you define a ^BadBot?

To be more specific, I am looking for someone's already fully functioning .htaccess file.

bobriggs

12:34 am on Mar 25, 2002 (gmt 0)




Additionally, you'll need Apache version 1.3.13 or higher to use SetEnvIf in .htaccess

Apache mod_setenvif [httpd.apache.org]

Key_Master

12:35 am on Mar 25, 2002 (gmt 0)




This is a functional .htaccess file:


SetEnvIf User-Agent ^BadBot getout
<Limit GET POST>
order allow,deny
deny from env=getout
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>
<Files ~ "\.htaccess$">
order deny,allow
deny from all
</Files>

littleman

12:45 am on Mar 25, 2002 (gmt 0)



:) I guess I should do a bit of an explanation. 'BadBot' would be the user-agent name. It will work with partial matches. If you had:
setenvif User-Agent BadBot getout
Then 'aBadBot' and 'BadBot/1.11' would both be denied. The '^' means the beginning of the string so:
setenvif User-Agent ^BadBot getout
will block 'BadBot' and 'BadBot/2.1', but not 'MyBadBot'.

This would go into your .htaccess file. If you want to ban multiple bots you would do multiple setenvifs, example:
setenvif User-Agent ^BadBot getout
setenvif User-Agent ^SpamBot getout
setenvif User-Agent ^Iwontyourimages getout
setenvif User-Agent ^Slurp getout
Order Allow,Deny
Allow from=all
Deny from env=getout

You could also set it up for specific files:
setenvif User-Agent ^Anotherbadone keep_out
<Files ~ "(\.jpg|\.gif|\.what_ever_else)$">
order allow,deny
allow from all
deny from env=keep_out
</Files>
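
A minimal sketch of how the pieces above might sit together in one .htaccess, assuming mod_setenvif and mod_access are both available. The bot names and the variable name getout are placeholders from this thread, not a vetted list. SetEnvIfNoCase is the case-insensitive variant of SetEnvIf.

```apache
# Anchored match: flags "BadBot" and "BadBot/2.1", but not "MyBadBot".
SetEnvIf User-Agent ^BadBot getout

# Unanchored, case-insensitive match: flags any User-Agent that contains
# "spambot" in any mix of upper and lower case.
SetEnvIfNoCase User-Agent spambot getout

# Allow is evaluated first, then Deny, so flagged requests are refused.
Order Allow,Deny
Allow from all
Deny from env=getout
```

Reusing one variable name for every pattern keeps the access block down to a single Deny line.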

frontpage

1:45 am on Mar 25, 2002 (gmt 0)




Do you think this .htaccess file will work?

setenvif User-Agent ^attach getout
setenvif User-Agent ^BackWeb getout
setenvif User-Agent ^Bandit getout
setenvif User-Agent ^BatchFTP getout
setenvif User-Agent ^Bot\ mailto:craftbot@yahoo.com getout
setenvif User-Agent ^Buddy getout
setenvif User-Agent ^ChinaClaw getout
setenvif User-Agent ^Collector getout
setenvif User-Agent ^Copier getout
setenvif User-Agent ^DA getout
setenvif User-Agent ^DISCo\ Pump getout
setenvif User-Agent ^Download\ Demon getout
setenvif User-Agent ^Download\ Wonder getout
setenvif User-Agent ^Downloader getout
setenvif User-Agent ^Drip getout
setenvif User-Agent ^eCatch getout
setenvif User-Agent ^EirGrabber getout
setenvif User-Agent ^Express\ WebPictures getout
setenvif User-Agent ^ExtractorPro getout
setenvif User-Agent ^EyeNetIE getout
setenvif User-Agent ^FileHound getout
setenvif User-Agent ^FlashGet getout
setenvif User-Agent ^GetRight getout
setenvif User-Agent ^GetSmart getout
setenvif User-Agent ^Go!Zilla getout
setenvif User-Agent ^Go-Ahead-Got-It getout
setenvif User-Agent ^gotit getout
setenvif User-Agent ^Grabber getout
setenvif User-Agent ^GrabNet getout
setenvif User-Agent ^Grafula getout
setenvif User-Agent ^HMView getout
setenvif User-Agent ^HTTrack getout
setenvif User-Agent ^InterGET getout
setenvif User-Agent ^Internet\ Ninja getout
setenvif User-Agent ^Iria getout
setenvif User-Agent ^JetCar getout
setenvif User-Agent ^JOC getout
setenvif User-Agent ^JustView getout
setenvif User-Agent ^larbin getout
setenvif User-Agent ^LeechFTP getout
setenvif User-Agent ^lftp getout
setenvif User-Agent ^likse getout
setenvif User-Agent ^Magnet getout
setenvif User-Agent ^Mag-Net getout
setenvif User-Agent ^Mass\ Downloader getout
setenvif User-Agent ^Memo getout
setenvif User-Agent ^MIDown\ tool getout
setenvif User-Agent ^Mirror getout
setenvif User-Agent ^Mister\ PiX getout
setenvif User-Agent ^Navroad getout
setenvif User-Agent ^NearSite getout
setenvif User-Agent ^NetAnts getout
setenvif User-Agent ^NetSpider getout
setenvif User-Agent ^Net\ Vampire getout
setenvif User-Agent ^NetZip getout
setenvif User-Agent ^Ninja getout
setenvif User-Agent ^Octopus getout
setenvif User-Agent ^Offline\ Explorer getout
setenvif User-Agent ^PageGrabber getout
setenvif User-Agent ^Papa\ Foto getout
setenvif User-Agent ^pcBrowser getout
setenvif User-Agent ^Pockey getout
setenvif User-Agent ^Pump getout
setenvif User-Agent ^RealDownload getout
setenvif User-Agent ^Reaper getout
setenvif User-Agent ^Recorder getout
setenvif User-Agent ^ReGet getout
setenvif User-Agent ^Siphon getout
setenvif User-Agent ^SiteSnagger getout
setenvif User-Agent ^SmartDownload getout
setenvif User-Agent ^Snake getout
setenvif User-Agent ^SpaceBison getout
setenvif User-Agent ^Stripper getout
setenvif User-Agent ^Sucker getout
setenvif User-Agent ^SuperBot getout
setenvif User-Agent ^SuperHTTP getout
setenvif User-Agent ^Surfbot getout
setenvif User-Agent ^tAkeOut getout
setenvif User-Agent ^Teleport\ Pro getout
setenvif User-Agent ^Vacuum getout
setenvif User-Agent ^VoidEYE getout
setenvif User-Agent ^Web\ Image\ Collector getout
setenvif User-Agent ^Web\ Sucker getout
setenvif User-Agent ^WebAuto getout
setenvif User-Agent ^WebCopier getout
setenvif User-Agent ^WebFetch getout
setenvif User-Agent ^WebReaper getout
setenvif User-Agent ^WebSauger getout
setenvif User-Agent ^Website getout
setenvif User-Agent ^Webster getout
setenvif User-Agent ^WebStripper getout
setenvif User-Agent ^WebWhacker getout
setenvif User-Agent ^WebZIP getout
setenvif User-Agent ^Wget getout
setenvif User-Agent ^Whacker getout
setenvif User-Agent ^Widow getout
setenvif User-Agent ^Xaldon getout
Order Allow,Deny
Allow from=all
Deny from env=getout

littleman

2:54 am on Mar 25, 2002 (gmt 0)



One thing, and it is my typo. Switch 'Allow from=all' to 'Allow from all'.

frontpage

3:05 am on Mar 25, 2002 (gmt 0)




I am getting an internal server error when using this .htaccess file, even with the "allow" section corrected.

What should the file permission be set to? 644?

Am I missing something in the .htaccess file like /files at the end?

Thanks

Key_Master

3:36 am on Mar 25, 2002 (gmt 0)




frontpage,

An escaped space (\ ) will not work for this purpose in .htaccess

Try replacing them with a period (.) and escape any real periods. Example:

SetEnvIf User-Agent ^Bot.mailto:craftbot@yahoo\.com getout

Also I would try using the code below.

<Limit GET POST>
order allow,deny
deny from env=getout
allow from all
</Limit>

bobriggs

3:37 am on Mar 25, 2002 (gmt 0)




which version of apache server are you using?

littleman

5:34 am on Mar 25, 2002 (gmt 0)



I think the error may be because of the whitespace in some of the user agents. I am not sure if '\s' works with SetEnvIf or not -- investigating.

Key_Master

5:41 am on Mar 25, 2002 (gmt 0)




Yeah, I tried \s but it didn't work. There has to be something better but until then, an unescaped period will work.

littleman

6:00 am on Mar 25, 2002 (gmt 0)



Got it:
setenvif User-Agent "^Web Image Collector" getout

Quotations will get apache to recognize the spaces in a string.

Key_Master

6:05 am on Mar 25, 2002 (gmt 0)




Beat me to it! And wildcards are still allowed. ;)

littleman

6:07 am on Mar 25, 2002 (gmt 0)



Some people do crossword puzzles, we play with apache settings.

bird

11:55 am on Mar 25, 2002 (gmt 0)




setenvif User-Agent ^SpaceBison getout

Are you sure you want to block Proxomitron?
(unless you need to keep Brett [searchengineworld.com] out of your site, of course... ;))

frontpage

12:50 pm on Mar 25, 2002 (gmt 0)




Here is my revised .htaccess. What do you think?

setenvif User-Agent ^attach getout
setenvif User-Agent ^BackWeb getout
setenvif User-Agent ^Bandit getout
setenvif User-Agent ^BatchFTP getout
SetEnvIf User-Agent ^Bot.mailto:craftbot@yahoo\.com getout
setenvif User-Agent ^Buddy getout
setenvif User-Agent ^ChinaClaw getout
setenvif User-Agent ^Collector getout
setenvif User-Agent ^Copier getout
setenvif User-Agent ^DA getout
setenvif User-Agent "^DISCo Pump" getout
setenvif User-Agent "^Download Demon" getout
setenvif User-Agent "^Download Wonder" getout
setenvif User-Agent ^Downloader getout
setenvif User-Agent ^Drip getout
setenvif User-Agent ^eCatch getout
setenvif User-Agent ^EirGrabber getout
setenvif User-Agent "^Express WebPictures" getout
setenvif User-Agent ^ExtractorPro getout
setenvif User-Agent ^EyeNetIE getout
setenvif User-Agent ^FileHound getout
setenvif User-Agent ^FlashGet getout
setenvif User-Agent ^GetRight getout
setenvif User-Agent ^GetSmart getout
setenvif User-Agent ^Go!Zilla getout
setenvif User-Agent ^Go-Ahead-Got-It getout
setenvif User-Agent ^gotit getout
setenvif User-Agent ^Grabber getout
setenvif User-Agent ^GrabNet getout
setenvif User-Agent ^Grafula getout
setenvif User-Agent ^HMView getout
setenvif User-Agent ^HTTrack getout
setenvif User-Agent ^InterGET getout
setenvif User-Agent "^Internet Ninja" getout
setenvif User-Agent ^Iria getout
setenvif User-Agent ^JetCar getout
setenvif User-Agent ^JOC getout
setenvif User-Agent ^JustView getout
setenvif User-Agent ^larbin getout
setenvif User-Agent ^LeechFTP getout
setenvif User-Agent ^lftp getout
setenvif User-Agent ^likse getout
setenvif User-Agent ^Magnet getout
setenvif User-Agent ^Mag-Net getout
setenvif User-Agent "^Mass Downloader" getout
setenvif User-Agent ^Memo getout
setenvif User-Agent "^MIDown tool" getout
setenvif User-Agent ^Mirror getout
setenvif User-Agent "^Mister PiX" getout
setenvif User-Agent ^Navroad getout
setenvif User-Agent ^NearSite getout
setenvif User-Agent ^NetAnts getout
setenvif User-Agent ^NetSpider getout
setenvif User-Agent "^Net Vampire" getout
setenvif User-Agent ^NetZip getout
setenvif User-Agent ^Ninja getout
setenvif User-Agent ^Octopus getout
setenvif User-Agent "^Offline Explorer" getout
setenvif User-Agent ^PageGrabber getout
setenvif User-Agent "^Papa Foto" getout
setenvif User-Agent ^pcBrowser getout
setenvif User-Agent ^Pockey getout
setenvif User-Agent ^Pump getout
setenvif User-Agent ^RealDownload getout
setenvif User-Agent ^Reaper getout
setenvif User-Agent ^Recorder getout
setenvif User-Agent ^ReGet getout
setenvif User-Agent ^Siphon getout
setenvif User-Agent ^SiteSnagger getout
setenvif User-Agent ^SmartDownload getout
setenvif User-Agent ^Snake getout
setenvif User-Agent ^Stripper getout
setenvif User-Agent ^Sucker getout
setenvif User-Agent ^SuperBot getout
setenvif User-Agent ^SuperHTTP getout
setenvif User-Agent ^Surfbot getout
setenvif User-Agent ^tAkeOut getout
setenvif User-Agent "^Teleport Pro" getout
setenvif User-Agent ^Vacuum getout
setenvif User-Agent ^VoidEYE getout
setenvif User-Agent "^Web Image Collector" getout
setenvif User-Agent "^Web Sucker" getout
setenvif User-Agent ^WebAuto getout
setenvif User-Agent ^WebCopier getout
setenvif User-Agent ^WebFetch getout
setenvif User-Agent ^WebReaper getout
setenvif User-Agent ^WebSauger getout
setenvif User-Agent ^Website getout
setenvif User-Agent ^Webster getout
setenvif User-Agent ^WebStripper getout
setenvif User-Agent ^WebWhacker getout
setenvif User-Agent ^WebZIP getout
setenvif User-Agent ^Wget getout
setenvif User-Agent ^Whacker getout
setenvif User-Agent ^Widow getout
setenvif User-Agent ^Xaldon getout
<Limit GET POST>
Order Allow,Deny
Allow from all
Deny from env=getout
</Limit>

Mark_A

8:42 am on May 17, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Looks like I found a similar thread :-)

Trying to keep out a spammer that was using an old formmail to send spam ..
Have fixed formmail so we just log his requests, now want to deny just him access using his user agent as identity.

the end of his log entries look like:

HTTP/1.1" 200 1059 "-" "Microsoft URL Control - 6.00.8862"

So after reading this post I put an htaccess file in the document root like

setenvif User-Agent "Microsoft URL Control - 6.00.8862" sodoff
<Limit GET POST>
Order allow,deny
allow from all
deny from env=sodoff
</Limit>

But for some reason he is still here and still getting 200 rather than the 404 or 403 I want him to get...

Any tips?

jdMorgan

6:38 pm on May 17, 2002 (gmt 0)




As littleman pointed out back in March, partial matches can be used. One thing I like to do is to "wildcard" some of these directives to reduce the number required to block all these spambot UAs. I use the mod_rewrite method, but the principle is the same if you use mod_access. For example, instead of blocking "Download Demon" and "Download Wonder" using separate directives, I just nail them both by blocking "^Download\ ".

Similarly, there's no reason to include the version number for larbin and Microsoft URL Control, unless you really want to exclude just one version, and don't mind having to "upgrade" your blocking list every time a new version is released (unleashed) upon us.
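
A rough sketch of that idea translated to SetEnvIf syntax, for servers without mod_rewrite. It assumes the quoting technique littleman found for patterns containing spaces, and reuses the getout variable from earlier in the thread. Treat the patterns as illustrations rather than a tested block list.

```apache
# One prefix rule in place of separate "Download Demon" and
# "Download Wonder" lines; the quotes keep the trailing space in the pattern.
SetEnvIf User-Agent "^Download " getout

# No version number in the pattern, so later releases are still matched.
SetEnvIf User-Agent "^Microsoft URL Control" getout
SetEnvIf User-Agent ^larbin getout
```

Because of the trailing space, the first pattern does not match "Downloader", so that entry would still need its own line.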

Also, you can use full RegExp (Posix Regular Expressions) pattern matching by preceding your match string with a tilde (~) - at least in Apache 1.3 and later.

If you are going to use the allow,deny methods of mod_access, the order of allow and deny is critical. See: [httpd.apache.org...]
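
As a hedged illustration of why the order matters, here is a sketch based on the mod_access documentation as I read it. The variable name getout is the placeholder used earlier in this thread.

```apache
# Order Allow,Deny: Allow rules are checked first, then Deny rules.
# A request matching both is denied, and a request matching neither
# is also denied. Flagged bots are therefore refused here.
Order Allow,Deny
Allow from all
Deny from env=getout

# For contrast (left commented out): with Order Deny,Allow the Deny rules
# are checked first and the Allow rules last, so a request matching both
# is allowed. "Allow from all" would then let flagged bots back in.
# Order Deny,Allow
# Deny from env=getout
# Allow from all
```

The sequence of the Allow and Deny lines in the file does not change the outcome; only the Order directive does.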

This may explain your trouble, Mark_A

HTH,

Jim

Mark_A

7:14 pm on May 17, 2002 (gmt 0)




Thanks jdMorgan for your response.

A couple of people have said that I should use mod_rewrite rather than mod_access and also that there is little point using the limit command.

Trouble is I am new to apache and therefore am getting to grips with the terminology.

It seems many of the varied options available to achieve a basic result (in this case keep someone out who is up to no good) can be carried out either in the httpd.conf file, requiring a quick server restart to take effect, or in an .htaccess file in the root, which takes effect the second it is uploaded.

I am only at the stage of an early understanding of each method, and find the new terminology, including that used at the Apache site, means I am not at all sure which provides the least server load and which should be implemented where ... httpd.conf or .htaccess etc ..

For example you mentioned
"^Download\ ".

so I understand there is a whole array of codes or wildcards like:

\^~

which have meaning to apache but they don't resemble dos or windows etc .. so I need some tutorial to get my grammar right .. any pointers?

I have been reading around the apache site but not found the glossary of codes or wildcards etc yet ..

One of the other things that immediately bugged me with my apache VS is that there are directories set up by the host which are like a windows "my documents" directory, they are echoed from elsewhere.... thus there appears duplication all over the flipping place .. enough gripes about it, getting the hang of it slowly :-)

Key_Master

11:46 pm on May 17, 2002 (gmt 0)




Not all servers are configured to protect /cgi-bin/ as diligently as the other directories. Why, I don't know.

Try:

SetEnvIfNoCase Request_URI formmail ban
<Files ~ "\.cgi$|\.pl$">
order deny,allow
deny from env=ban
</Files>

Also, if you don't have mod_access, you might have to use <limit></limit> or something else like <files></files> to ban 'em.

Order Allow,Deny
Allow from=all
Deny from env=ban

by itself gives an internal server error on three different servers I tested on (none have mod_access privileges). When used with the Limit and Files directives, they did work. Would be interested in other people's experiences with this. I'm developing software that needs to be as backwards compatible as possible.

Mark_A

11:58 pm on May 17, 2002 (gmt 0)




>Not all servers are configured to protect /cgi-bin/ as diligently as the other directories. Why, I don't know. <

I wonder if it is something like the sort of echo files and dirs issue ... the cgi-bin is virtual .. set up as an alias or something iirc .. and is "actually" outside the doc root....

So perhaps order deny allow etc in an htaccess in the docroot will not protect it?

BTW I noticed that in a htaccess password protected directory area the "serve index.html" command had stopped and I was being served directory lists... why would that be, works in all the normal directories?

Key_Master wrt

SetEnvIfNoCase Request_URI formmail ban
<Files ~ "\.cgi$|\.pl$">
order deny,allow
deny from env=ban
</Files>

Why would that ban only external users and not my site accessing the script also?

Key_Master

12:04 am on May 18, 2002 (gmt 0)




>>>Why would that ban only external users and not my site accessing the script also?

Oh...I assumed formmail.cgi didn't exist on your server. You should rename it at once. Microsoft Url Control is not the only user agent seeking to exploit this script.

jdMorgan

3:07 am on May 19, 2002 (gmt 0)




Mark_A,

mod_rewrite uses Posix(tm) "regular expressions" - an old and very powerful string-and-character-matching "language".

It doesn't have simple wildcards like the DOS "*.*" but rather uses "." to mean "any character". Then, it expands that, allowing any number of any character, which would be ".*" . Note that the asterisk here means "any number of the preceding character", which in this case is "any character". So the final meaning is "any number of any characters". That's probably what throws off most people familiar with DOS-type wildcards.

Regular expressions, or RegEx for short, are a book unto themselves. There is a handy glossary table on the Apache site inside the version 1.3 mod_rewrite document. Start there, and go through some of the "ban bot" examples in this thread and elsewhere here on wmw, decoding them for practice. After a while you can read them pretty easily.

Another bit of confusion comes up with the ^ and $ string delimiters. ^ means "starts with" and $ means, "ends with". So, you can specify the exact string, "Download", with "^Download$" - note that the quotes are mine, and need not appear in a RewriteCond directive.

You could also specify "^Download", which means, "Starts with 'Download', and may or may not end with more characters." Or, you could use "Demon$", which would match anything ending with "Demon". And then to wrap up the delimiter examples, you could use, "^Download.*Demon$" which would match anything starting with "Download", ending with "Demon", and having any or no characters in between.

In order to say "I want to match a period", you have to "escape" the period, otherwise, as above, the period would mean "any character". The same is true of spaces, which need to be escaped, otherwise they mean "end of the string-to-be-matched specification". The dollar sign and several other characters with special meaning to RegEx need to be escaped, too, to avoid confusing RegEx processing. To escape them, you use "\". So, if you wanted to match the exact string "Download.Demon Web$ucker", you would write that as "^Download\.Demon\ Web\$ucker$".
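
To put the anchor and escaping rules into a complete rule set, here is a minimal sketch that assumes mod_rewrite is installed and enabled for .htaccess use. The user-agent strings are the made-up examples from this post, and the RewriteRule line is my own addition to show one common way of returning a 403.

```apache
RewriteEngine On

# Starts with "Download", ends with "Demon", anything (or nothing) between.
RewriteCond %{HTTP_USER_AGENT} ^Download.*Demon$ [OR]
# Exactly "Download.Demon Web$ucker": period, space and dollar sign escaped.
RewriteCond %{HTTP_USER_AGENT} ^Download\.Demon\ Web\$ucker$
# Matching requests get 403 Forbidden; "-" leaves the URL unchanged.
RewriteRule .* - [F]
```

The [OR] flag joins the two conditions, so a request matching either pattern is refused.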

Anyway, RegEx is somewhat complex, but extremely powerful, and with some practice, it can actually be read and understood quickly.

Keep digging around in the Apache online docs - I learned about half of what I know about Apache and RegEx just from reading there and playing with it. Most of the rest, I picked up lurking here on wmw. :)

If you have access to httpd.conf, you can do server-wide configurations there, and won't need to bother with individual .htaccess files in your site-level directories, except where you specifically want one site to be handled differently.

Those "copies" of directories you mentioned are probably Unix "links" where a file much like a Windoze "Shortcut" is used to "link" to a file in another directory. They can be handy if used properly.

HTH,

Jim

jdMorgan

3:48 am on May 19, 2002 (gmt 0)




Sorry - as if the above wasn't long enough already, I forgot to mention that all 11 of the user agents in frontpage's "ban list" above that start with "Web" can be banned with this one mod_rewrite directive:
RewriteCond %{HTTP_USER_AGENT} ^Web(\ Image|\ Sucker|Auto|bandit|Fetch|site|ZIP|.*er) [NC,OR]

In addition to the RegExp stuff already mentioned above, this uses the "or" function, using the character "|" with the terms to be "or'ed" enclosed inside parentheses. The [NC,OR] at the end is peculiar to the RewriteCond directive, and means "No case match required", and, "OR this rewrite condition with the following rewrite condition" (not shown).

That was really my main point - to say that you don't have to have a line in your .htaccess file for *each* bot to be banned. A little "compression" can be achieved using regular expressions, and this may save a little filespace or CPU time.
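
For servers without mod_rewrite, the same compression might look roughly like this with SetEnvIf. This is a sketch only: it assumes the quoted-pattern technique and the getout variable from earlier in the thread, and uses SetEnvIfNoCase in place of the [NC] flag.

```apache
# Case-insensitive match on any User-Agent that begins with "Web" followed
# by one of the listed alternatives. ".*er" is deliberately broad: it covers
# WebCopier, WebReaper and WebStripper, but also any other agent that starts
# with "Web" and contains "er" later in the string.
SetEnvIfNoCase User-Agent "^Web( Image| Sucker|Auto|Bandit|Fetch|site|ZIP|.*er)" getout

<Limit GET POST>
Order Allow,Deny
Allow from all
Deny from env=getout
</Limit>
```

It is worth checking your own logs before relying on a pattern this wide, in case it catches an agent you want to keep.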

"BTW I noticed that in a htaccess password protected directory area the "serve index.html" command had stopped and I was being served directory lists... why would that be, works in all the normal directories? "

Try adding:

Options -Indexes

up near the top of the .htaccess file in that directory. Seems like adding the .htaccess file to require passwords must have overridden the default. Options -Indexes is documented in Apache's Core Features document, under "Options."
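
A small sketch of what that might look like, with one addition that is my own assumption rather than something confirmed in this thread: if the goal is to get index.html served again, not just to hide the listing, a DirectoryIndex line may be needed too. Both directives only work in .htaccess if the server's AllowOverride setting permits them.

```apache
# Turn off automatic directory listings for this directory.
Options -Indexes

# Serve index.html when the directory itself is requested
# (assumed cause: the default index setting is not being applied here).
DirectoryIndex index.html
```

With listings off and no index file found, the server should answer with 403 Forbidden instead of a file list.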

Somebody else better post now - I think I've used up my WebmasterWorld allocation for the week!

Jim

Mark_A

7:32 am on May 19, 2002 (gmt 0)




Jim many thanks for your two responses, this is just the sort of thing I am after, you have responded in english rather than machine code which is great :-)

The reason I find the Apache site a little difficult is I think they take for granted that one already knows the file and command hierarchy, punctuation, syntax and individual wildcard characters, which obviously I don't yet.

Thanks again.

frontpage

9:18 pm on Aug 10, 2002 (gmt 0)




Here is my .htaccess file update.
Please note this is for those whose servers are Mod_Rewrite impaired.

setenvif Referer ^http://www.iaea.org getout
setenvif User-Agent ^attach getout
setenvif User-Agent ^Au2Email getout
setenvif User-Agent "^Advanced Email Extractor" getout
setenvif User-Agent ^BackWeb getout
setenvif User-Agent ^Bandit getout
setenvif User-Agent ^BatchFTP getout
setenvif User-Agent ^BaySpider getout
setenvif User-Agent ^BlackWidow getout
setenvIf User-Agent ^Bot.mailto:craftbot@yahoo\.com getout
setenvif User-Agent ^Buddy getout
setenvif User-Agent ^ChinaClaw getout
setenvif User-Agent ^Collector getout
setenvif User-Agent ^Copier getout
setenvif User-Agent ^Crescent getout
setenvif User-Agent ^curl getout
setenvif User-Agent ^DA getout
setenvif User-Agent "^DISCo Pump" getout
setenvif User-Agent "^Download Demon" getout
setenvif User-Agent "^Download Wonder" getout
setenvif User-Agent ^Downloader getout
setenvif User-Agent ^Drip getout
setenvif User-Agent ^eCatch getout
setenvif User-Agent ^e-collector getout
setenvif User-Agent ^EirGrabber getout
setenvif User-Agent ^EmailCollect getout
setenvif User-Agent ^EmailHarvest getout
setenvif User-Agent ^EmailMagnet getout
setenvif User-Agent ^EmailReaper getout
setenvif User-Agent ^EmailSiphon getout
setenvif User-Agent "^Email Spider" getout
setenvif User-Agent "^EmailWolf" getout
setenvif User-Agent "^Express WebPictures" getout
setenvif User-Agent ^ExtractorPro getout
setenvif User-Agent ^EyeNetIE getout
setenvif User-Agent ^FileHound getout
setenvif User-Agent ^Floodgate getout
setenvif User-Agent ^FlashGet getout
setenvif User-Agent ^****ybot getout
setenvif User-Agent ^GetRight getout
setenvif User-Agent ^GetSmart getout
setenvif User-Agent ^Go!Zilla getout
setenvif User-Agent ^Go-Ahead-Got-It getout
setenvif User-Agent ^gotit getout
setenvif User-Agent ^Grabber getout
setenvif User-Agent ^GrabNet getout
setenvif User-Agent ^Grafula getout
setenvif User-Agent ^HMView getout
setenvif User-Agent ^HTTrack getout
setenvif User-Agent ^InterGET getout
setenvif User-Agent "^Internet Ninja" getout
setenvif User-Agent ^Iria getout
setenvif User-Agent ^JetCar getout
setenvif User-Agent ^JOC getout
setenvif User-Agent ^JustView getout
setenvif User-Agent "^Kontiki Client" getout
setenvif User-Agent ^larbin getout
setenvif User-Agent ^Linkidator getout
setenvif User-Agent ^LeechFTP getout
setenvif User-Agent ^lftp getout
setenvif User-Agent ^likse getout
setenvif User-Agent ^Magnet getout
setenvif User-Agent ^Mag-Net getout
setenvif User-Agent "^Mail Harvester" getout
setenvif User-Agent "^Mass Downloader" getout
setenvif User-Agent ^Memo getout
setenvif User-Agent "^MIDown tool" getout
setenvif User-Agent "^Microsoft URL Control" getout
setenvif User-Agent ^Mirror getout
setenvif User-Agent "^Mister PiX" getout
setenvif User-Agent ^Navroad getout
setenvif User-Agent ^NearSite getout
setenvif User-Agent ^NetAnts getout
setenvif User-Agent ^NetSpider getout
setenvif User-Agent "^Net Vampire" getout
setenvif User-Agent ^NetZip getout
setenvif User-Agent ^Ninja getout
setenvif User-Agent ^Octopus getout
setenvif User-Agent "^Offline Explorer" getout
setenvif User-Agent ^PageGrabber getout
setenvif User-Agent "^Papa Foto" getout
setenvif User-Agent ^pcBrowser getout
setenvif User-Agent "^Pictures Grabber" getout
setenvif User-Agent ^Pockey getout
setenvif User-Agent ^psbot getout
setenvif User-Agent ^Pump getout
setenvif User-Agent ^RealDownload getout
setenvif User-Agent ^Reaper getout
setenvif User-Agent ^Recorder getout
setenvif User-Agent ^ReGet getout
setenvif User-Agent "^Road Runner: ImageScape Robot" getout
setenvif User-Agent ^Siphon getout
setenvif User-Agent ^SiteSnagger getout
setenvif User-Agent ^SlySearch getout
setenvif User-Agent ^SmartDownload getout
setenvif User-Agent ^Snake getout
setenvif User-Agent ^Stripper getout
setenvif User-Agent ^Sucker getout
setenvif User-Agent ^SuperBot getout
setenvif User-Agent ^SuperHTTP getout
setenvif User-Agent ^Surfbot getout
setenvif User-Agent "^Sqworm/2.9.85-BETA" getout
setenvif User-Agent ^tAkeOut getout
setenvif User-Agent ^Tcl_http_client_package getout
setenvif User-Agent "^Teleport Pro" getout
setenvif User-Agent ^Telesoft getout
setenvif User-Agent ^TurnitinBot getout
setenvif User-Agent ^URLBlaze getout
setenvif User-Agent ^Vacuum getout
setenvif User-Agent ^VobSub getout
setenvif User-Agent ^VoidEYE getout
setenvif User-Agent "^Web Image Collector" getout
setenvif User-Agent "^Web Sucker" getout
setenvif User-Agent ^WebAuto getout
setenvif User-Agent ^WebBandit getout
setenvif User-Agent ^WebCopier getout
setenvif User-Agent "^Web Downloader" getout
setenvif User-Agent ^WebEMailExtrac getout
setenvif User-Agent ^WebFetch getout
setenvif User-Agent ^WebMole getout
setenvif User-Agent ^WebMiner getout
setenvif User-Agent ^WebReaper getout
setenvif User-Agent ^WebSauger getout
setenvif User-Agent ^WebSnake getout
setenvif User-Agent ^Website getout
setenvif User-Agent ^Webster getout
setenvif User-Agent ^WebStripper getout
setenvif User-Agent ^WebWeasel getout
setenvif User-Agent ^WebWhacker getout
setenvif User-Agent ^WebZIP getout
setenvif User-Agent ^Wget getout
setenvif User-Agent ^Whacker getout
setenvif User-Agent ^Widow getout
setenvif User-Agent ^wysiwyg getout
setenvif User-Agent ^Xaldon getout
setenvif User-Agent ^Zeus getout
<Limit GET POST>
Order Allow,Deny
Allow from all
Deny from env=getout
</Limit>

It has been pretty effective. Please let me know if you have any suggestions! Cheers!

wilderness

2:39 am on Aug 11, 2002 (gmt 0)




<snip>setenvif User-Agent ^EmailCollect getout
setenvif User-Agent ^EmailHarvest getout
setenvif User-Agent ^EmailMagnet getout
setenvif User-Agent ^EmailReaper getout
setenvif User-Agent ^EmailSiphon getout
setenvif User-Agent "^Email Spider" getout
setenvif User-Agent "^EmailWolf" getout>

frontpage,
you can eliminate these duplications by using:
setenvif User-Agent ^Email getout
which should catch them all.

I don't see anything for Indy Library?
setenvif User-Agent Library$ getout

frontpage

1:14 pm on Aug 11, 2002 (gmt 0)




Thanks for the Indy library one.

One I forgot and got hit with last night 21,000 times was SpaceBison.

Add that to the list too..