A Close to perfect .htaccess ban list

   
3:30 am on Oct 23, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Here's the latest rendition of my favorite ongoing artwork... my beloved .htaccess file. I've become quite fond of my little buddy, the .htaccess file, and I love the power it gives me to exclude vermin, pestoids and undesirable entities from my web sites.

Gorufu, littleman, Air, SugarKane? You guys see any errors or better ways to do this? Anybody got a bot to add before I stick this in every site I manage?

Feel free to use this on your own site and start blocking bots too.

(the top part is left out)

<Files .htaccess>
deny from all
</Files>
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkWalker [OR]
RewriteCond %{HTTP_USER_AGENT} ^sitecheck.internetseer.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^ia_archiver [OR]
RewriteCond %{HTTP_USER_AGENT} ^DIIbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^psbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector
RewriteRule ^.* - [F]
RewriteCond %{HTTP_REFERER} ^http://www.iaea.org$
RewriteRule !^http://[^/.]\.your-site.com.* - [F]

3:51 am on Oct 23, 2001 (gmt 0)

10+ Year Member



Nice! Thanks for sharing that really cool info toolman. I can't spot any other bots at the moment.

Sticky

7:24 pm on Oct 23, 2001 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Very nice TM. How much speed difference can you notice on each page view?
8:14 pm on Oct 23, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>>How much speed difference can you notice on each page view.

Couldn't say I notice any at all. The part above this could determine that, though... if I run everything through the PHP parser I expect a hit. Usually I run AddHandlers for SSI and have never noticed a slowdown.

BTW I pieced this together from snippets others posted here on the board.

8:26 pm on Oct 23, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Another one might be

RewriteCond %{HTTP_USER_AGENT} .*almaden.* [OR]
8:51 pm on Oct 23, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I use .htaccess to remap third level domains to various directories based on HTTP_HOST. What happens if two RewriteConds apply to two separate rewrite rules (i.e., if I place some of these blocking lines above my third level domain remaps in my .htaccess file)?
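
For what it's worth, a RewriteCond attaches only to the RewriteRule that immediately follows it, so blocking lines placed above a host remap should not leak into the remap. A minimal sketch, assuming a hypothetical third level domain sub.example.com mapped to a /sub/ directory (both names are placeholders, not taken from this thread):

```apache
RewriteEngine On

# Block list: these conditions belong only to the RewriteRule directly below.
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget
RewriteRule .* - [F]

# Independent remap with its own conditions (hypothetical host and directory).
# The second condition keeps already-remapped requests from looping.
RewriteCond %{HTTP_HOST} ^sub\.example\.com$ [NC]
RewriteCond %{REQUEST_URI} !^/sub/
RewriteRule ^(.*)$ /sub/$1 [L]
```

Because [F] ends processing for a blocked request, the remap is never reached for those agents; everyone else falls through to it.
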
11:21 am on Oct 24, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi Toolman, nice compilation of nasty bots! Have you tried putting the rewrite rules in httpd.conf? They would run fastest there, although you noted that there was no noticeable speed difference as it is.

Thanks again for sharing it with us!
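
For anyone who does control the server config, the httpd.conf variant might look roughly like the sketch below. The path /var/www/html is an assumed document root, and the two agents are just a sample from the list above. The speed argument is that httpd.conf is parsed once at startup, while a .htaccess file is looked up and read on every request.

```apache
# httpd.conf (main server config); the path is an assumed document root
<Directory "/var/www/html">
    # Stop Apache from looking for .htaccess files in this tree
    AllowOverride None

    # Per-directory rewriting needs symlink following enabled
    Options +FollowSymLinks

    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Wget
    RewriteRule .* - [F]
</Directory>
```

Note that AllowOverride None disables every .htaccess file under that directory, so anything else kept in those files would have to move into httpd.conf as well.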

2:37 pm on Oct 24, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I found another UA for InternetSeer

RewriteCond %{HTTP_USER_AGENT} ^InternetSeer.com [OR]

Not sure what the difference is, but this is the one that comes by every fifteen minutes as my competition tries to fool me into thinking I have more traffic than I do. Now it's easily filtered with a 403.

Long live mod_rewrite :)

3:24 pm on Oct 24, 2001 (gmt 0)

10+ Year Member



toolman- have you been looking over my shoulder at 2 am? I thought *I* had some kind of unhealthy fixation with .htaccess. Guess not. And it may even be healthy, after all.

I've been going back and forth from a kind of banbot.cgi that reads a banned.txt file, to just drawing a line in the sand and doing the full-on mod_rewrite at the top level to initiate a trickle down effect on the sub domains I host.

What I've been toying with is a combination of my banned.txt file automatically updating my .htaccess file - using grep to insert/add/delete lines depending on what is in banned.txt. It's pretty easy to update my banned.txt file either by hand or with a little interface program I wrote - but I'm 'grappling with grep' to insert my lines in the correct place in the .htaccess file. I'm in the dark with grep. Grep vexes me. Grep makes my stomach hurt.

Has anyone else considered this, or is it too much work? I thought it would give me some flexibility, and kill two birds with one stone. In fact, at 2 am I think it's a brilliant idea. Then again, I don't get out much.

6:04 am on Oct 30, 2001 (gmt 0)

10+ Year Member



toolman, mind translating that for those of us who are mod_rewrite impaired?
10:09 pm on Jan 8, 2002 (gmt 0)

WebmasterWorld Senior Member mivox is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Dredging this thread out of the depths of time.

Could someone please translate this line:

RewriteRule !^http://[^/.]\.your-site.com.* - [F]

Just wondering exactly what's happening there....

12:46 am on Jan 9, 2002 (gmt 0)

10+ Year Member



RewriteRule !^http://[^/.]\.your-site.com.* - [F]

is shorthand for "Get the hell out and don't come back 'cuz you aren't viewing a darned thing from this (my domain) today and as far as I'm concerned you get the big 'F' meaning - I (my domain) does not exist to you."

At least, that's my understanding. Apache has all that neat stuff posted. I forget most of it - always have to refer back.

"I'm not a smart man, Jenny" - Forrest Gump aka idiotgirl
<added>not a sig - just how I feel today,</added>

1:43 am on Jan 9, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



mivox, that's blocking that screen scraper from iaea.org
5:57 pm on Jan 9, 2002 (gmt 0)

10+ Year Member



Thank you Toolman for the list.

I have added these to my htaccess which I have never really fooled around with before. Having now added these, can you tell me what I can expect?

Will the effect be a "lack of" data? Meaning, if these bots are excluded, will I get (a) smaller logs, (b) fewer email harvesters, leading to less junk email, and (c) less load on the server? Have I got its benefits right?

6:07 pm on Jan 9, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You can expect a slight performance hit on your server...nothing major.

I really don't worry too much about email harvesters as I don't put email addresses on my site. The ones that irritate me are the site rippers. This is the latest version.

I know it could be shortened so if you're a unix geek please quit snickering and help us on the regex stuff. Thanks for your support ;)

RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
RewriteCond %{HTTP_USER_AGENT} ^MSFrontPage [OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*Indy [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^sitecheck.internetseer.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^InternetSeer.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^Ping [OR]
RewriteCond %{HTTP_USER_AGENT} ^Link [OR]
RewriteCond %{HTTP_USER_AGENT} ^ia_archiver [OR]
RewriteCond %{HTTP_USER_AGENT} ^DIIbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^psbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector
RewriteRule ^.* - [F]
RewriteCond %{HTTP_REFERER} ^http://www.iaea.org$
RewriteRule !^http://[^/.]\.yo-do-main.net.* - [F]
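
On the shortening question: one option is to fold several agents into a single condition with regex alternation. A rough sketch covering only part of the list above (the grouping is illustrative, and the dots are escaped so they match literally):

```apache
RewriteEngine on
# One condition per group instead of one per agent; (a|b|c) means "a or b or c"
RewriteCond %{HTTP_USER_AGENT} ^(MSFrontPage|[Ww]eb[Bb]andit|Wget|Ping|Link) [OR]
RewriteCond %{HTTP_USER_AGENT} ^(sitecheck\.internetseer\.com|InternetSeer\.com) [OR]
# The last condition in the chain carries no [OR]
RewriteCond %{HTTP_USER_AGENT} ^(ia_archiver|DIIbot|psbot|EmailCollector)
RewriteRule ^.* - [F]
```

Fewer lines, though each one gets harder to read and edit, so a handful of names per line is probably a reasonable compromise.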

6:22 pm on Jan 9, 2002 (gmt 0)

WebmasterWorld Senior Member rcjordan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



>expect

I installed TM's htaccess about 2 months ago, along with a trial run of a script to email me when one of these tripped an error code. Luckily, I decided to run it on a single site rather than 40 of them. I was deluged by error notifications; I had to repoint it to an error form to save my inbox. Expect to be surprised.

BTW, I now have it on all sites and server performance does seem to be slightly improved.

6:58 pm on Jan 9, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



RewriteRule !^http://[^/.]\.your-site.com.* - [F]

  • ! If the requested URL is NOT of the following form:

    1. ^ directly at the beginning of the string
    2. http:// this string literally
    3. [^/.] one character that is not a slash or a dot (probably meant to read [^/.]+ for "one or more of those")
    4. \. a literal dot (escaped)
    5. your-site.com this string literally (almost, as the unescaped dot will match any arbitrary character)
    6. .* any trailing characters (or none)

  • - don't rewrite the URL
  • [F] return a "403 forbidden" to the client

This means that the rule would theoretically be applied to all requests that ask your server for a page from a different domain than "your-site.com", given that they show the www.iaea.org referrer. In other words, the pattern probably doesn't do what its author had in mind.

Reality, however, is slightly different. ;) The string passed to the RewriteRule only contains the path component of the URL without the hostname. This is the reason why the technically pointless pattern still gives the desired result and simply denies any request where the RewriteCond matches. The rule will by definition never see a string that starts with "http://", only the URL path (beginning with "/" in httpd.conf, or with that leading slash stripped when the rule lives in a .htaccess file).

If in doubt, I'd simply lump the RewriteCond for iaea together with the others in the upper list and get rid of the second RewriteRule. The "^.*" of the first RewriteRule achieves the same result in a much simpler way, by saying "apply this rule to URLs that contain any sequence of characters, or none".
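
A minimal sketch of that consolidation, with the user-agent list cut down to three entries for brevity and the dots in the referrer escaped so they match literally:

```apache
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [OR]
# The referrer check joins the same OR chain; the last condition carries no [OR]
RewriteCond %{HTTP_REFERER} ^http://www\.iaea\.org$
# A single rule now handles every condition above
RewriteRule ^.* - [F]
```

Any request matching one of the conditions gets the 403, and the second RewriteRule is no longer needed.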

12:00 am on Jan 12, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



toolman
i found a few new UAs in my logs for the last couple of months. don't know much about them but you might like to keep an eye on them in case they are pests. i've posted the list in the spider identification forum at [webmasterworld.com...]
11:00 am on Mar 5, 2002 (gmt 0)

10+ Year Member



Hi all, this is my first post, and it is a question...

I still don't get it. Do I have to replace "your-site.com" and/or "http://www.iaea.org" with my actual URL or do I leave this as it is?

This is a snippet of the code:
RewriteCond %{HTTP_REFERER} ^http://www.iaea.org$
RewriteRule !^http://[^/.]\.your-site.com.* - [F]

I hope I will be able to deliver some solutions to other topics in return soon, as I am mostly a designer and quite good at X/HTML and CSS, rather than at programming and server technologies.

So I'd be happy if anyone could blow away the fog

9:21 am on Mar 6, 2002 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Leave it as is. That iaea referrer is part of some abusive bot that we've all banned; it uses iaea as a referrer. You will find it coming in from all kinds of IPs in Southeast Asia - easiest to ban the referrer.
2:52 pm on Mar 6, 2002 (gmt 0)

10+ Year Member



RewriteRule !^http://[^/.]\.your-site.com.* - [F]

Thanx for the info. Still I am not quite sure about the quoted line above. Do I leave this also as it is or do I replace "your-site.com" with my actual URL? Sorry if it sounds like I'm stupid...

2:55 pm on Mar 6, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



your-site.com is replaced by your domain name.

Welcome to WmW

2:57 pm on Mar 6, 2002 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Yep, I see - that was a 2 parter. Thanks Gethan.
3:32 pm on Mar 6, 2002 (gmt 0)

10+ Year Member



Okay, got that. Thanx for the welcome. I do feel better now as I have my first of what I hope to be more useful posts in the "Browsers, HTML, and Web Page Design"-forum.
5:27 am on Mar 19, 2002 (gmt 0)

5+ Year Member


I see most people send their bots to a 403 error page, but since I am more concerned with the so called "Offline Browsers" or "Sitegrabbers," I redirect them to a gay site that used to SPAM me ...

Here is my current .htaccess file with all the Offline Browsers I've come across so far ...

[small]RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^attach [OR]
RewriteCond %{HTTP_USER_AGENT} ^BackWeb [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bandit [OR]
RewriteCond %{HTTP_USER_AGENT} ^BatchFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Buddy [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Copier [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\mailto:craftbot@yahoo.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^DA [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo\Pump [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\Wonder [OR]
RewriteCond %{HTTP_USER_AGENT} ^Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^Drip [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FileHound [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetSmart [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^gotit [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} ^HTTrack [OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^Iria [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC [OR]
RewriteCond %{HTTP_USER_AGENT} ^JustView [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^lftp [OR]
RewriteCond %{HTTP_USER_AGENT} ^likse [OR]
RewriteCond %{HTTP_USER_AGENT} ^Magnet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mag-Net [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^Memo [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mirror [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZip [OR]
RewriteCond %{HTTP_USER_AGENT} ^Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^Pockey [OR]
RewriteCond %{HTTP_USER_AGENT} ^Pump [OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^Reaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Recorder [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Siphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^Snake [OR]
RewriteCond %{HTTP_USER_AGENT} ^SpaceBison [OR]
RewriteCond %{HTTP_USER_AGENT} ^Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport [OR]
RewriteCond %{HTTP_USER_AGENT} ^Vacuum [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\Image\Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website [OR]
RewriteCond %{HTTP_USER_AGENT} ^Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Whacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon
RewriteRule /*$ http://www.yourdomain.com [L,R][/small]

The RewriteRule line can be changed to send the bot to any site you want ...

Feel free to copy it or give me suggestions if there is anything I need to add or remove ...

5:32 am on Mar 19, 2002 (gmt 0)

WebmasterWorld Administrator brotherhood_of_lan is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



OK I see you are all keen on your HT Access files

I sooooo want to copy and paste what you have all written so far......

"where" does a .htaccess file GO? I don't run my own server, but wouldn't mind getting up to scratch. I recognise some of those user agents from my stats.

7:17 am on Mar 19, 2002 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



>"where" does a .htaccess file GO?

The file is created as a text file. Name it htaccess.txt and upload it to your root directory. Then use your FTP client to rename it .htaccess (notice it starts with a dot).

9:14 am on Mar 19, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



brotherhood_of_LAN - you will need to be running Apache with mod_rewrite enabled to benefit from any of this code. HTH
4:27 pm on Mar 19, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I highly recommend that anyone trying to implement something like this read up on regular expressions and study the mod_rewrite documentation very carefully. I see lots of advice and many examples here that will never work, often repeated even after having been corrected. Neither topic is trivial, and putting up incorrect rewriting rules may do your site more harm than putting up correct ones will help it. One thing to remember is not to confuse regular expressions with shell wildcards. Those two things work very differently, even if they serve similar purposes.

Just a few examples:

RewriteCond %{HTTP_USER_AGENT} ^Offline\Explorer [OR]

The "\E" sequence is meaningless. What was probably meant is "\ E", with a space between the backslash and the E. The format of the RewriteCond entries is whitespace delimited. This means that if your pattern includes any whitespace, you need to escape it. The sequence "\ " (backslash-space) does exactly this, and avoids the normal "end of pattern" meaning of the whitespace.

RewriteCond %{HTTP_USER_AGENT} ^Siphon [OR]

The "^" always matches the beginning of the search string. So this rule matches any UA that starts with "Siphon...". However, the real UA that you want to catch here starts with "EmailSiphon...", which will not get caught with the above pattern. In short, if you want to match a substring out of the middle of the UA string, don't use the "^".

RewriteCond %{HTTP_USER_AGENT} ^NetZip [OR]

The UA that you want to catch here is "NetZIP" (or at least you also want to catch that one). However, in the normal case, the RewriteCond pattern will perform a case-sensitive match. If you want to get case-insensitive matches, use the NoCase flag: [NC,OR] instead of just [OR]

RewriteRule /*$ [yourdomain.com...] [L,R]

The "/*" sequence has the meaning of "zero or more slashes". Is this really what was intended? The correct pattern for this situation has been outlined several times in this thread.
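
Pulling the four points together, a corrected fragment might look like the sketch below. It uses only a few agents from the list, and www.example.com stands in for whatever redirect target you choose (a placeholder, not taken from the posts above):

```apache
RewriteEngine On
# Space inside the pattern escaped with a backslash
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
# No leading ^, so "Siphon" matches anywhere in the UA, including "EmailSiphon"
RewriteCond %{HTTP_USER_AGENT} Siphon [OR]
# NC makes the match case-insensitive: NetZip, NetZIP, netzip
RewriteCond %{HTTP_USER_AGENT} ^NetZip [NC,OR]
# The last condition in the chain carries no [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon
# "^.*" matches any requested path; R sends an external redirect, L stops processing
RewriteRule ^.* http://www.example.com/ [R,L]
```

If the target lives on the same site, the redirected request would hit the same rule again, so point it at a different host or add an exclusion for the target page.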

And finally, I'd like once again to emphasize the most important advice that I can give in this context: Don't use any rewrite rules on your site that you don't understand yourself in all their consequences. Mod_rewrite is a very powerful tool, but also a very dangerous one.

9:54 pm on Mar 19, 2002 (gmt 0)

5+ Year Member



Bird,

Nice catch on the \space ... you are right of course. I converted it from having a . in place of the whitespace and left out the trailing space after the \

As far as the rewrite ... this format works perfectly. It redirects the Offline Browser to the new page every time ... maybe it is technically incorrect, but it does what it is intended to.

The /* is used in every RewriteRule I've ever seen when redirecting to another site, so it must be there for a reason? ... the $ is sometimes left out. I don't know the technicalities of it all, but I know it works.

Perhaps the confusion is in the fact that the page I was redirecting to was changed by the moderator ... it should not be "www.yoursite.com", it should be "www.site-you-are-sending-the-bot-to.com"

I've tested this .htaccess with many of the Offline Browsers on the list. For the ones I did not download and install, I tested the UA name in Teleport Pro's Agent spoofer field .. It blocks and redirects as advertised.

Obviously the list of UAs can be modified/replaced with whatever agents you wish ... these are just the ones that have shown up in my logs.

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^attach [OR]
RewriteCond %{HTTP_USER_AGENT} ^BackWeb [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bandit [OR]
RewriteCond %{HTTP_USER_AGENT} ^BatchFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^Buddy [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Copier [OR]
RewriteCond %{HTTP_USER_AGENT} ^DA [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo\ Pump [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Wonder [OR]
RewriteCond %{HTTP_USER_AGENT} ^Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^Drip [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FileHound [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetSmart [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^gotit [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} ^HTTrack [OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^Iria [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC [OR]
RewriteCond %{HTTP_USER_AGENT} ^JustView [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^lftp [OR]
RewriteCond %{HTTP_USER_AGENT} ^likse [OR]
RewriteCond %{HTTP_USER_AGENT} ^Magnet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mag-Net [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^Memo [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mirror [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZip [OR]
RewriteCond %{HTTP_USER_AGENT} ^Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^Pockey [OR]
RewriteCond %{HTTP_USER_AGENT} ^Pump [OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^Reaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Recorder [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Siphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^Snake [OR]
RewriteCond %{HTTP_USER_AGENT} ^SpaceBison [OR]
RewriteCond %{HTTP_USER_AGENT} ^Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^Vacuum [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website [OR]
RewriteCond %{HTTP_USER_AGENT} ^Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Whacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon
RewriteRule /*$ [site-you-are-sending-the-bot-to.com...] [L,R]

This 243 message thread spans 9 pages.
 
