Bot banning: is there a simple way?

maybe a short easy way

         

ogletree

7:30 am on Jul 14, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have read a bunch of the "close to perfect .htaccess" threads and they are way over my head. I don't want to hire anybody, and I don't want to paste that thing into my .htaccess file without knowing what it does. I would just like to ban the most egregious offenders. The other day I saw that WebStripper thing clobber my server. Could somebody come up with something that we non-techie people could paste into our .htaccess files to stop the most well-known, easy-to-spot bots by user agent? I know that won't stop a lot of them, but I just want to stop people who don't know what they are doing. Kind of like The Club: it does not stop pros, but it keeps amateurs from trying.

All that cool stuff those guys are doing is neat, but it is just not worth the time and expense for me to figure out or to hire somebody to do.

jdMorgan

3:07 pm on Jul 14, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> I don't want to paste that thing in my .htaccess file without knowing what it does.

That is a wise approach.

> Could somebody come up with something us non techie people could use to just paste in our .htaccess files that just stops the most well known easy to spot bots by user agent?

The problem is not one of techie vs. non-techie. It's one of identifying "bad bots" and "the most egregious offenders" for your site(s). My list of the worst offenders may be ineffective on your site, because your site attracts a different mix of bots.

The first step is to identify *your* problem bots, and then use the code snippets in that thread as examples to build your own code. Reviewing your stats should give you a good idea of which bots are the most trouble for you. Maybe a small derivative example will be more helpful than the huge lists in that thread:


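# mod_rewrite in .htaccess requires FollowSymLinks to be enabled
Options +FollowSymLinks
RewriteEngine on
# Match "Indy", any one character, then "Library" (NC = ignore case) ...
RewriteCond %{HTTP_USER_AGENT} Indy.Library [NC,OR]
# ... or "RPT", any one character, then "HTTPClient/" anywhere in the string ...
RewriteCond %{HTTP_USER_AGENT} RPT.HTTPClient/ [OR]
# ... or a user-agent that starts with "WebStripper/"
RewriteCond %{HTTP_USER_AGENT} ^WebStripper/
# Answer every matching request with 403 Forbidden
RewriteRule .* - [F]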

If you use a custom 403 error page, change the RewriteRule to:

RewriteRule !^path_to_403_error_page\.html$ - [F]
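The reason for the exclusion: if the rule forbids every URL, the request for the 403 page itself is forbidden too, and the server cannot deliver it. A minimal sketch of how the pieces might fit together, assuming the .htaccess file sits in the site's document root and the error page lives there as well (path_to_403_error_page.html is only a placeholder for your own file name):

```apache
# Tell Apache which page to serve for 403 responses (placeholder path)
ErrorDocument 403 /path_to_403_error_page.html

RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^WebStripper/
# Forbid every URL except the error page itself. In .htaccess the
# pattern is matched without the leading slash.
RewriteRule !^path_to_403_error_page\.html$ - [F]
```

Note that the ErrorDocument path starts with a slash, while the RewriteRule pattern does not.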

There is no one-size-fits-all solution, and if the above code does not work or is not perfect for your sites, then you'll have to study up and fix it, or hire someone to do it. I would suggest the former approach as most cost-effective for small companies.

Apache mod_rewrite documentation [httpd.apache.org]
Apache URL Rewriting Guide [httpd.apache.org]
Regular Expressions Tutorial [etext.lib.virginia.edu]

Jim

ogletree

5:53 pm on Jul 14, 2004 (gmt 0)




So what am I saying when I put ^WebStripper/?

I have another one that says Web Downloader/6.1. Would I just put ^Web Downloader/?

Man, there are a bunch of these things. I run a successful AdSense site and I think people are trying to copy me.

jdMorgan

7:16 pm on Jul 14, 2004 (gmt 0)




Please see the links above. The regular expressions tutorial will be helpful.
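As a quick illustration of how those two patterns read, here is a minimal sketch. It assumes the user-agent really begins with "Web Downloader/" as quoted in the question, so check your own logs before relying on it. The one catch is the space: an unescaped space in a RewriteCond pattern splits it into two arguments, which typically produces a server error, so the space has to be escaped with a backslash (or the whole pattern put in quotes).

```apache
RewriteEngine on
# "^" anchors the match to the start of the user-agent string, so this
# matches any user-agent that begins with "WebStripper/".
RewriteCond %{HTTP_USER_AGENT} ^WebStripper/ [OR]
# Same idea for "Web Downloader/6.1". The space is escaped with a
# backslash, and leaving off the version number matches every version.
RewriteCond %{HTTP_USER_AGENT} ^Web\ Downloader/
# Answer every matching request with 403 Forbidden
RewriteRule .* - [F]
```

Every RewriteCond except the last one carries the [OR] flag. Without it the conditions are ANDed together, and a request would have to match all of them at once to be blocked.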