I found (in my logs) another web downloader... something more to add to that growing list of denied agents. It claims to be:
an advanced Web search agent offering the power to extract information off the Web.
In my logs, the UA came up as "netattache 112", and a simple search on GG will show you all you need to know! It seems to be a variant of Go!Zilla Plus with a different UA.
your ever-watchful log-spy,
dave
Yes, that would certainly work. I just prefer to block very selectively, and then widen the net only as required. If there are other "bad" User-agents which contain "attach" and if all agents which contain "attach" are bad, then your unanchored pattern is the ideal solution.
Since I was quoting only the pattern, I left off the RewriteCond statement and the [OR] and no-case flags.
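For reference, a selective block with the RewriteCond lines and flags put back in might look something like the following in .htaccess. This is only a sketch: the User-Agent strings are assumptions based on the names mentioned in this thread, so check them against your own logs before using it.

```apache
RewriteEngine On

# Selective: match only agents whose name starts with "netattache".
# [NC] makes the match case-insensitive; [OR] chains to the next condition.
RewriteCond %{HTTP_USER_AGENT} ^netattache [NC,OR]

# A second illustrative agent (assumed UA string, verify in your logs).
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [NC]

# Return 403 Forbidden for any request matching one of the conditions above.
RewriteRule .* - [F]

# Wider net (unanchored): blocks every agent containing "attach" anywhere.
# Use this in place of the first condition only if all such agents are unwanted.
# RewriteCond %{HTTP_USER_AGENT} attach [NC]
```

The anchored form only catches agents whose name begins with the given string, while the unanchored form trades precision for coverage.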
Jim
By the by, I've been meaning to mention to you... rather than .htaccess or even the RewriteRules, I have been using Apache::BlockAgent, and a version I modified into Apache::BlockIP. The advantages are great:
1) Only one, central file to update will cover all VH's
2) You can use a robot trap file (your program, I think!) and have it write to the file, so it catches them quickly and covers all domains
3) You do not have to restart your server
Downsides:
1) Have to have mod_perl running
2) Gotta know what you are doing (probably not a problem for you!)
3) You HAVE to remember to upload the files in ASCII or your server goes down!
Anyway, the BlockAgent files on the web all seem to have minor to major perl code problems.... if you want to know what I am doing in more depth, get the files, etc, Sticky me!
dave
Well, I'm flattered, but actually, I noticed the UA variants after doing a web search as you suggested, and examining one of the pages that came up in the results - The first one that was not somebody's open(!) server log had several variants of "Attache" listed in a very readable format.
You're ahead of me on the VH-wide blocking - I haven't yet made the move to a non-shared hosting environment, so I'm stuck with .htaccess for now. I've got almost all AllowOverrides, but they won't let me touch the server config stuff, and that's probably for the best... You can call me "Mr. Servercode 500"! ;)
So, now it looks like you're the expert on BlockIP for virtual hosting around here! That's one reason I like this place: Many people are expert at a few things, and a few people are expert at many things. I'm one of the former group, not the latter! But that's cool, because everyone has something to contribute, even if it's "just" a really, really good question from a new member.
BTW, someone here was looking for a blocking method that would work for all virtual hosts, and it wasn't that long ago. The conclusion of that thread was that there was no way to do it with rewrites. The original poster might be very interested in your script! If I can remember the context of that thread, I'll search for it and then go update it with a link to this one.
Holy cow! - Just passed 500 posts. I'd better start working harder!
Jim
That poster was me! And I figured it out- with a LOT of help... you, Andreas... etc.
You are right, that is what is great about this forum- you may not know, but someone does. Just trying to pay you back a bit!
Share the knowledge!
Thanks!
dave
[edited by: carfac at 12:31 am (utc) on Sep. 30, 2002]
have a nice day,
--jan
If I understand the problem correctly, this is indeed possible with Mod_Rewrite. The solution is to wrap the Mod_Rewrite directives in a global Apache <Directory /> statement, i.e. as follows (using a very fast binary RewriteMap):
RewriteEngine on
RewriteMap spider dbm:/home/bots
<Directory />
RewriteEngine on
RewriteBase /
RewriteCond ${spider:%{REMOTE_ADDR}} ^1$
RewriteRule ^.*$ - [F,L]
</Directory>
If there are individual Rewrite rules following on in certain Virtual Hosts and the global Rewrite rules should stay valid, it is crucial to add the statement "RewriteOptions inherit" to each of these Virtual Hosts, right after the "RewriteEngine On" declarations.
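As a rough sketch of that last point, a virtual host with its own rules might look like the following. The host name, document root, and the redirect rule are placeholders for illustration only, not taken from a real configuration.

```apache
<VirtualHost *:80>
    # Hypothetical host name and document root.
    ServerName www.example.com
    DocumentRoot /home/example/htdocs

    RewriteEngine On
    # Keep the globally defined rewrite configuration in effect for this host.
    RewriteOptions inherit

    # Host-specific rule (placeholder): permanently redirect one old page.
    RewriteRule ^/old\.html$ /new.html [R=301,L]
</VirtualHost>
```

Exactly how inherited rules are ordered relative to the host's own rules can differ between Apache versions, so it is worth testing with a known blocked address after making the change.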