Lemme guess... it's coming from an IP in Asia and it voraciously eats everything it can find on the server?
[webmasterworld.com...] This is the best answer to that problem that I know of.
toolman, does adding this to an .htaccess file block Indy Library -- that is, is the ^Mozilla.*Indy part correct?
SetEnvIf User-Agent ^Mozilla.*Indy keep_out
allow from all
deny from env=keep_out
korkus2000, below is explicitly the answer you don't want. It has worked for me so far, however. And I have never had any requests from these two IP address ranges that did not involve Indy Library (I watch my access logs for 403s very carefully on this account).
deny from 210.82.
deny from 211.101.
Both of these are Mozilla/3.0 (compatible; Indy Library) from Beijing, China
>>>toolman, does adding this to an .htaccess file block Indy Library -- that is, is the ^Mozilla.*Indy part correct?
Ahhh. You're asking the wrong dude. I just collected a bunch of UAs from other threads and stuck them together. Littleman or Air would be the ones to help on mod_rewrite questions.
This will ban that agent. No need for a Mozilla prefix. This will ban any agent that contains the phrase Indy Library, whether in upper case, lower case, or a mix (I like to be on the safe side).
SetEnvIfNoCase User-Agent "indy library" keep_out
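For context, here is a minimal sketch of how that line might sit in a complete .htaccess. It assumes the Order/Allow/Deny style of access control and that mod_setenvif is loaded. The Order line is an assumption and is not quoted anywhere in this thread; without it the default order is Deny,Allow, in which case "allow from all" would override the deny.

```apache
# Flag any request whose User-Agent contains "indy library" (case-insensitive).
SetEnvIfNoCase User-Agent "indy library" keep_out

# Assumption: evaluate Allow first, then Deny, so a matching Deny wins.
Order Allow,Deny
Allow from all
Deny from env=keep_out
```

With this in place, a flagged request should show up in the access log as a 403, as in the log lines quoted elsewhere in this thread.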
Thank you. The part about NoCase is especially useful to me.
(I never know whether to write "Thank you" notes. On the one hand, no one learns anything from reading them. And on the other hand, it looks as if you are unappreciative or ignoring the responder if you don't write one. So what is to be done?)
For the record: I wrote above that I had never received a request from IP address 210.82. that did not have Indy Library as the user agent. That is no longer true:
220.127.116.11 - - [13/Jul/2002:23:54:36 -0700] "GET /folder/filename.html HTTP/1.1" 403 302 "http://www.google.com/search?hl=iw&inlang=iw&ie=ISO-8859-8-I&q=searchword+searchword+searchword+searchword&btnG=%E7%E9%F4%E5%F9+%E1%E2%E5%E2%EC" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"
APNIC Whois says 18.104.22.168 - 22.214.171.124 is registered to the "beijing branch" of china-netcom.com
I forgot to remove " deny from 210.82. " from my .htaccess file when I added " SetEnvIfNoCase User-Agent "indy library" keep_out ", and now I've blocked someone through carelessness.
I've been using 210.82.124. for that Indy, which has been effective, at least in combination with the other Indy address.
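A sketch of the narrower block described above, assuming the same Order/Allow/Deny style of access control (the Order and Allow lines are assumptions, not quoted from anyone's file). The trailing-dot form is the one used in this thread; the CIDR line is an equivalent spelling, shown commented out.

```apache
Order Allow,Deny
Allow from all

# Partial address: covers 210.82.124.0 through 210.82.124.255 only,
# rather than everything under 210.82.
Deny from 210.82.124.

# Equivalent CIDR notation (use one form or the other):
# Deny from 210.82.124.0/24
```

The narrower the prefix, the fewer unrelated visitors on the same provider get caught by the deny.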
If it's any comfort?
We are after all only human!
The visitor was looking for the frequency of keywords, although they specified "searchword."
Keywords, with only a few exceptions, are ineffective with today's SEs, though that doesn't stop me from creating them from the content of each page.
It seems as though the Indy Library user monitors this group :-(
126.96.36.199 - - [14/Jul/2002:06:06:09 -0700] "GET / HTTP/1.0" 403 - "-" "Mozilla/3.0 (compatible; Indy Library)"
188.8.131.52 - - [14/Jul/2002:06:06:09 -0700] "GET / HTTP/1.0" 403 - "-" "Mozilla/3.0 (compatible; Indy Library)"
184.108.40.206 - - [14/Jul/2002:06:06:09 -0700] "GET / HTTP/1.0" 403 - "-" "Mozilla/3.0 (compatible; Indy Library)"
220.127.116.11 - - [14/Jul/2002:06:06:09 -0700] "GET / HTTP/1.0" 403 - "-" "Mozilla/3.0 (compatible; Indy Library)"
Thank you, wilderness.
Maybe I misled by substituting "searchword" for the actual words? It was a real search, of the form "map+of+north+dakota".
<quote> We are after all only human! </quote> Are you sure? Remember, this is the Internet. I might be a human or I might not be. :) But thank you for the thought!
I get Indy Library requests from addresses all around the world, so I would assume that it is pointless to block them by IP. You will only hurt countless innocent bystanders if you do this.
Since the address-harvesting tool using that library doesn't seem to be written with the ability to change the UA, blocking that looks like the preferable method.
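The same user-agent test can also be written with mod_rewrite, which came up earlier in the thread. This is a minimal sketch, assuming mod_rewrite is enabled and permitted in .htaccess; it is an alternative spelling, not what anyone above says they actually run.

```apache
RewriteEngine On

# Match "indy library" anywhere in the User-Agent, ignoring case.
RewriteCond %{HTTP_USER_AGENT} "indy library" [NC]

# Refuse every matching request with 403 Forbidden; "-" means no substitution.
RewriteRule .* - [F]
```

Either way the decision rests on the user agent alone, so no IP range has to be blocked.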
<snip>I might be a human or I might not be.</snip>
The computer industry for some decades has been trying to inject a "human personal instinct" into computers.
Although they have come a long way in analyzing situations, nothing replaces or even compares to the logic and feeling of another mind and heart.
Unless it's a similar weak mind or heart ;-)[TIC]
As I've made clear on more than one occasion, the methods I use are specific to my sites and should be determined by the market each website serves.
This past week I put up an equine sale in Michigan online. Yesterday a visitor from Yugoslavia wasted much unnecessary bandwidth by viewing every pedigree (family tree) for a Michigan sale. 403.
I did, however, confine my 403 to xxx.xxx.xxx., which somewhat limits the number of innocent visitors affected.