What is Indy library and can I get them to go away

   
5:38 pm on Jun 10, 2002 (gmt 0)

WebmasterWorld Senior Member korkus2000 is a WebmasterWorld Top Contributor of All Time 10+ Year Member



The user agent "Mozilla/3.0 (compatible; Indy Library)" hits all my servers all the time. What is it, and does it respect robots.txt? This thing is a bandwidth hog!

I found on another site that it is "Internet Direct Library for Borland (used as E-Mail collector)." So what can I do, as opposed to banning the IP?

[edited by: korkus2000 at 5:41 pm (utc) on June 10, 2002]

5:41 pm on Jun 10, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Lemme guess...it's coming from an IP in Asia and it voraciously eats everything it can find on the server?

[webmasterworld.com...] this is the best answer to that problem that I know of.

3:32 am on Jun 11, 2002 (gmt 0)

10+ Year Member



toolman, does adding this to an .htaccess file block Indy Library -- that is, is the ^Mozilla.*Indy part correct?

SetEnvIf User-Agent ^Mozilla.*Indy keep_out
order allow,deny
allow from all
deny from env=keep_out

korkus2000, below is explicitly the answer you don't want. It has worked for me so far, however. And I have never had any requests from these two IP ranges that did not involve Indy Library (I watch my access logs for 403s very carefully on this account).

deny from 210.82.
deny from 211.101.

Both of these are Mozilla/3.0 (compatible; Indy Library) from Beijing, China

5:53 am on Jun 13, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>>toolman, does adding this to an .htaccess file block Indy Library -- that is, is the ^Mozilla.*Indy part correct?

Ahhh. You're asking the wrong dude. I just collected a bunch of ua's from other threads and stuck them together. Littleman or Air would be the ones to help on mod_rewrite questions.

6:19 am on Jun 13, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



misosoph,

This will ban that agent. No need for a Mozilla prefix. This will ban any agent that contains the phrase Indy Library, in upper case, lower case, or any mix of the two (I like to be on the safe side).

SetEnvIfNoCase User-Agent "indy library" keep_out
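
For reference, here is a rough sketch of how that line might sit in a complete .htaccess, reusing the access-control lines quoted earlier in this thread. It assumes mod_setenvif is loaded and that the old-style Order/Allow/Deny directives are available (on Apache 2.4 they are provided by mod_access_compat). The variable name keep_out is simply the one used in this thread.

```apache
# Flag any request whose User-Agent contains "indy library", in any letter case.
SetEnvIfNoCase User-Agent "indy library" keep_out

# Allow everyone first, then refuse flagged requests (they receive a 403).
Order Allow,Deny
Allow from all
Deny from env=keep_out
```

Because the match is made on the User-Agent alone, this approach does not need any "deny from" IP lines for this agent.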

7:21 am on Jun 13, 2002 (gmt 0)

10+ Year Member



Thank you. The part about NoCase is especially useful to me.

(I never know whether to write "Thank you" notes. On the one hand, no one learns anything from reading them. And on the other hand, it looks as if you are unappreciative or ignoring the responder if you don't write one. So what is to be done?)

10:42 am on Jul 14, 2002 (gmt 0)

10+ Year Member



For the record: I wrote above that I had never received a request from IP address 210.82. that did not have Indy Library as the user agent. That is no longer true:

210.82.42.242 - - [13/Jul/2002:23:54:36 -0700] "GET /folder/filename.html HTTP/1.1" 403 302 "http://www.google.com/search?hl=iw&inlang=iw&ie=ISO-8859-8-I&q=searchword+searchword+searchword+searchword&btnG=%E7%E9%F4%E5%F9+%E1%E2%E5%E2%EC" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"

APNIC Whois says 210.82.0.0 - 210.82.127.255 is registered to the "beijing branch" of china-netcom.com

I forgot to remove " deny from 210.82. " from my .htaccess file when I added " SetEnvIfNoCase User-Agent "indy library" keep_out ", and now I've blocked someone through carelessness.

12:47 pm on Jul 14, 2002 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



misosoph,
I've been blocking 210.82.124. for that Indy, and it has been effective, at least between that and the other Indy #.

If it's any comfort?
We are after all only human!

The visitor was looking for the frequency of keywords, although they specified "searchword."
Keywords, with only a few exceptions, are ineffective with today's SEs, though that doesn't stop me from creating them from the content of each page.

1:20 pm on Jul 14, 2002 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



It seems as though the Indy Library user monitors this group :-(

210.82.124.87 - - [14/Jul/2002:06:06:09 -0700] "GET / HTTP/1.0" 403 - "-" "Mozilla/3.0 (compatible; Indy Library)"
210.82.124.85 - - [14/Jul/2002:06:06:09 -0700] "GET / HTTP/1.0" 403 - "-" "Mozilla/3.0 (compatible; Indy Library)"
210.82.124.86 - - [14/Jul/2002:06:06:09 -0700] "GET / HTTP/1.0" 403 - "-" "Mozilla/3.0 (compatible; Indy Library)"
211.101.236.91 - - [14/Jul/2002:06:06:09 -0700] "GET / HTTP/1.0" 403 - "-" "Mozilla/3.0 (compatible; Indy Library)"

1:54 pm on Jul 14, 2002 (gmt 0)

10+ Year Member



Thank you, wilderness.

Maybe I misled you by substituting "searchword" for the actual words? It was a real search, of the form "map+of+north+dakota".

<quote> We are after all only human! </quote> Are you sure? Remember, this is the Internet. I might be a human or I might not be. :) But thank you for the thought!

2:03 pm on Jul 14, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I get Indy Library requests from addresses all around the world, so I would assume that it is pointless to block them by IP. You will only hurt countless innocent bystanders if you do this.

Since the address harvesting tool using that library doesn't seem to be written with the ability to change the UA, blocking that looks like the preferable method.

2:37 pm on Jul 14, 2002 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



<quote> I might be a human or I might not be. </quote>

Misosoph,
The computer industry has been trying for some decades to inject a "human personal instinct" into computers.
Although they have come a long way in analyzing situations, nothing replaces or even compares to the logic and feeling of another mind and heart.
Unless it's a similarly weak mind or heart ;-)[TIC]

hey bird,
As I've made clear on more than one occasion, the methods I use are specific to my sites and should be determined by the market each website serves.
EX:
This past week I put an equine sale in Michigan online. Yesterday a visitor from Yugoslavia wasted a lot of unnecessary bandwidth by viewing every pedigree (family tree) for a Michigan sale. 403.
I did, however, confine my 403 to xxx.xxx.xxx., which somewhat limits the number of innocents affected.

 
