
Search Engine Spider and User Agent Identification Forum

    
Some more spiders
Shri




msg:397730
 6:44 am on Mar 13, 2000 (gmt 0)

Anyone want to shed some light on this beast called DIIBot?

Shows up from 216.233.51.149 which is identified as

Digital Integrity (NETBLK-RNCI-DIGITAL-INT)
2121 South El Camino Real
San Mateo, CA 94403
USA

Netname: RNCI-DIGITAL-INT
Netblock: 216.233.51.144 - 216.233.51.151

Coordinator:
Hostmaster, Rhythms NetConnections (RNH2-ARIN) hostmaster@RHYTHMS.NET
(303) 476-4200

It did a DEEEEEP crawl of my site (luckily I had a tail -f running) and has kept coming back for the last three or four days, even after I banned it using .htaccess.

 

Brett_Tabke




msg:397731
 2:35 pm on Mar 30, 2000 (gmt 0)

I don't recognize that one at all, Shri.

How about a quick tutorial on banning a particular host via .htaccess? I get that question quite often and have never given a pat reply.

Shri




msg:397732
 2:44 pm on Mar 30, 2000 (gmt 0)

Set this in your .htaccess

--- From the Apache Directives ----

deny directive
Syntax: deny from host host ...
Context: directory, .htaccess
Override: Limit
Status: Base
Module: mod_access

The deny directive affects which hosts can access a given directory. Host is one of the following:

all
all hosts are denied access
A (partial) domain-name
Hosts whose name is, or ends in, this string are denied access.
A full IP address
An IP address of a host denied access
A partial IP address
The first 1 to 3 bytes of an IP address, for subnet restriction.
A network/netmask pair (Apache 1.3 and later)
A network a.b.c.d, and a netmask w.x.y.z. For more fine-grained subnet restriction. (e.g., 10.1.0.0/255.255.0.0)
A network/nnn CIDR specification (Apache 1.3 and later)
Similar to the previous case, except the netmask consists of nnn high-order 1 bits. (e.g., 10.1.0.0/16 is the same as 10.1.0.0/255.255.0.0)
Example:

deny from 16
All hosts in the specified network are denied access.

Note that this compares whole components; bar.edu would not match foobar.edu.

See also allow and order.
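
As a rough sketch of how the directive above might be applied to the netblock quoted in the first post (216.233.51.144 - 216.233.51.151), the fragment below could go in an .htaccess file. It assumes Apache 1.3 with mod_access and that the server config permits it via AllowOverride Limit. The /29 is my own conversion of that eight-address range, not something stated in the thread.

```apache
# Sketch only: Apache 1.3 mod_access syntax, as quoted above.
# Requires "AllowOverride Limit" for this directory in the server config.
order allow,deny
allow from all

# 216.233.51.144 - 216.233.51.151 is eight addresses, i.e. a /29.
# The crawler's address 216.233.51.149 falls inside this range.
deny from 216.233.51.144/29

# Equivalent network/netmask form (use one or the other):
# deny from 216.233.51.144/255.255.255.248
```

With order allow,deny the allow lines are evaluated first and any matching deny then rejects the request. A denied crawler can still keep requesting pages, so its hits may continue to appear in the access log, typically with a 403 status.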

Syntax: deny from env=variablename
Context: directory, .htaccess
Override: Limit
Status: Base
Module: mod_access
Compatibility: Apache 1.2 and above

The deny from env directive controls access to a directory by the existence (or non-existence) of an environment variable.

Example:

BrowserMatch ^BadRobot/0.9 go_away
<Directory /docroot>
order allow,deny
allow from all
deny from env=go_away
</Directory>

In this case browsers with the user-agent string BadRobot/0.9 will be denied access, and all others will be allowed.
See also allow from env and order.
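
The same idea could be adapted to an .htaccess file for the crawler discussed in this thread. This is a sketch under two assumptions that the thread does not confirm: that the crawler's User-Agent header contains the string "DIIbot", and that the server version allows BrowserMatch directives in .htaccess (which also needs AllowOverride FileInfo in addition to Limit). The variable name go_away is simply reused from the documentation example.

```apache
# Assumption: the crawler identifies itself with "DIIbot" somewhere
# in its User-Agent header (case-insensitive match).
BrowserMatchNoCase DIIbot go_away

order allow,deny
allow from all
# Reject any request for which go_away was set above.
deny from env=go_away
```

Matching on the user-agent only helps while the crawler sends that string; combining it with an address-based deny covers the case where the header changes.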

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved