# Ay-Up
# UA "Fred/0.01-dev (Fred; [ay-up.com;...] fred@ay-up.com)"
69.57.157.54
# Nusearch
# UA "NuSearch Spider www.nusearch.com"
82.68.206.22
# Peerbot.com
# UA "PEERbot www.peerbot.com"
213.239.197.150
213.239.206.109
# Terrawiz
# UA "TerrawizBot/1.0 (+http://www.terrawiz.com/bot.html)"
24.6.176.192
# uk-searcher.co.uk
# UA "uk-Searcher(HTTP://WWW.UK-SEARCHER.CO.UK)"
81.27.96.248
I've had a problem for some time with numerous deep links to my favicon and referrals from IconSurf. I recently removed my favicon, and within a couple of days the IconSurf bot paid a visit: "http ://iconsurf.com/" "IconSurf/2.0 favicon monitor (see [iconsurf.com...]
I've had an Indy Library user catch a 403 and switch to another UA on the very next hit, with the same timestamp (to the second).
216.0.****.x (If anybody wants the full IP, sticky me.)
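For anyone who wants to look for the same pattern in their own logs, here is a minimal Python sketch. It assumes Apache combined log format with each entry on a single line, and an English locale for the month names. The function names and the two sample lines (which use documentation-only IP addresses) are made up for illustration; they are not taken from the log described above.

```python
import re
from datetime import datetime

# Combined log format: ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<ua>[^"]*)"'
)


def parse_line(line):
    """Return a dict for one log entry, or None if the line does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    hit = m.groupdict()
    # Timestamps have one-second resolution, e.g. 23/Aug/2004:10:08:28 -0700
    hit["time"] = datetime.strptime(hit["time"], "%d/%b/%Y:%H:%M:%S %z")
    return hit


def ua_switches(lines):
    """Yield (previous_hit, hit) pairs where an IP received a 403 and its
    next hit, logged in the same second, carried a different user agent."""
    last_by_ip = {}
    for line in lines:
        hit = parse_line(line)
        if hit is None:
            continue
        prev = last_by_ip.get(hit["ip"])
        if (prev is not None
                and prev["status"] == "403"
                and prev["ua"] != hit["ua"]
                and prev["time"] == hit["time"]):
            yield prev, hit
        last_by_ip[hit["ip"]] = hit


# Invented sample entries, for illustration only.
SAMPLE = [
    '192.0.2.10 - - [23/Aug/2004:10:08:28 -0700] "GET / HTTP/1.1" 403 210 "-" '
    '"Mozilla/3.0 (compatible; Indy Library)"',
    '192.0.2.10 - - [23/Aug/2004:10:08:28 -0700] "GET / HTTP/1.1" 200 9690 "-" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
]

for before, after in ua_switches(SAMPLE):
    print(before["ip"], "switched from", before["ua"], "to", after["ua"])
```

A same-second UA change right after a 403 is only a hint, not proof: several users behind one proxy can produce the same pattern.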
Regarding co-location servers: there are more and more of these crawling out of the cracks. Many facilities are mixing their services, offering normal internet service (dial-up, broadband, T1) alongside hosting and co-location. It seems a natural use of their computers, but it makes identification of new rogues more difficult.
The IP ranges fell under a Level 3 range which I had denied long ago. 64.152.xx.xx
This IAR article appears to be related
http ://www.clickz.com/news/article.php/3387971
The article and its concept may have good intent; to me, however, it remains yet another third-party use of websites.
I don't think he's specifically denying access to robots.txt.
Gary,
You are correct. I have the majority of RIPE denied access. (It should be noted that this is my personal choice, and not something I would want to influence others to follow.)
I was provided with a solution to allow reading of robots.txt when a range or UA is denied; however, for some reason the entry fails in my htaccess.
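The idea behind such an exception is just an ordering of checks: the robots.txt test comes first, and the deny lists apply to everything else. The Python sketch below illustrates that ordering only. It is not htaccess syntax and not the solution mentioned above, and the deny lists in it are hypothetical placeholders (a documentation-only IP range and one UA substring).

```python
import ipaddress

# Hypothetical deny lists, for illustration only.
DENIED_NETWORKS = [ipaddress.ip_network("198.51.100.0/24")]
DENIED_UA_SUBSTRINGS = ["Indy Library"]


def is_allowed(ip, user_agent, path):
    """Decide whether a request should be served."""
    # The exception is checked first: anyone may read robots.txt.
    if path == "/robots.txt":
        return True
    # Otherwise, denied IP ranges are refused...
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in DENIED_NETWORKS):
        return False
    # ...and so are denied user agents (case-insensitive substring match).
    ua = user_agent.lower()
    return not any(s.lower() in ua for s in DENIED_UA_SUBSTRINGS)


print(is_allowed("198.51.100.7", "ScSpider/0.2", "/robots.txt"))  # denied range, but robots.txt
print(is_allowed("198.51.100.7", "ScSpider/0.2", "/"))            # denied range, other path
```

If an equivalent rule fails in a real htaccess, one thing worth checking is whether a later rule re-denies the request after the exception has matched.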
My htaccess is quite extensive, with only a small number of redirects, and even though the courtesy of giving bots access to robots.txt is desirable, it is not a priority for my sites. Rather, my agenda is keeping the undesired bots and/or visitors out.
As has been stated many times, my preferences are quite overbearing and not applicable to the majority of websites.
Each webmaster must decide what is beneficial or detrimental to their own website.
63.200.38.186 - - [23/Aug/2004:10:08:28 -0700] "GET /robots.txt HTTP/1.1" 200 2599 "-" "ScSpider/0.2"
I haven't had anybody from Pac Bell bothering me in quite a while.
Guess they love me again ;)
24.248.168.184 - - [28/Aug/2004:11:28:14 -0700] "GET / HTTP/1.1" 200 9690 "www.av.com" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
Main page
66.139.77.92 - - [28/Aug/2004:22:00:02 -0700] "GET / HTTP/1.1" 200 9690 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
Main page and 2nd level links
67.138.247.2 - - [29/Aug/2004:05:31:48 -0700] "GET / HTTP/1.1" 200 7326 "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"