
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
shelob v1.0 - No robots.txt
From Juniper Networks
jdMorgan




msg:3366527
 4:41 pm on Jun 13, 2007 (gmt 0)

Saw this:

208.223.208.*** - - [13/Jun/2007:11:29:26 -0500] "GET / HTTP/1.0" 403 666 "-" "shelob v1.0"

IP address resolves to a research facility belonging to networking equipment maker Juniper Networks.

Two things their researchers should take note of: The robots.txt standard, and the fact that Shelob was an evil spider... at least according to Tolkien.

For non-compliant spiders, no tasty hobbitses to eat here, only 403s. :(

Jim

[edited by: volatilegx at 11:11 pm (utc) on June 13, 2007]
[edit reason] obfuscated ip address [/edit]
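
For anyone who wants to pull the user agent out of entries like the one quoted above, here is a minimal sketch. It assumes the line is in Apache's "combined" log format, which it appears to be but which the post does not state; the names LOG_PATTERN and parse_log_line are made up for illustration.

```python
import re

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_log_line(line):
    """Return a dict of the log fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# The entry from the post, with the obfuscated IP left as-is.
entry = parse_log_line(
    '208.223.208.*** - - [13/Jun/2007:11:29:26 -0500] '
    '"GET / HTTP/1.0" 403 666 "-" "shelob v1.0"'
)
print(entry["agent"], entry["status"])
```

All fields come back as strings, so the status would need an int() conversion before any numeric comparison.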

 

keyplyr




msg:3368917
 6:54 pm on Jun 15, 2007 (gmt 0)

I've had it banned for nearly two years. It musta done something wrong :)

incrediBILL




msg:3369110
 10:00 pm on Jun 15, 2007 (gmt 0)

First contact with it was on 6/2/07, and it has come back daily through 6/15/07 so far. It hails from security-lab1.juniper.net, which has just encountered my security lab: I've been feeding it garbage pages since the first encounter.

fiestagirl




msg:3370345
 7:00 pm on Jun 17, 2007 (gmt 0)

According to my records these guys lost their privileges in 4/2006, after scraping with a Python UA.

208.223.208.***
python-urllib/1.16
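
The two user agents reported in this thread could be refused with a simple substring check. The sketch below is only an illustration, not what any poster here actually runs; BLOCKED_AGENT_SUBSTRINGS and status_for are hypothetical names, and matching case-insensitively is an assumption.

```python
# Lowercase fragments of the user agents mentioned in this thread.
BLOCKED_AGENT_SUBSTRINGS = ("shelob", "python-urllib")


def status_for(user_agent):
    """Return 403 for a blocked user agent, otherwise 200."""
    ua = user_agent.lower()
    if any(fragment in ua for fragment in BLOCKED_AGENT_SUBSTRINGS):
        return 403
    return 200


for agent in ("shelob v1.0", "python-urllib/1.16", "Mozilla/5.0"):
    print(agent, status_for(agent))
```

User agent strings are trivially spoofed, so a check like this only stops bots that announce themselves honestly.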


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved