67.228.nnn.nn - - [24/Sep/2011:17:52:25 -0700] "GET /rats/images/Yummy.jpg HTTP/1.1" 200 22814 "-" "Mozilla/5.0+(compatible;+PiplBot;++http://www.pipl.com/bot/)"

PiplBot is Pipl's web-indexing robot. PiplBot crawler collects documents from the Web to build a searchable index for our People Search engine.
Unlike typical search-engine robots, PiplBot is designed to retrieve information from the deep web [pipl.com]; our robots are set to interact with searchable databases rather than only follow links from other websites.
As part of the crawling, PiplBot takes robots.txt standards into account to ensure we do not crawl and index content from those pages whose content you do not want included in Pipl Search.
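For anyone who wants to test that claim, here is a minimal sketch of how a robots.txt rule aimed at this bot would be evaluated by a compliant parser. It uses Python's standard urllib.robotparser. The user-agent token "PiplBot" is an assumption taken from the UA string in the log line above, and "ExampleBot" is a made-up name for comparison; whether the real crawler honors the rule is a separate question that only your logs can answer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the "PiplBot" token is assumed from the
# user-agent string in the access log, not confirmed by Pipl's documentation.
rules = """\
User-agent: PiplBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# A compliant crawler identifying as PiplBot should skip this path.
piplbot_allowed = parser.can_fetch("PiplBot", "/rats/images/Yummy.jpg")

# A crawler with no matching entry (hypothetical name) is not restricted.
other_allowed = parser.can_fetch("ExampleBot", "/rats/images/Yummy.jpg")

print("PiplBot allowed:", piplbot_allowed)
print("ExampleBot allowed:", other_allowed)
```

If the bot keeps requesting disallowed paths after a rule like this is in place, a server-level block is the remaining option.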
The term "deep web" refers to a vast repository of underlying content, such as documents in online databases, that general-purpose web crawlers cannot reach. Deep web content is estimated at 500 times that of the surface web, yet it has remained mostly untapped due to the limitations of traditional search engines.
example/1.0
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; snprtz|S04727582701828#1828|isdn; .NET CLR 2.0.50727)
About your line "Deny from 67.228.0.0/15": I have no record of anything odd coming from 67.229.0.0/16, which does not appear to be SoftLayer anyway?
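To make the point about the /15 concrete, here is a small sketch using Python's standard ipaddress module. It only does the address arithmetic; it says nothing about who owns either range. The variable names are illustrative.

```python
import ipaddress

# The range from the quoted Deny line.
block = ipaddress.ip_network("67.228.0.0/15")

# First and last addresses covered by the /15.
first_addr = block[0]
last_addr = block[-1]

# Does the /15 also swallow 67.229.0.0/16?
neighbour = ipaddress.ip_network("67.229.0.0/16")
covers_229 = neighbour.subnet_of(block)

# A /16 on 67.228 alone would leave 67.229.x.x untouched.
narrower = ipaddress.ip_network("67.228.0.0/16")
narrower_touches_229 = narrower.overlaps(neighbour)

print(first_addr, "-", last_addr)
print("covers 67.229.0.0/16:", covers_229)
print("67.228.0.0/16 overlaps 67.229.0.0/16:", narrower_touches_229)
```

So a /15 starting at 67.228.0.0 blocks all of 67.228.x.x and 67.229.x.x, while "Deny from 67.228.0.0/16" would cover 67.228.x.x only.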
What about people who host servers there, like us regular webmaster folks?
[edited by: Mokita at 7:48 pm (utc) on Dec 1, 2011]
With 394 hits in 50 seconds I would put it at the top of the ### list ... if it weren't for its mind-boggling, over-the-top, jaw-dropping, have-to-see-it-to-believe-it stupidity.