Spinn3r RSS Aggregator

Spins out of control from Serverbeach range

         

caribguy

10:22 pm on Jan 3, 2009 (gmt 0)


example.com 64.34.195.aaa - - [31/Dec/2008:03:24:32 -0600] "GET / HTTP/1.1" 301 - "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1; aggregator:Tailrank (Spinn3r 2.3); http://spinn3r.com/robot) Gecko/20021130"
www.example.com 64.34.195.aaa - - [31/Dec/2008:03:24:32 -0600] "GET / HTTP/1.1" 200 64933 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1; aggregator:Tailrank (Spinn3r 2.3); http://spinn3r.com/robot) Gecko/20021130"
example.com 64.34.195.bbb - - [31/Dec/2008:03:26:40 -0600] "GET / HTTP/1.1" 301 - "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1; aggregator:Spinn3r (Spinn3r 3.0); http://spinn3r.com/robot) Gecko/20021130"
www.example.com 64.34.195.bbb - - [31/Dec/2008:03:26:40 -0600] "GET / HTTP/1.1" 200 66704 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1; aggregator:Spinn3r (Spinn3r 3.0); http://spinn3r.com/robot) Gecko/20021130"
www.example.com 64.34.195.bbb - - [31/Dec/2008:03:26:41 -0600] "GET /w3c/p3p.xml HTTP/1.1" 200 293 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1; aggregator:Spinn3r (Spinn3r 3.0); http://spinn3r.com/robot) Gecko/20021130"
example.com 64.34.195.bbb - - [31/Dec/2008:03:26:41 -0600] "GET /RSS HTTP/1.1" 301 1101 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1; aggregator:Spinn3r (Spinn3r 3.0); http://spinn3r.com/robot) Gecko/20021130"
www.example.com 64.34.195.bbb - - [31/Dec/2008:03:26:41 -0600] "GET /RSS HTTP/1.1" 200 15662 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1; aggregator:Spinn3r (Spinn3r 3.0); http://spinn3r.com/robot) Gecko/20021130"
example.com 64.34.195.ccc - - [31/Dec/2008:03:28:45 -0600] "GET /RSS HTTP/1.1" 301 1101 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1; aggregator:Spinn3r (Spinn3r 3.0); http://spinn3r.com/robot) Gecko/20021130"
www.example.com 64.34.195.ccc - - [31/Dec/2008:03:28:45 -0600] "GET /RSS HTTP/1.1" 200 15662 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1; aggregator:Spinn3r (Spinn3r 3.0); http://spinn3r.com/robot) Gecko/20021130"

Reason #1 for banning

Spinn3r is a web service that crawls on behalf of dozens of companies, researchers, and web startups.

Basically if you're indexing the blogosphere then you should probably be using Spinn3r. We provide raw access to every blog post being published - in real time. We provide the data and you can focus on building your application / mashup.

Reason #2 for banning

Spinn3r is indexing your site on behalf of our user base to provide your content so that it can influence their applications. We're used by search engines, analytic services, competitive intelligence services, etc.

Reason #3 for banning: incorrect statements. It does not read robots.txt, does not seem very familiar with 301 redirects, and its "cache" mechanism looks very much broken. Oh, and p3p.xml is not an RSS feed :)

Spinn3r uses very little bandwidth to monitor your site. We only request pages once and cache them once we've fetched them.
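For anyone who wants to check the "we only request pages once" claim against their own logs, here is a rough Python sketch. It assumes the virtual-host-prefixed combined log format shown in the excerpt above. The function name `repeated_fetches` and the `agent_token` parameter are made up for illustration, and matching on the substring "Spinn3r" is an assumption based on the logged user-agent strings.

```python
import re
from collections import Counter

# Assumed layout: "<vhost> <ip> - - [time] "request" status size "referer" "agent""
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def repeated_fetches(lines, agent_token="Spinn3r"):
    """Count successful (200) fetches per (host, path) for one crawler."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed format
        if agent_token not in match["agent"]:
            continue  # some other client
        if match["status"] != "200":
            continue  # ignore redirects and errors
        counts[(match["host"], match["path"])] += 1
    # Keep only URLs that were served in full more than once.
    return {key: n for key, n in counts.items() if n > 1}
```

Run over the excerpt above, this would flag both `/` and `/RSS` on www.example.com as fetched twice within a few minutes, each time with a full 200 response.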

keyplyr

7:19 am on Jan 4, 2009 (gmt 0)


Despite getting 403s, Spinn3r and Tailrank have been hitting my sites about a dozen times daily for months. I've sent them emails informing them that I do not have forums, blogs, etc. and asking them to remove my domains from their crawl list, but my requests have so far been ignored.

Very bad behavior. Seems some companies feel they need to bully their way in, instead of earning their place in the market.
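For readers wondering what serving 403s by user agent can look like, here is a minimal sketch as Python WSGI middleware. It is not the setup used by anyone in this thread. The names `block_crawlers` and `BLOCKED_TOKENS` are hypothetical, and the tokens are an assumption taken from the user-agent strings logged above.

```python
# Assumed tokens, taken from the user-agent strings in the log excerpt.
BLOCKED_TOKENS = ("spinn3r", "tailrank")


def block_crawlers(app, tokens=BLOCKED_TOKENS):
    """Wrap a WSGI app so matching user agents receive 403 Forbidden."""
    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in agent for token in tokens):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everyone else goes through to the wrapped application.
        return app(environ, start_response)
    return middleware
```

As the post above shows, a 403 only saves the bandwidth of the page body. It does not stop a crawler that ignores the response from coming back.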