Twitter Chasing Bots
How many bots are chasing your tweets?
incrediBILL
You think people are really reading those tweets of yours? Maybe, or maybe not, but there's a whole herd of bots that jump on them right away! Here are just a few of the things that followed a link back to my site within a few hours of my tweeting it.
184.73.85.* "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:184.108.40.206) Gecko/20070914 Firefox/220.127.116.11" From the Amazon AWS
67.202.7.* "HEAD /..." "PycURL/7.18.2" Something else from the Amazon AWS validating URIs with a HEAD request
18.104.22.168 "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" Et Tu Googlebot?
217.144.236.* "HEAD /..." 301 252 "-" "ytndemo firstname.lastname@example.org" Something from Yahoo validating the URIs with a HEAD request
128.242.241.* "HEAD /..." "Twitterbot/0.1" It appears Twitter actually checks the URIs with their own HEAD request
38.113.234.* "Voyager/1.0" Good old Voyager poking around
204.236.153.* "HEAD /..." "JS-Kit URL Resolver, [ ...] js-kit.com Guess what, it's checking the URI too...
85.114.136.* "Mozilla/5.0 (compatible; Windows NT 6.0) Gecko/20090624 Firefox/3.5 NjuiceBot" Pfui posted about the NjuiceBot [ ...] webmasterworld.com
216.24.142.* "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:22.214.171.124) Gecko/20091221 Firefox/3.5.7 OneRiot/1.0 (http://www.oneriot.com)" Another social parasite
89.151.116.* "Mozilla/5.0 (compatible; MSIE 6.0b; Windows NT 5.0) Gecko/2009011913 Firefox/3.0.6 TweetmemeBot" And another social parasite
65.52.29.* "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)" Something from MS...
70.37.65.* "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)" Something else from MS...
64.13.147.* "Mozilla/5.0 (compatible; abby/1.0; +http://www.ellerdale.com/crawler.html)" Another social leech from SVCOLO
174.129.151.* "HEAD /..." "@hourlypress" Yet another AWS process checking URIs
184.73.204.* "HEAD /..." "Firefox" Even more crap from Amazon AWS, yeah right, Firefox <snort>
67.202.5.* "kame-rt (email@example.com)" Yet even more junk using Amazon AWS, it just keeps coming
74.112.128.* "Mozilla/5.0 (compatible; Butterfly/1.0; +http://labs.topsy.com/butterfly.html) Gecko/2009032608 Firefox/3.0.8" Tweet powered search engine? oh gag...
173.13.167.* "Mozilla/5.0 (Windows; U; Windows NT 6.0; ru; rv:126.96.36.199) Gecko/2009060215 Firefox/3.0.11 (.NET CLR 3.5.30729)" Something using a comcast business connection
174.129.119.* "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:188.8.131.52) Gecko/20070914 Firefox/184.108.40.206" More things from the septic tank of Amazon AWS
This is probably just the tip of the iceberg, collected over just a few hours. Who knows just how much junk is really chasing your tweets, but ENOUGH already with all the leeches. Sheesh.
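For anyone who wants to sort a list like the one above automatically, here is a minimal sketch. It assumes each hit has already been reduced to an (ip, method, user_agent) tuple; the function names, the labels, and the sample data structure are hypothetical, and the token list is simply lifted from the user agents quoted in this post.

```python
from collections import Counter

# UA substrings taken from the entries quoted in this post; extend as needed.
BOT_TOKENS = [
    "Twitterbot", "PycURL", "Googlebot", "NjuiceBot", "OneRiot",
    "TweetmemeBot", "Voyager", "JS-Kit", "Butterfly", "abby/",
    "kame-rt", "ytndemo", "@hourlypress",
]

def classify(method, user_agent):
    """Label one hit: the first matching token, a HEAD-only label, or 'unknown'."""
    ua = user_agent.lower()
    for token in BOT_TOKENS:
        if token.lower() in ua:
            return token
    # No self-identifying token, but a HEAD request suggests a URL checker.
    if method.upper() == "HEAD":
        return "unlabelled HEAD checker"
    return "unknown"

def tally(hits):
    """Count hits per label; each hit is an (ip, method, user_agent) tuple."""
    return Counter(classify(method, ua) for _ip, method, ua in hits)

# A few of the entries from the post, with the method filled in where it was shown.
SAMPLE_HITS = [
    ("128.242.241.*", "HEAD", "Twitterbot/0.1"),
    ("184.73.204.*", "HEAD", "Firefox"),
    ("38.113.234.*", "GET", "Voyager/1.0"),
    ("65.52.29.*", "GET", "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"),
]

if __name__ == "__main__":
    for label, count in tally(SAMPLE_HITS).most_common():
        print(label, count)
```

Substring matching only catches bots that name themselves. The browser-like user agents in the list fall through to "unknown", which is why the IP range still matters for those.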
And? I know MS uses Twitter to do geo-location on Bing Maps as well as integrating it within Bing search. I'm sure others do the same. What's with all the Amazon hate? Some popular stuff runs on Amazon EC2, from my reddit addiction to my wife's Foursquare habits :)
just fyi: Amazon runs the backend of Twitter and bit.ly.
@ByronM [ ...] webmasterworld.com
jmccormac
@ByronM The web has changed a lot since this discussion: [ ...] webmasterworld.com and many of the AWS maggots seem to be oblivious to the existence of robots.txt. Now when you run a small website, a few pages here or there is very little. But when you run a large website with thousands or millions of webpages and the operators of some of these maggots decide to download the entire site, it is a big problem.
Regards...jmcc
g1smd
Your first two to four visitors following any bit.ly or tiny.cc link that you post to Twitter will almost certainly be bots of some sort - often arriving within tens of seconds after posting the link.
londrum
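One rough way to check this against your own logs is to pull out every hit that lands shortly after the tweet went out. The sketch below assumes hits are (timestamp, ip, user_agent) tuples and that you know when the link was posted; the function name, the 60-second default window, and the sample date are all hypothetical choices, not anything measured in this thread.

```python
from datetime import datetime, timedelta

def early_hits(tweet_time, hits, window_seconds=60):
    """Return the hits whose timestamp falls within window_seconds after tweet_time.

    Each hit is a (timestamp, ip, user_agent) tuple. The 60-second default is an
    assumed stand-in for "tens of seconds"; tune it to your own traffic.
    """
    window = timedelta(seconds=window_seconds)
    return [hit for hit in hits if timedelta(0) <= hit[0] - tweet_time <= window]

if __name__ == "__main__":
    posted = datetime(2010, 1, 1, 12, 0, 0)  # hypothetical posting time
    hits = [
        (posted + timedelta(seconds=8), "128.242.241.*", "Twitterbot/0.1"),
        (posted + timedelta(seconds=45), "67.202.7.*", "PycURL/7.18.2"),
        (posted + timedelta(minutes=30), "173.13.167.*", "Mozilla/5.0"),
    ]
    for timestamp, ip, ua in early_hits(posted, hits):
        print(timestamp.isoformat(), ip, ua)
```

Arrival time alone does not prove a hit is a bot, so this is best used to shortlist candidates for a closer look at the IP range and user agent.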
isn't that a good thing though? a big chunk of the webmasters who use twitter just automate it all anyway, i know i do. i tie my rss feeds to it. there's not much point doing that if the bots don't lap it up.
keyplyr
Most Twitter parasites I block by IP range, a couple by UA, and I allow several that benefit me; just like everything else online, it's a case-by-case thing.
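A case-by-case policy like that can be sketched in a few lines. Everything specific here is an assumption made for illustration: the wildcarded addresses from the first post are treated as /24 networks, and which bots land on the block list or the allow list is a hypothetical pick, not keyplyr's actual configuration.

```python
import ipaddress

# Hypothetical policy. "184.73.85.*" style entries are assumed to mean a /24.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr) for cidr in ("184.73.85.0/24", "67.202.7.0/24")
]
BLOCKED_UA_TOKENS = ("tweetmemebot", "oneriot")  # blocked by user agent
ALLOWED_UA_TOKENS = ("twitterbot",)              # bots judged to be beneficial

def decide(ip, user_agent):
    """Return 'allow' or 'block' for one request.

    The allow list is checked first, then IP ranges, then user-agent tokens.
    """
    ua = user_agent.lower()
    if any(token in ua for token in ALLOWED_UA_TOKENS):
        return "allow"
    address = ipaddress.ip_address(ip)
    if any(address in network for network in BLOCKED_NETWORKS):
        return "block"
    if any(token in ua for token in BLOCKED_UA_TOKENS):
        return "block"
    return "allow"

if __name__ == "__main__":
    print(decide("184.73.85.10", "Mozilla/5.0"))
    print(decide("128.242.241.5", "Twitterbot/0.1"))
```

Letting the allow list win over the IP block is a design choice, and a risky one on its own, since a user agent is trivially faked (note the AWS hit above calling itself plain "Firefox"). In practice you would probably want an allowed bot to match both its user agent and its expected address range.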
The 174. and 173. ranges are pretty new. I don't mean AWS, but they are on the *&$% list to start with. I am looking at a report for 50+ sites at this point and it isn't pretty.
Sgt_Kickaxe
I christen these new findings "Recursive Twitter Disease", very dangerous to a website without a healthy immune system. Question is, now that we've got a disease... is there a cure that DOESN'T involve cutting something off?