

Search Engine Spider and User Agent Identification Forum

    
Twitter Chasing Bots
How many bots are chasing your tweets?
incrediBILL

WebmasterWorld Administrator, Top Contributor of All Time, 5+ Year Member



 
Msg#: 4129109 posted 1:32 am on May 8, 2010 (gmt 0)

You think people are really reading those tweets of yours?

Maybe, or maybe not, but there's a whole herd of bots that jump on them right away!

Here are just a few of the things that followed a link back to my site within a few hours of my tweeting it.

184.73.85.* "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7"

From the Amazon AWS

67.202.7.* "HEAD /..." "PycURL/7.18.2"

Something else from the Amazon AWS validating URIs with a HEAD request

66.249.65.113 "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

Et Tu Googlebot?

217.144.236.* "HEAD /..." 301 252 "-" "ytndemo bergum@yahoo-inc.com"

Something from Yahoo validating the URIs with a HEAD request

128.242.241.* "HEAD /..." "Twitterbot/0.1"

It appears Twitter actually checks the URIs with their own HEAD request

38.113.234.* "Voyager/1.0"

Good old Voyager poking around

204.236.153.* "HEAD /..." "JS-Kit URL Resolver, [js-kit.com...]

Guess what, it's checking the URI too...

85.114.136.* "Mozilla/5.0 (compatible; Windows NT 6.0) Gecko/20090624 Firefox/3.5 NjuiceBot"

Pfui posted about the NjuiceBot
[webmasterworld.com...]

216.24.142.* "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7 OneRiot/1.0 (http://www.oneriot.com)"

Another social parasite

89.151.116.* "Mozilla/5.0 (compatible; MSIE 6.0b; Windows NT 5.0) Gecko/2009011913 Firefox/3.0.6 TweetmemeBot"

And another social parasite

65.52.29.* "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"

Something from MS...

70.37.65.* "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"

Something else from MS...

64.13.147.* "Mozilla/5.0 (compatible; abby/1.0; +http://www.ellerdale.com/crawler.html)"

Another social leech from SVCOLO

174.129.151.* "HEAD /... " "@hourlypress"

Yet another AWS process checking URIs

184.73.204.* "HEAD /..." "Firefox"

Even more crap from Amazon AWS, yeah right, Firefox <snort>

67.202.5.* "kame-rt (support@backtype.com)"

Yet even more junk using Amazon AWS, it just keeps coming

74.112.128.* "Mozilla/5.0 (compatible; Butterfly/1.0; +http://labs.topsy.com/butterfly.html) Gecko/2009032608 Firefox/3.0.8"

Tweet powered search engine? oh gag...

173.13.167.* "Mozilla/5.0 (Windows; U; Windows NT 6.0; ru; rv:1.9.0.11) Gecko/2009060215 Firefox/3.0.11 (.NET CLR 3.5.30729)"

Something using a Comcast business connection

174.129.119.* "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7"

More things from the septic tank of Amazon AWS


This is probably just the tip of the iceberg collected over just a few hours.

Who knows just how much junk is really chasing your tweets, but ENOUGH already with all the leeches.

Sheesh
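
For anyone who wants to see how big the herd is on their own site, here is a minimal Python sketch that tallies requests by method and user agent. It assumes the server writes Apache-style combined log lines; the `SAMPLE` lines are made up for illustration (only the user-agent strings come from the list above), and `tally` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pulls the HTTP method and the last quoted field (the user agent)
# out of a combined-format log line. Assumed log format, adjust as needed.
LOG_RE = re.compile(
    r'"(?P<method>[A-Z]+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def tally(lines):
    """Count requests per (method, user agent) pair."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.search(line)
        if match:
            counts[(match.group("method"), match.group("ua"))] += 1
    return counts

# Made-up sample lines; the IPs and timestamps are placeholders.
SAMPLE = [
    '128.242.241.1 - - [08/May/2010:01:00:05 +0000] "HEAD /page HTTP/1.1" 200 0 "-" "Twitterbot/0.1"',
    '67.202.7.1 - - [08/May/2010:01:00:09 +0000] "HEAD /page HTTP/1.1" 200 0 "-" "PycURL/7.18.2"',
    '128.242.241.1 - - [08/May/2010:01:00:12 +0000] "HEAD /page HTTP/1.1" 200 0 "-" "Twitterbot/0.1"',
]

if __name__ == "__main__":
    for (method, ua), n in tally(SAMPLE).most_common():
        print(n, method, ua)
```

Sorting by count makes the HEAD-only URI checkers stand out from anything that actually fetches the page.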

 

ByronM

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4129109 posted 3:55 pm on May 8, 2010 (gmt 0)

And?

I know MS uses Twitter to do geolocation on Bing Maps as well as integrating it into Bing search. I'm sure others do the same.

What's with all the Amazon hate? Some popular stuff runs on Amazon EC2, from my Reddit addiction to my wife's Foursquare habits :)

Brett_Tabke

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4129109 posted 4:05 pm on May 8, 2010 (gmt 0)

Just FYI: Amazon runs the backend of Twitter and bit.ly.

keyplyr

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member, Top Contributors Of The Month



 
Msg#: 4129109 posted 4:21 pm on May 8, 2010 (gmt 0)

@ ByronM

[webmasterworld.com...]

jmccormac

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



 
Msg#: 4129109 posted 5:27 pm on May 8, 2010 (gmt 0)

@ByronM The web has changed a lot since this discussion: [webmasterworld.com...]. Many of the AWS maggots seem to be oblivious to the existence of robots.txt. When you run a small website, a few pages fetched here or there costs very little. But when you run a large website with thousands or millions of webpages and the operators of some of these maggots decide to download the entire site, it is a big problem.

Regards...jmcc
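
For reference, this is roughly the check a polite crawler is expected to make before fetching a page, sketched with Python's standard `urllib.robotparser`. The robots.txt rules and the example.com URLs below are made up for illustration, and `allowed` is a hypothetical helper. The check only helps when the bot bothers to run it, which is the complaint above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: shut one named bot out entirely,
# keep everyone else out of /private/ only.
ROBOTS_TXT = """\
User-agent: TweetmemeBot
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def allowed(user_agent, url):
    """Return True if the parsed rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    for ua, url in [
        ("TweetmemeBot", "http://example.com/page"),
        ("Googlebot", "http://example.com/page"),
        ("Googlebot", "http://example.com/private/report"),
    ]:
        print(ua, url, allowed(ua, url))
```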

g1smd

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4129109 posted 5:47 pm on May 8, 2010 (gmt 0)

Your first two to four visitors following any bit.ly or tiny.cc link that you post to Twitter will almost certainly be bots of some sort - often arriving within tens of seconds after posting the link.
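
If you want to check this against your own logs, a rough Python sketch is below. It assumes you already know when the link was posted and have (timestamp, user agent) pairs from the log; the timestamps in the example are made up, and `early_hits` with its 60-second default window is a hypothetical choice.

```python
from datetime import datetime, timedelta

def early_hits(posted_at, hits, window_seconds=60):
    """Return the hits that arrived within window_seconds after posted_at."""
    window = timedelta(seconds=window_seconds)
    return [
        (ts, ua)
        for ts, ua in hits
        if timedelta(0) <= ts - posted_at <= window
    ]

if __name__ == "__main__":
    # Made-up example data: two fast arrivals and one half an hour later.
    posted = datetime(2010, 5, 8, 1, 0, 0)
    hits = [
        (posted + timedelta(seconds=12), "Twitterbot/0.1"),
        (posted + timedelta(seconds=40), "PycURL/7.18.2"),
        (posted + timedelta(minutes=30), "Mozilla/5.0"),
    ]
    for ts, ua in early_hits(posted, hits):
        print(ts.isoformat(), ua)
```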

londrum

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4129109 posted 7:18 pm on May 8, 2010 (gmt 0)

Isn't that a good thing, though? A big chunk of the webmasters who use Twitter just automate it all anyway; I know I do. I tie my RSS feeds to it. There's not much point doing that if the bots don't lap it up.

keyplyr

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member, Top Contributors Of The Month



 
Msg#: 4129109 posted 7:49 pm on May 8, 2010 (gmt 0)

Most Twitter parasites I block by IP range, a couple by UA, and I allow several that benefit me. Just like everything else online, it's a case-by-case thing.
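
A case-by-case policy like that could be sketched in Python as follows. The two /24 networks are simply the wildcard ranges quoted earlier in the thread written as CIDR; the UA lists, the `decide` function, and the order of the checks are assumptions for illustration, not anyone's actual ruleset.

```python
import ipaddress

# Illustrative lists only. "184.73.85.*" from the first post becomes 184.73.85.0/24.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(net) for net in ("184.73.85.0/24", "67.202.7.0/24")
]
BLOCKED_UA_SUBSTRINGS = ("tweetmemebot", "njuicebot")
ALLOWED_UA_SUBSTRINGS = ("googlebot",)

def decide(ip, user_agent):
    """Return 'allow' or 'block' for one request."""
    ua = user_agent.lower()
    # Allowlist first. UA strings are trivially forged, so a real setup
    # would verify the claimed bot some other way before trusting this.
    if any(s in ua for s in ALLOWED_UA_SUBSTRINGS):
        return "allow"
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in BLOCKED_NETWORKS):
        return "block"
    if any(s in ua for s in BLOCKED_UA_SUBSTRINGS):
        return "block"
    return "allow"

if __name__ == "__main__":
    print(decide("184.73.85.10", "Mozilla/5.0"))
    print(decide("89.151.116.5", "Mozilla/5.0 (compatible) TweetmemeBot"))
    print(decide("66.249.65.113", "Mozilla/5.0 (compatible; Googlebot/2.1)"))
```

Range blocks catch the cloud-hosted checkers that send generic or fake user agents; UA blocks catch the named bots wherever they come from.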

blend27

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4129109 posted 12:22 am on May 9, 2010 (gmt 0)

The 174. and 173. ranges are pretty new (I don't mean AWS), but they are on the *&$% list to start with. I am looking at a 50+++ sites report at this point and it isn't pretty.

Sgt_Kickaxe

WebmasterWorld Senior Member, Top Contributor of All Time



 
Msg#: 4129109 posted 5:52 am on May 9, 2010 (gmt 0)

I christen these new findings "Recursive Twitter Disease", very dangerous to a website without a healthy immune system.

Question is, now that we've got a disease... is there a cure that DOESN'T involve cutting something off?
