taptubot

     
4:45 pm on Oct 1, 2009 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2065
votes: 2


ec2-174-129-158-130.compute-1.amazonaws.com
taptubot *** please read http[code][/code]://www.taptu.com/corp/taptubot ***

robots.txt? Yes

(Also noted in: "amazonaws.com plays host to wide variety of bad bots [webmasterworld.com]")

[edited by: Ocean10000 at 5:32 pm (utc) on Oct. 1, 2009]
[edit reason] Breaking Hyperlink [/edit]

1:26 am on Oct 2, 2009 (gmt 0)

New User

10+ Year Member

joined:Jan 28, 2009
posts:17
votes: 0


I received a visit from "taptubot" this afternoon (it requested and obeyed robots.txt) and promptly clicked their referrer link. However, I got the following error:

The XML page cannot be displayed 
Cannot view XML input using style sheet. Please correct the error and then click the Refresh button, or try again later.
--------------------------------------------------------------------------------
The server did not understand the request, or the request was invalid. Error processing resource 'http://www.w3.org/TR/xhtm...

Here's the log with referrer:

75.101.192.195 - - [01/Oct/2009:07:10:09 -0600] "GET /robots.txt HTTP/1.0" 200 397 "-" "taptubot *** please read [taptu.com...] ***"

Then, this evening, I received new friends, apparently related to the above:

75.101.136.108 - - [01/Oct/2009:11:52:59 -0600] "GET //admin/ HTTP/1.1" 403 3077 "-" "Mozilla Firefox"
75.101.136.108 - - [01/Oct/2009:11:52:59 -0600] "POST //admin/record_company.php/password_forgotten.php?action=insert HTTP/1.1" 403 3522 "-" "Mozilla Firefox"
75.101.136.108 - - [01/Oct/2009:11:53:01 -0600] "GET //images/b6f04.php?cmd=uptime HTTP/1.0" 403 2985 "-" "lwp-trivial/1.41"

75.101.136.108 - - [01/Oct/2009:18:27:45 -0600] "GET //admin/ HTTP/1.1" 403 3077 "-" "Mozilla Firefox"
75.101.136.108 - - [01/Oct/2009:18:27:45 -0600] "POST //admin/record_company.php/password_forgotten.php?action=insert HTTP/1.1" 403 3522 "-" "Mozilla Firefox"
75.101.136.108 - - [01/Oct/2009:18:27:48 -0600] "GET //images/86032.php?cmd=uptime HTTP/1.0" 403 2985 "-" "lwp-trivial/1.41"

I'm guessing this is yet another Zen Cart exploit scanner, but at this time, Milw0rm is down or otherwise unreachable for me.

I'm wondering if there's a valid correlation between these three visits? The first seemingly driving by to see if the lights are on; the next two ringing my doorbell to see who's home.
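One rough way to look for a link between visits like these is to parse the log lines and group them by the leading octets of the client IP. The sketch below assumes the Apache combined log format shown above; the names `parse_line` and `group_by_prefix` are made up for illustration. Sharing a prefix on a hosting provider only shows the requests came from the same address block, not that the same operator sent them.

```python
import re
from collections import defaultdict

# Pattern for one line of Apache "combined" log format (an assumption based
# on the sample lines quoted in this thread).
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def parse_line(line):
    """Return a dict of log fields, or None if the line does not match."""
    m = LOG_RE.match(line.strip())
    return m.groupdict() if m else None

def group_by_prefix(lines, octets=2):
    """Group parsed entries by the first `octets` octets of the client IP."""
    groups = defaultdict(list)
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # skip lines that are not in the expected format
        prefix = ".".join(entry["ip"].split(".")[:octets])
        groups[prefix].append(entry)
    return dict(groups)

# Three of the log lines quoted in the post above.
SAMPLE = [
    '75.101.192.195 - - [01/Oct/2009:07:10:09 -0600] "GET /robots.txt HTTP/1.0" 200 397 "-" "taptubot *** please read [taptu.com...] ***"',
    '75.101.136.108 - - [01/Oct/2009:11:52:59 -0600] "GET //admin/ HTTP/1.1" 403 3077 "-" "Mozilla Firefox"',
    '75.101.136.108 - - [01/Oct/2009:11:53:01 -0600] "GET //images/b6f04.php?cmd=uptime HTTP/1.0" 403 2985 "-" "lwp-trivial/1.41"',
]

if __name__ == "__main__":
    for prefix, entries in group_by_prefix(SAMPLE).items():
        print(prefix, [(e["ip"], e["agent"]) for e in entries])
```

Grouping on two octets is a deliberately coarse choice; a tighter or looser prefix may suit other logs better.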

4:30 am on Oct 2, 2009 (gmt 0)

Preferred Member

10+ Year Member Top Contributors Of The Month

joined:July 23, 2004
posts:596
votes: 103


I've got a feeling that we are going to be seeing more and more of these agents as iPhone- and BlackBerry-type devices become more commonplace.

We've been building pages for people that want certain portions of their domain to view properly on wireless phones for a good bit of time now.

Nearly every time I've seen this bot, it comes in looking for things like mobi, m, and pda first.

11:53 am on Oct 2, 2009 (gmt 0)

New User

10+ Year Member

joined:Sept 13, 2009
posts: 20
votes: 0


In my Oct. 1st log files, I also found this UA:

taptubot *** please read [taptu.com...] ***

I tried the URL and got the same message; 'unable to read. . . . . ', etc.

Then the UA switched to: Mozilla/5.0 (iPhone; U; CPU like Mac OS X; en) AppleWebKit/420+ (KHTML, like Gecko) Version/3.0 Mobile/1A543a Safari/419.3 and it kept hammering my site, but with no query string. Just:

GET /m/ HTTP/1.0
GET /mobile/ HTTP/1.0
GET /mobi/ HTTP/1.0
GET /iphone/ HTTP/1.0
GET /pda/ HTTP/1.0

All from 72.44.42.161 ~ Duncansville, PA

Not sure what to do with it.
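For anyone who wants to spot this pattern in their own logs, here is a minimal sketch that flags IPs requesting several of the mobile-directory paths listed above. The function name, the `(ip, path)` input shape, and the threshold of 3 are assumptions for illustration, not anything taptubot documents.

```python
from collections import defaultdict

# The directory probes listed in the post above.
MOBILE_PROBE_PATHS = {"/m/", "/mobile/", "/mobi/", "/iphone/", "/pda/"}

def find_mobile_probers(requests, threshold=3):
    """Return a sorted list of IPs that requested at least `threshold`
    distinct mobile-directory paths.

    `requests` is an iterable of (ip, path) pairs.
    """
    hits = defaultdict(set)
    for ip, path in requests:
        if path in MOBILE_PROBE_PATHS:
            hits[ip].add(path)
    return sorted(ip for ip, paths in hits.items() if len(paths) >= threshold)

if __name__ == "__main__":
    sample = [
        ("72.44.42.161", "/m/"),
        ("72.44.42.161", "/mobile/"),
        ("72.44.42.161", "/mobi/"),
        ("72.44.42.161", "/iphone/"),
        ("72.44.42.161", "/pda/"),
        ("192.0.2.10", "/m/"),  # hypothetical visitor with a single hit
    ]
    print(find_mobile_probers(sample))
```

Counting distinct paths rather than raw hits keeps a real visitor who reloads /m/ a few times from being flagged.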

8:42 pm on Oct 2, 2009 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 16, 2007
posts:846
votes: 0


Coming from ec2-67-202-63-nnn.compute-1.amazonaws.com and other EC ranges...

Time to block EC2 junk at the firewall, don't even want to see it in logs anymore.
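Before blocking, it can help to tag which reverse-DNS names in a log look like EC2 hosts. This is a minimal sketch that assumes the `ec2-a-b-c-d.<region>.amazonaws.com` naming seen in this thread; the function name is hypothetical. A PTR name is set by whoever controls the address block, so treat a match as a hint and confirm it with a forward lookup before acting on it.

```python
import re

# Matches names such as ec2-174-129-158-130.compute-1.amazonaws.com.
EC2_PTR_RE = re.compile(
    r"^ec2-(\d{1,3})-(\d{1,3})-(\d{1,3})-(\d{1,3})"
    r"\.(?:[a-z0-9-]+\.)*amazonaws\.com\.?$",
    re.IGNORECASE,
)

def ec2_ip_from_hostname(hostname):
    """Return the dotted IP embedded in an EC2-style hostname, or None."""
    m = EC2_PTR_RE.match(hostname.strip())
    if not m:
        return None
    octets = m.groups()
    if any(int(o) > 255 for o in octets):
        return None  # not a valid IPv4 address
    return ".".join(octets)

if __name__ == "__main__":
    for name in (
        "ec2-174-129-158-130.compute-1.amazonaws.com",
        "www.taptu.com",
    ):
        print(name, "->", ec2_ip_from_hostname(name))
```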

6:47 pm on Oct 3, 2009 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2065
votes: 2


I was able to view the bot's link on Oct. 1 and today. It's actually one of the more readable, informative bot-info pages around.

@OnThePike
@JimmieT
I'm confused about how this or any bot-runner's host server could relay back to you/your server/IP(s). Do you read your logs online via a program on your server? Or did you click on links in your server's log file? Or--?

(FWIW, I never, ever click log-based links so there's no referer and thus no way a bot-runner could link itself to my server/IPs. If I even think about visiting a bot's link, I run it through Google first to learn what I can about it, and/or read the page without going to the site. If things look more okay than not, then I copy-paste the URL into my browser.)

@caribguy
If any of your sites rely on Twitter traffic... A ton of Twitter-related apps/hosts hail from amazonaws.com. They usually do not ID themselves as Twitter-anything but they're definitely tracking URLs in tweets because every time a site/page gets mentioned, we're swarmed.

(Not coincidentally, amazonaws and most of the simultaneous tweet-tracking hosts are already blocked as bad bot havens.)

9:38 pm on Oct 3, 2009 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 893


...amazonaws and most of the simultaneous tweet-tracking hosts are already blocked as bad bot havens - Pfui

I have considered blocking AWS, several times in fact. But every time I do the math, AWS shows more legit traffic than problems. For example, Twitter is now in my top 10 referring sites, and almost all of these utilities, APIs, etc. reside on AWS. They crawl, but they also send traffic. Likewise, my mobile traffic has increased 4x this year, and again, a significant number of these APIs use AWS.

10:46 pm on Oct 3, 2009 (gmt 0)

New User

10+ Year Member

joined:Sept 13, 2009
posts: 20
votes: 0


@Pfui

I'm confused about how this or any bot-runner's host server could relay back to you/your server/IP(s). Do you read your logs online via a program on your server? Or did you click on links in your server's log file? Or--?

------------------------------------------------------------

My web site host server provides a zipped daily log. I download it via FTP, expand it, import it into Excel, and sort the data by Referrer.

I can then read the logs, do reverse lookup if needed, and decide whether to modify .htaccess if needed. I can copy/paste the bot URL into my home browser to find out more about it. My host IP will not be affected.

My site is strictly informational. There is no need to keep trying to 'GET' what I don't have, yet, even after all the redirects or 403s, they continue. The referrer was Amazonaws, in this case. I think eventually they will give up and go away.

Jim

5:19 pm on Oct 4, 2009 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2065
votes: 2


@JimmieT:

Thanks for your process. I think I misunderstood your original post:)

Oh, and re amazonaws as host (never seen it as Referer) -- I've been watching that host for months now because the service is a haven for bad bots and iffy UAs, annoyingly undeterred by 403s. Expect more.

@keyplyr

Just musing here but from the looks of AWS's Twitter fellow travelers, 403s don't seem to impact subsequent, apparently personal hits. A while back, I followed up on some of the bots' sites and our tweeted links appeared in their repackaged info.

Next time I see another swarm while it's happening, I'll re-check.