UptimeBot

         

lucy24

11:12 pm on Mar 10, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Forums search tells me it's been around forever-- the only recent reference is in a tangential discussion from last year [webmasterworld.com] --but for some reason it has only just taken an interest in my site. Or rather, sites, plural, because it's also been snuffling around my test site and typo domains.
IP: 45.79.81.abc, 45.79.89.abc
Method: HEAD
Page: / only
Referer: http://uptime.com/www.example.com
UA: Mozilla/5.0 (compatible; Uptimebot/1.0; +http://www.uptime.com/uptimebot)
I've got all of 45.79 marked as Linode; never bothered looking up which exact bits belong to uptime. The "example.com" in the referer is your own site name-- with or without www depending on which form they requested. (This assumption is based on the fact that requests with the "wrong" form of a given name in the referer slot always get a 301.)

This is another of those irritating bots where it probably makes no difference whether you block them* or not, because all they're interested in is getting a response from your site. So it shouldn't make any difference whether that response is a 200 or a 403, so long as it isn't 500-class.


* Closer study of request tells me they're blocked on three separate grounds.

LifeinAsia

11:51 pm on Mar 10, 2016 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



It's a free uptime monitoring site- periodically checks your site for a response and sends you an e-mail/text notification if your site is "down." Unless you signed up for the service (or someone else signed up with your domain for some reason), they shouldn't be hitting you.

If someone else signed up with your domain, and you're blocking them, then that person is probably getting notifications their site is down. So it serves them right. :)

lucy24

12:25 am on Mar 11, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Unless you signed up for the service (or someone else signed up with your domain for some reason), they shouldn't be hitting you.

Yeah, that's what I gathered from last year's discussion. If it were a one-off, I'd assume it was me trying out someone else's robot and promptly forgetting I'd done so-- but this ongoing multi-site fascination is inexplicable. Urk. That’s a Sinatra song, isn’t it?

not2easy

1:14 am on Mar 11, 2016 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



First time they've visited mine, too. Same UA and neighborhood. No, I didn't hire them.
45.79.89.nnn - - [29/Feb/2016:11:16:54 -0600] "HEAD / HTTP/1.1" 301 - "http://uptime.com/example.com" "Mozilla/5.0 (compatible; Uptimebot/1.0; +http://www.uptime.com/uptimebot)"

keyplyr

1:16 am on Mar 11, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



...multi-site fascination is inexplicable. Urk. That’s a Sinatra song, isn’t it?
[rant]
No one remembers who wrote the tune, putting pen to paper in some lonely place; they only remember who had the hit. Yup, you're likely referring to the Sinatra recording of the Jimmy Van Heusen & Sammy Cahn tune 'Call Me Irresponsible.'
[/rant]

Now... about the bot :)
I've had it disallowed in robots.txt for years. So far, I've not seen them request any other files:
User-agent: Uptimebot
Disallow: /
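For anyone who wants to check what that rule does on paper, here is a minimal sketch using Python's standard urllib.robotparser. It only shows how a compliant parser reads the two lines above; it says nothing about whether Uptimebot itself honors them. The URL and the "SomeOtherBot" name are placeholders, not anything from the logs.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule quoted above, fed to the parser as if it were
# the whole robots.txt file.
RULES = [
    "User-agent: Uptimebot",
    "Disallow: /",
]

def build_parser(lines):
    """Return a RobotFileParser loaded from a list of robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser

parser = build_parser(RULES)

# www.example.com is a placeholder for your own site.
uptimebot_allowed = parser.can_fetch("Uptimebot", "http://www.example.com/")
other_allowed = parser.can_fetch("SomeOtherBot", "http://www.example.com/")

print("Uptimebot allowed:", uptimebot_allowed)
print("SomeOtherBot allowed:", other_allowed)
```

Note that the check is made against the short robot name, not the full UA string from the logs.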

But... isn't Uptimebot actually a Billy Joel tune?

not2easy

1:35 am on Mar 11, 2016 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



They did not request robots.txt, this was the first and only visit I've had (so far) just this one request.

lucy24

1:37 am on Mar 11, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've had it disallowed in robots.txt for years.

Oh, gosh. I never think about robots.txt, even while I'm always telling people to include a robots.txt exclusion for the heck of it. Well, except for the two robots I'm currently evaluating for hole-poking, based on how they respond in the long term to a robots.txt bar.

In the specific case of the Uptimebot, though, I'm not sure how effective a robots.txt rule would be, seeing as how they don't ever seem to have asked for robots.txt.

No one remembers who wrote the tune

Heh, that's funny, I may really have been thinking of Fascinating Rhythm. They all run together after a while.

:: detour to YouTube ::

Well, it's always comforting to see that it is possible to put polysyllabic words into a song while conveying a message more sophisticated than "Look at me! I can pronounce words with more than four letters!"

keyplyr

2:00 am on Mar 11, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



They did not request robots.txt, this was the first and only visit I've had (so far) just this one request.
They might get it from an unidentified UA, even from a different IP address. All I can tell you is that in 18 years I've had Uptimebot roboted and it has never once requested any files.

I remember specifically because when I first moved my site off the University of California servers in 1998, I had an unreliable hosting company and I used Uptimebot to monitor my site as proof of poor server function... then I couldn't get rid of it, so I added it to robots. No problems since.

LifeinAsia

2:15 am on Mar 11, 2016 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



In the specific case of the Uptimebot, though, I'm not sure how effective a robots.txt rule would be, seeing as how they don't ever seem to have asked for robots.txt.

If it's really the UptimeRobot, there isn't any reason for it to ask for robots.txt- it should only be hitting the specific URL you (or whoever signed up) specifies to hit.

I actually use them as another health check in addition to our internal monitoring.
:: pulling a Lucy and detouring to my log files ::
Interesting- their bot hits me from 69.162.124.abc. I'm not sure if they poll from multiple locations.
Also, their UA is Mozilla/5.0+(compatible; UptimeRobot/2.0; http://www.uptimerobot.com/)
So we're not talking about the same beast.

[edited by: keyplyr at 3:56 am (utc) on Mar 11, 2016]
[edit reason] delinked URL in UA [/edit]

lucy24

5:21 am on Mar 11, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



OK, I went back and checked. Where I said "81.abc" and "89.abc" above, it's actually two exact IPs, over and over:
45.79.89.188
45.79.81.142
Just those two. But they seem to be random; it's not like those search engines that seem to allocate a specific IP to each domain. And whois dot domaintools just says Linode, so that's a dead end.

Uptime's www page says
Mozilla/5.0 (compatible; Uptimebot/1.0; +http://www.uptime.com/uptimebot)
exactly as observed, but is silent on the topic of IP ranges. They also say nothing about robots.txt; they're in "submit a request" territory. keyplyr, is it possible that back in 1998-- when everything was smaller-- they put you on a "never crawl this domain again until the heat-death of the universe" list, and you've been on it ever since, and robots.txt is just a red herring?

keyplyr

5:43 am on Mar 11, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



keyplyr, is it possible that back in 1998-- when everything was smaller-- they put you on a "never crawl this domain again until the heat-death of the universe" list, and you've been on it ever since, and robots.txt is just a red herring?
yes

grouchy sysadmin

1:13 am on Mar 12, 2016 (gmt 0)

10+ Year Member



I am seeing the same traffic. For instance,

example1.ext 45.79.81.142 - - [01/Mar/2016:11:27:40 +0000] "HEAD / HTTP/1.1" 200 0 "http://uptime.com/example1.ext" "Mozilla/5.0 (compatible; Uptimebot/1.0; +http://www.uptime.com/uptimebot)" "-"0.285- MISS

But then I am also seeing these variants.

example2.ext 52.91.7.165 - - [02/Mar/2016:18:23:38 +0000] "GET / HTTP/1.1" 301 178 "-" "\x22Uptime.com_(http://uptime.com/)\x22" "-"0.000- -
example3.net 52.91.7.165 - - [02/Mar/2016:18:25:09 +0000] "GET / HTTP/1.1" 301 5 "-" "\x22Uptime.com_(http://uptime.com/)\x22" "-"0.523- MISS

Different IP, different agent and HTTP method. Is anybody else seeing the latter example?

Edit: The second variant is them. I thought for a moment it might have been somebody else masquerading, but running a test at uptime.com shows that exact GET request and user-agent.
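If you want to pull both variants out of a log in one pass, something along these lines might do as a starting point. It is a rough sketch that assumes the layout shown in the lines above (optional vhost in front, then the usual combined-format fields); the field names and the plain "uptime" substring test are my own choices, not anything uptime.com documents.

```python
import re

# Combined-format core of the line; search() skips a leading vhost column
# and ignores whatever trails the user-agent field.
LOG_RE = re.compile(
    r'(?P<ip>\d{1,3}(?:\.\d{1,3}){3}) \S+ \S+ '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" '
    r'"(?P<ua>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_RE.search(line)
    return match.groupdict() if match else None

def looks_like_uptime(fields):
    """Assumed heuristic: 'uptime' appears in the UA or the referer."""
    text = (fields["ua"] + " " + fields["referer"]).lower()
    return "uptime" in text

# The two sample lines from this post, kept as raw strings so the
# literal \x22 sequences survive.
SAMPLES = [
    r'example1.ext 45.79.81.142 - - [01/Mar/2016:11:27:40 +0000] "HEAD / HTTP/1.1" 200 0 "http://uptime.com/example1.ext" "Mozilla/5.0 (compatible; Uptimebot/1.0; +http://www.uptime.com/uptimebot)" "-"0.285- MISS',
    r'example2.ext 52.91.7.165 - - [02/Mar/2016:18:23:38 +0000] "GET / HTTP/1.1" 301 178 "-" "\x22Uptime.com_(http://uptime.com/)\x22" "-"0.000- -',
]

for sample in SAMPLES:
    fields = parse_line(sample)
    print(fields["ip"], fields["method"], fields["status"], looks_like_uptime(fields))
```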

lucy24

2:48 am on Mar 12, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



\x22Uptime.com_(http://uptime.com/)\x22

Ordinarily I'd consider that the benchmark of an unintelligent robot, getting confused about their file encoding after too much copy-and-paste work from one script to another. Convert space to lowline, escape all non-alphanumerics, garble your quotation marks (%22) ...

keyplyr

3:02 am on Mar 12, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Consider that the big smelly elephant sitting on the sofa probably has a banner across its butt saying "these bots are fake."

I just sent the real Uptimebot (from their site) to a customer's test site and the UA was indeed:
Mozilla/5.0 (compatible; Uptimebot/1.0; +http://www.uptime.com/uptimebot)
Host: AWS
52.64.0.0 - 52.79.255.255
52.64.0.0/12

It requested one file named robots.txt, where it is disallowed. Of course it may come back with whatever (check from 30 locations?) but so far I am inclined to go with the obvious.

lucy24

5:42 am on Mar 12, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



a banner across its butt saying "these bots are fake."

Pay no attention to that robot behind the curtain.

so far I am inclined to go with the obvious.

And come to think of it, if all they're testing is whether the site is physically accessible, requesting robots.txt should do just as well as anything, shouldn't it?

:: trying to think of some convoluted scenario in which front page and robots.txt live on different servers and it's possible for the root to be unintentionally offline while still successfully proxying to the robots.txt server ::

If you look up the (real) uptimebot's exact IP, does it come back as uptime?

keyplyr

5:48 am on Mar 12, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Ahhh... found this a few moments later performing HEAD check:
\"Uptime.com_(http://uptime.com/)\"
(These are quotes inside the server report quotes)

Host: AWS
52.84.0.0 - 52.95.255.255
52.88.0.0/13, 52.84.0.0/14
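For reference, the range-to-CIDR arithmetic in these lookups can be double-checked with Python's standard ipaddress module. This is only a sketch: the helper names are mine, and the ranges are the AWS ones quoted in this thread, which say nothing about which addresses uptime.com actually uses.

```python
import ipaddress

def range_to_cidrs(first, last):
    """Collapse an inclusive IPv4 range into the CIDR blocks that cover it."""
    blocks = ipaddress.summarize_address_range(
        ipaddress.IPv4Address(first), ipaddress.IPv4Address(last)
    )
    return [str(block) for block in blocks]

def in_any(ip, cidrs):
    """True if the address falls inside at least one of the CIDR blocks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)

# The two AWS ranges quoted in this thread.
first_range = range_to_cidrs("52.64.0.0", "52.79.255.255")
second_range = range_to_cidrs("52.84.0.0", "52.95.255.255")

print(first_range)
print(second_range)
# 52.91.7.165 is the GET-variant IP reported earlier in the thread.
print(in_any("52.91.7.165", second_range))
```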

I guess from one of its "30 locations"

Didn't actually sign up, just did the freebie, so maybe this is all I get. The 2 different UAs might be an effort to bypass blocks put up over the years. Seems to me that people were getting quite annoyed with these guys at one point.

So I didn't see them for years until I actually requested the visit.

System

9:32 pm on Mar 13, 2016 (gmt 0)

redhat



The following 4 messages were cut out to new thread by keyplyr. New thread at: search_engine_spiders/4795872.htm [webmasterworld.com]
7:11 pm on Mar 14, 2016 (UTC -8)

[edited by: keyplyr at 3:33 am (utc) on Mar 15, 2016]
[edit reason] remove excelsior [/edit]

lucy24

12:15 am on Apr 8, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Now, on the positive side...

The Uptimebot that I've been seeing for the past month-and-a-bit
45.79.89.188
45.79.81.142
(those two exact IPs) suddenly started asking for robots.txt near the end of March ... and, finding itself denied, has made no other requests. So that's something.

Webwork

5:24 pm on Apr 12, 2016 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I imagine their data may be (is) also of value to the domain-drop-catch industry, i.e., downtime as a precursor to a domain drop . . or better . . an imminent drop reporting service for those who don't care to wait for the drop, as in ~ "Pick up the phone and call NOW!".

A prototype "two bangs for the buck bot".

senoner

7:12 pm on Apr 21, 2016 (gmt 0)

10+ Year Member



On my site this robot gives "http://uptimebot.net/mysite.com" as referer.
Before 29 February 2016 "another" bot from the same network (*.members.linode.com) made similar requests using different UA-strings but a similar referer: "http://uptimechecker.com/mysite.com".
Around 25 December 2015 they used the referer "http://uptimebot.net/mysite.com"
And up till 22 December 2015 similar requests always from the same network and using different UA-strings hit my site and the referer was "http://scripted.com/mysite.com"

It looks like they are constantly changing referer-urls.
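Since the referer host keeps changing but the shape stays the same (some other host, with your own site name as the path), one way to catch all of these is to test for the shape rather than the host. A minimal sketch, assuming you know your own hostnames; the function name and the OWN_HOSTS set are made up for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical set of names the site answers to.
OWN_HOSTS = {"mysite.com", "www.mysite.com"}

def referer_names_own_host(referer, own_hosts=OWN_HOSTS):
    """True if the referer is a foreign host whose path is one of our hostnames."""
    parts = urlsplit(referer)
    host = (parts.hostname or "").lower()
    path = parts.path.strip("/").lower()
    return bool(host) and host not in own_hosts and path in own_hosts

# The four referers reported in this post.
REPORTED = [
    "http://uptimebot.net/mysite.com",
    "http://uptimechecker.com/mysite.com",
    "http://uptime.com/mysite.com",
    "http://scripted.com/mysite.com",
]

for referer in REPORTED:
    print(referer, referer_names_own_host(referer))
```

A genuine internal referer such as http://mysite.com/about does not match, and neither does an empty "-" referer.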

senoner

7:19 pm on Apr 21, 2016 (gmt 0)

10+ Year Member



The first and only time this bot checked robots.txt was on 24 March 2016.

lucy24

12:56 am on Apr 22, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Huh. With me it started on 31 March-- coincidentally just a few days after I denied them-- and then
:: shuffling papers ::
April 4 (twice), 6, 7, 10, 11, 13 and so far that's the last I've seen of them. I make that seven, no, eight robots.txt requests on one site, with not so much as a HEAD following. So maybe I can poke a hole, though I'm still darned if I know what they're doing here. Maybe they're testing the server itself, since the only connection among the assorted typo domains is that they all live in the same place. (Free lookup says there are currently 38 sites on this server, most of which aren't mine.)

:: detour to other logs, which takes longer ::
Possibly they did start running a new script in late March; I find robots.txt requests on 24 March for several sites, and one as early as the 23rd.