Amazonbot

         

lucy24

9:01 pm on Oct 24, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Does anyone have any closer acquaintance with this robot?
UA: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5 (Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot)
IP: various AWS (to date: 3.224, 52.70, 52.91; 54.89)
From: amazonbot@amazon.com
robots.txt: asks, may be compliant

Eagle-eyed readers will note the semicolon in the IP list. That's because one of its (to date) four visits had a--ahem, cough-cough--slightly modified UA with matching From: header:
userAgent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5 (Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot)
From: userAgentFrom=amazonbot@amazon.com
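For anyone who wants to catch that odd variant in their own logs, here is a minimal sketch in Python. It assumes the only tell is a literal "userAgent=" or "userAgentFrom=" prefix at the start of the header value, as in the one visit quoted above; the function names are made up for illustration and do not come from any particular log tool.

```python
import re

# Assumption: the malformed requests carry a literal "userAgent=" or
# "userAgentFrom=" prefix at the start of the header value, as quoted above.
LEAKED_PREFIX = re.compile(r"^(userAgent|userAgentFrom)=")


def has_leaked_prefix(header_value: str) -> bool:
    """Return True if the header value starts with a leaked key prefix."""
    return bool(LEAKED_PREFIX.match(header_value))


def claims_amazonbot(user_agent: str) -> bool:
    """Return True if the UA string names Amazonbot (a claim, not proof)."""
    return "Amazonbot/" in user_agent
```

A request for which both checks return True names Amazonbot but was sent with the malformed headers, so it may deserve a closer look.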

This in turn throws suspicion on all the others, though the URL in the UA does make it look legitimate. For a given definition of “legitimate”, given its place of origin.

It first showed its face on two long visits in April and May of this year. I think this was during my computerless period, which explains why I didn’t notice. In consequence, it didn’t find its name in robots.txt, so I can’t say whether it would have been compliant. (By default, it is blocked on various header-and-IP grounds.)

But wait! The plot thickens. On its most recent visit, a few days ago, it made a total of six requests:
HTTP:
11:58:00 robots.txt from 3.224
11:58:00 root / (blocked) from 3.224
11:58:58 robots.txt (and nothing else) from 52.70
HTTPS:
11:58:15 robots.txt (and nothing else) from 52.70
11:58:32 robots.txt from 3.224
11:58:32 root / (blocked) from 3.224

As it happens, 52.anything sets the bad_range environment variable, which in turn leads to a minimalist robots.txt where everything is disallowed. (This rule wasn’t in effect in April/May.) But 3.anything currently doesn't set it, meaning that the 3.224 robots.txt request got the version that lists disallowed visitors by name, and Amazonbot is not (yet) on that list.
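To make the two-versions-of-robots.txt setup concrete, here is a rough Python sketch of the selection logic. It is not the actual server configuration: it assumes "52.anything" means the whole 52.0.0.0/8 block, the bot names in the list are placeholders, and the full IP addresses in the usage note are invented, since the thread only gives the first two octets.

```python
import ipaddress

# Assumption: "52.anything" is modeled as the whole 52.0.0.0/8 block.
BAD_RANGES = [ipaddress.ip_network("52.0.0.0/8")]

# Hypothetical placeholder names; the real list is not given in the thread.
NAMED_DISALLOWED = ["ExampleBotA", "ExampleBotB"]

# Minimalist version: everything is disallowed for everyone.
MINIMAL_ROBOTS = "User-agent: *\nDisallow: /\n"


def in_bad_range(ip: str) -> bool:
    """Rough equivalent of setting a bad_range flag for the request."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BAD_RANGES)


def named_robots(names) -> str:
    """Build the version that lists disallowed visitors by name."""
    blocks = [f"User-agent: {name}\nDisallow: /\n" for name in names]
    return "\n".join(blocks)


def robots_txt_for(ip: str) -> str:
    """Pick which robots.txt body a given requester would receive."""
    if in_bad_range(ip):
        return MINIMAL_ROBOTS
    return named_robots(NAMED_DISALLOWED)
```

With these assumptions, robots_txt_for("52.70.0.1") returns the disallow-everything version, while robots_txt_for("3.224.0.1") returns the by-name list, which does not mention Amazonbot.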

Why did it visit twice, from two different IPs?

Hmmm.

not2easy

9:39 pm on Oct 24, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Why did it visit twice, from two different IPs?

Amazon Echo?

Sorry. I do believe they may have sparse pickings if they send out a robot from their own IPs.