It grabbed robots.txt and then my landing page index.html. Nothing else.
Pfui
2:24 pm on Nov 17, 2022 (gmt 0)
Had 12 hits from Twitterbot/1.0 yesterday -- all numeric neighbors of r-199-16-157-18x.twttr.com (Atlanta) -- in six visits, about the daily norm. Each time it fetched robots.txt and the same .html plus one other, neither of which is my landing page. Twitter's allowed to access ~95% of the non-.cgi site but rarely visits (probably because I rarely visit them/plug my site there).
SumGuy
12:30 am on Nov 18, 2022 (gmt 0)
I have zero interaction with Twitter and less than zero interaction that would in any way link my site to Twitter. So I'm curious what the purpose of the Twitterbot is. How does its web-crawling activity feed back into anything that would be seen by a user or a viewer of Twitter content?
lucy24
6:41 pm on Nov 18, 2022 (gmt 0)
Huh. I've always assumed that someone tweets a link, leading Twitterbot to check up on the page in question, preceded by robots.txt. I've got the real ones marked as 192.133.76.0/22, 199.16.156.0/22, and 199.59.148.0/22, but there seem to be lots of fakers. And many of those are fixated on the various kinds of icon, which often falls under No Skin Off My Nose.
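For anyone who wants to sort real from fake in a log, here is a minimal sketch of the range check, assuming the three CIDR blocks quoted above are the ones you trust (they are one poster's list, not an official one, and may have changed). The function names are made up for illustration.

```python
import ipaddress

# CIDR blocks as quoted in the post above; an assumption, not an official list.
LISTED_TWITTER_NETS = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.133.76.0/22", "199.16.156.0/22", "199.59.148.0/22")
]


def is_listed_twitter_ip(addr: str) -> bool:
    """Return True if addr falls inside one of the listed blocks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in LISTED_TWITTER_NETS)


def looks_like_faker(addr: str, user_agent: str) -> bool:
    """Flag a request that claims to be Twitterbot from an unlisted address."""
    claims_twitterbot = "twitterbot" in user_agent.lower()
    return claims_twitterbot and not is_listed_twitter_ip(addr)
```

Something like looks_like_faker(remote_addr, user_agent) could be run over each log line; whether to block the hits it flags or just ignore them is a separate call.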
Pfui
8:35 pm on Nov 18, 2022 (gmt 0)
Speaking of Twitterbot UA fakers, surprise-surprise: ec2-52-212-113-64.eu-west-1.compute.amazonaws.com, ec2-63-35-227-56.eu-west-1.compute.amazonaws.com, ec2-52-211-204-34.eu-west-1.compute.amazonaws.com (etc.)
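A rough sketch of sorting hits by their reverse-DNS name, assuming you already have the hostname from your logs. It only looks at the string: the .twttr.com suffix comes from the hits reported earlier in this thread, the function names are hypothetical, and a suffix match alone proves nothing, since whoever controls an address's reverse DNS can put any name there. A forward lookup to confirm the name resolves back to the same IP would still be needed.

```python
import re

# Matches EC2-style names such as ec2-52-212-113-64.eu-west-1.compute.amazonaws.com
EC2_HOST = re.compile(
    r"^ec2-(\d{1,3})-(\d{1,3})-(\d{1,3})-(\d{1,3})\..*\.compute\.amazonaws\.com$"
)


def classify_rdns(hostname: str) -> str:
    """Bucket a reverse-DNS name by suffix: 'twttr', 'aws', or 'other'."""
    host = hostname.lower().rstrip(".")
    if host.endswith(".twttr.com"):
        return "twttr"
    if host.endswith(".amazonaws.com"):
        return "aws"
    return "other"


def ec2_embedded_ip(hostname: str):
    """Return the dotted IP encoded in an EC2-style hostname, or None."""
    match = EC2_HOST.match(hostname.lower().rstrip("."))
    return ".".join(match.groups()) if match else None
```

A Twitterbot user agent paired with an "aws" result is the pattern listed above.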