66.102.8.219 - - [12/Jul/2021:09:15:55 +0200] "GET [snip] HTTP/1.1" 200 6512 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.90 Mobile Safari/537.36 (compatible; Google-AMPHTML)"
incoming links from Twitter. I see that it supports neither JavaScript nor cookies, and ignores robots.txt
Well, that's rude of it, especially when you consider that the Twitterbot does seem to honor robots.txt. Did you mean that it asks and doesn't honor, or doesn't ask in the first place?
And, tangentially, have you met robots that do use cookies? I haven’t. (I have one set of pages that redirect smartphones if there isn't a cookie saying “Yes, yes, I’ve been here before and know what I’m doing”; the mobile Googlebot always gets redirected.)
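The cookie-gated smartphone redirect described above could look roughly like the sketch below. The cookie name `seen_mobile_notice` and the substring hints used for mobile detection are assumptions for illustration only; the post does not say how the real rule is written.

```python
from typing import Mapping

# Hypothetical cookie name; the post does not say what the real cookie is called.
SEEN_COOKIE = "seen_mobile_notice"

# Rough smartphone hints (an assumption); real detection is usually more involved.
MOBILE_HINTS = ("Mobile", "Android", "iPhone")


def should_redirect(user_agent: str, cookies: Mapping[str, str]) -> bool:
    """Return True when a smartphone-looking client arrives without the cookie."""
    looks_mobile = any(hint in user_agent for hint in MOBILE_HINTS)
    return looks_mobile and SEEN_COOKIE not in cookies
```

A crawler that never sends cookies always takes the redirect branch, which would explain why the mobile Googlebot gets redirected on every request.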
AMPHTML does sound like it means AMP + HTML, doesn’t it.
Mozilla/5.0 (compatible) Feedfetcher-Google; (+http://www.google.com/feedfetcher.html)
which I consider a bad_agent.
Mozilla/5.0 (Linux; Android 4.2.1; en-us; Nexus 5 Build/JOP40D) AppleWebKit/535.19 (KHTML, like Gecko; googleweblight) Chrome/38.0.1025.166 Mobile Safari/535.19
I block these under the designation botnet_agent
Mozilla/5.0 (Linux; Android 7.0; Moto G (4)) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4143.7 Mobile Safari/537.36 Chrome-Lighthouse
I used to block this UA, but currently don't. Worth noting that one of the requests was for /asset-manifest.json -- a file I do not have, never did have, and probably isn't intended for humans anyway.
Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko Firefox/11.0 (via ggpht.com GoogleImageProxy)
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36 Google (+https://developers.google.com/+/web/snippet/)
Blocked, probably for deficient headers.
Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.0.7; Google-SearchByImage) Gecko/2009021910 Firefox/3.0.7
These are all blocked, probably due to the Firefox/3.
Mozilla/5.0 (compatible; GoogleDocs; documents; +http://docs.google.com)
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.75 Safari/537.36 Google Favicon
That includes most but not all the 301s (from HTTP to HTTPS), which is understandable because part of its job involves GSC.
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko; Google Web Preview) Chrome/89.0.4389.112 Safari/537.36
In this case, all I mean is that there's no comprehensive list of everything that uses the range 66.102.0.0/18, or what they use it for.
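For anyone sorting their own logs along these lines, here is a minimal sketch of the two checks mentioned in this thread: whether an address falls inside 66.102.0.0/18, and which designation a UA token maps to. The designations bad_agent and botnet_agent come from the post above; the function names and the lookup table are made up for illustration, and this is not the poster's actual server configuration.

```python
import ipaddress
from typing import Optional

# The range discussed in the thread.
GOOGLE_RANGE = ipaddress.ip_network("66.102.0.0/18")

# UA token -> designation, as described in the post (table layout is an assumption).
LABELS = {
    "Feedfetcher-Google": "bad_agent",
    "googleweblight": "botnet_agent",
}


def in_google_range(ip: str) -> bool:
    """True if the address lies inside 66.102.0.0/18."""
    return ipaddress.ip_address(ip) in GOOGLE_RANGE


def label_for(user_agent: str) -> Optional[str]:
    """Return the first matching designation, or None if no token matches."""
    for token, label in LABELS.items():
        if token in user_agent:
            return label
    return None
```

The address from the log line at the top of the thread, 66.102.8.219, falls inside this range, since a /18 starting at 66.102.0.0 covers 66.102.0.0 through 66.102.63.255.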