2018-08-22:13:36:43
URL: /
IP: 23.21.226.***
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip, deflate
Accept-Language: en-US,*
Connection: Keep-Alive
Host: example.com
User-Agent: Mozilla/5.0 (compatible; DuckDuckBot-Https/1.1; https://duckduckgo.com/duckduckbot)
[edited by: keyplyr at 11:00 pm (utc) on Sep 1, 2018]
http://2.brf.be (wtf?) as referer, and all front-page requests give
In the last couple of days I have seen them from:
Yes, scalable cloud hosting will use available distributed resources.
D’you suppose it’s any use asking them to stop sending a ### referer whose only possible effect is to get them blocked?
It's the way you have your end set up*. If you would allow straight access to your robots.txt you wouldn't see that. And no, I don't think they care. If you want them to index your web properties correctly, give them access without issue.
"albeit possibly misunderstood"
Everybody gets robots.txt, even when I know perfectly well they just want to see if I've listed the names of specialized CMS directories that they otherwise wouldn't know about. It's in a <Files> envelope with Allow from all. (RewriteRules are constrained to filetypes, so no further hole-poking is needed.) That's not the issue. I'm thinking of the inexplicable referers the DDG faviconbot always sends--and who knows what kind of referer will come in with page requests. It had better not be a generic front-page referer, because I block those for all pages that are not, in fact, linked from the front page. The minor irony here is that everyone is allowed to get the favicon--but when the faviconbot was blocked in a page request, it wouldn't ask for the favicon even though that was its only reason for making the request in the first place.
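For anyone following along, the blocking rule described here can be sketched roughly as below. This is Python for illustration only, not the actual server configuration under discussion, and every name in it (`FRONT_PAGE_LINKS`, `ALWAYS_ALLOWED`, `is_blocked`, the example paths) is hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical: paths that really are linked from the front page.
FRONT_PAGE_LINKS = {"/about.html", "/articles/index.html"}

# Files everyone may fetch, whatever referer they send.
ALWAYS_ALLOWED = {"/robots.txt", "/favicon.ico"}


def is_front_page_referer(referer, host):
    """Return True if the referer is the bare front page of `host`."""
    if not referer:
        return False
    parts = urlparse(referer)
    return parts.hostname == host and parts.path in ("", "/")


def is_blocked(path, referer, host="example.com"):
    """Block a front-page referer on pages the front page does not link to."""
    if path in ALWAYS_ALLOWED:
        return False
    if is_front_page_referer(referer, host) and path not in FRONT_PAGE_LINKS:
        return True
    return False
```

Under these assumptions, a request for robots.txt gets through even with an odd referer such as http://2.brf.be, while a deep page arriving with a generic front-page referer does not.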
"I'm thinking of the inexplicable referers the DDG faviconbot always sends"
Well, that's a different UA than we're discussing. Yeah, the DDG favicon bot includes a referrer.
"all robots.txt requests give http://2.brf.be (wtf?) as referer"
Well, normally I'd say that the UA was spoofed and the referrer is log-spam. Easy to use the same AWS ranges & fake the headers to spoof DDG.
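A minimal sketch of the kind of check that would catch a spoofed UA, assuming you keep your own list of address ranges you trust for the crawler. The networks below are RFC 5737 documentation ranges used as placeholders, not real DuckDuckGo addresses, and the function names are hypothetical.

```python
import ipaddress

# Placeholder ranges only (RFC 5737 documentation networks).
# Replace with ranges you have verified yourself.
TRUSTED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def claims_duckduckbot(user_agent):
    """Return True if the User-Agent string claims to be DuckDuckBot."""
    return "duckduckbot" in user_agent.lower()


def looks_spoofed(user_agent, remote_ip, trusted=TRUSTED_NETWORKS):
    """Return True if the UA claims DuckDuckBot but the IP is not trusted."""
    if not claims_duckduckbot(user_agent):
        return False
    addr = ipaddress.ip_address(remote_ip)
    return not any(addr in net for net in trusted)
```

As noted above, a check like this is only as good as the list: if the trusted ranges are as broad as a whole cloud provider's range, a spoofer on the same provider passes it too.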