OK
No Referrer
Please only post the exact log entry, otherwise it is irrelevant for documentation.
IP: distributed-- 18, 34, 35, et cetera, clearly no longer only from 54 as in years past
UA: Mozilla/5.0 (compatible; Cliqzbot/2.0; +http://cliqz.com/company/cliqzbot)
https://cliqz.com/cliqzbot
(Hee, hee, yet another HTTPS migration, among other things.) The page defaults to German, with a slightly peculiar language-switching arrangement where you have to mouse over the German flag in order to get the UK flag ... in a different place. And vice versa to change back. I must have been there before, because I remember the “Was genau ist Cliqzbot?” (“What exactly is Cliqzbot?”)
I've never seen that in the UA string
That's what I meant. It's three consecutive pieces: response, referer, UA. So probably not literal raw logs but some tabular output, probably with the intervening tabs (\t character) disappearing when pasting into the Post form.
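For anyone who wants to keep the columns apart before pasting, here is a minimal sketch of splitting one such row back into its three pieces. It assumes the export really is tab-separated with exactly three columns (response, referer, UA); the function name `split_log_row` and the sample row are made up for illustration and are not anyone's actual log format.

```python
def split_log_row(row):
    """Split one tab-separated row into response, referer and user agent.

    Assumption: the tabular output has exactly three columns, in that order.
    """
    fields = row.rstrip("\n").split("\t")
    if len(fields) != 3:
        raise ValueError("expected 3 tab-separated fields, got %d" % len(fields))
    response, referer, user_agent = fields
    return {"response": response, "referer": referer, "user_agent": user_agent}


# Hypothetical sample row, assembled from the pieces quoted in this thread.
sample = (
    "OK\tNo Referrer\t"
    "Mozilla/5.0 (compatible; Cliqzbot/2.0; +http://cliqz.com/company/cliqzbot)"
)
row = split_log_row(sample)
print(row["response"], "|", row["referer"], "|", row["user_agent"])
```

If the tabs have already been eaten by the Post form, there is nothing left to split on, which is why the pieces run together.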
en-US, en;q=0.8, es-US;q=0.5
(I like “es-US”) but on some days for variety's sake--including but not limited to material on Kalaallisut and related topics--it's de-DE, de;q=0.9, dsb-DE;q=0.3, hsb-DE;q=0.2
(Since when is Hue-Saturation-Brightness a language?)
Sorbian, or Wendisch, is a member of the West Slavic subgroup of Indo-European languages spoken by about 55,000 people in Upper and Lower Lusatia in the German Länder of Saxony and Brandenburg. The Sorbs are descendants of the Wends, the German name for the Slavic tribes who occupied the area between the Elbe and Saale rivers in the west and the Odra (Oder) River in the east during the medieval period.
fr-FR, fr;q=0.9, br-FR;q=0.7, gsw-FR;q=0.5, co-FR;q=0.3
br is presumably Breton
Corsican has no official status in Corsica
In all cases, the robots.txt request matches the language headers of the immediately following page requests.
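To check that kind of match without eyeballing it, a rough sketch along these lines might do. It parses an Accept-Language value into (tag, q) pairs and compares two headers. The function names are hypothetical, the parsing is deliberately simple (a missing q counts as 1.0, an unreadable q as 0), and how you pull the header out of your own logs is left out entirely.

```python
def parse_accept_language(header):
    """Return (tag, q) pairs, highest q first; ties keep their header order."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0]
        if not tag:
            continue
        q = 1.0  # a tag without an explicit q-value defaults to 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: treat an unparsable q as "not wanted"
        prefs.append((tag, q))
    # sorted() is stable, so equal q-values stay in their original order
    return sorted(prefs, key=lambda pair: -pair[1])


def same_language_profile(robots_header, page_header):
    """True if two requests carry the same language preferences."""
    return parse_accept_language(robots_header) == parse_accept_language(page_header)


german = "de-DE, de;q=0.9, dsb-DE;q=0.3, hsb-DE;q=0.2"
french = "fr-FR, fr;q=0.9, br-FR;q=0.7, gsw-FR;q=0.5, co-FR;q=0.3"
print(parse_accept_language(german))
print(same_language_profile(german, german), same_language_profile(german, french))
```

Comparing the parsed pairs rather than the raw strings means a difference in spacing alone does not count as a mismatch.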
In February 2017, Cliqz acquired the world’s leading anti-tracking tool Ghostery.
could it be a part or pre-fetch from Ghostery
Hm, that's an idea. But then you'd expect language headers varying all over the map, depending on their most recent human request. It definitely isn't a “pre-fetch” in the sense of something followed immediately afterward by a human request.
Cliqz offers products for searching directly in the browser and runs a self-developed search technology. Cliqzbot collects URLs and website content in the Cliqz index.
source: [cliqz.com...]