Brave Search User Agent

         

brotherhood of LAN

6:03 pm on Mar 23, 2022 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



This forum has a long record of identifying intent wrt UAs, so here's an interesting one hopefully.

Brave Search claims to crawl 40M URLs a day (or roughly 14.6 billion pages in yearly money), so it should be plentiful in our logs.

[reddit.com...]

It seems there's no unique user agent. From what I can gather, they may be using a Chrome user agent and crawling from the 5.116.0.0/16 range.

Can anyone confirm a Brave-like UA, or perhaps something from the 5.116.0.0/16 range fetching pages that are indexed in search.brave.com?
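
For anyone who wants to check their own logs, here is a minimal sketch that pulls out the user agents of requests from that range. It assumes the Apache/nginx "combined" log format; the function name `hits_from_range` and the sample lines are made up for illustration, and the 5.116.0.0/16 range itself is only the anecdotal guess above, not a confirmed Brave range.

```python
import ipaddress
import re

# Assumption: logs are in the Apache/nginx "combined" format.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

# Anecdotal range from this thread, not a confirmed Brave range.
SUSPECT_RANGE = ipaddress.ip_network("5.116.0.0/16")


def hits_from_range(lines, network=SUSPECT_RANGE):
    """Return (ip, user_agent) pairs for requests coming from `network`."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # not a combined-format line
        try:
            ip = ipaddress.ip_address(m.group("ip"))
        except ValueError:
            continue  # first field was a hostname or garbage
        if ip in network:
            hits.append((str(ip), m.group("ua")))
    return hits


# Invented sample lines, for illustration only.
sample = [
    '5.116.12.34 - - [23/Mar/2022:18:03:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64) Chrome/99.0 Safari/537.36"',
    '203.0.113.7 - - [23/Mar/2022:18:04:00 +0000] "GET /wp-admin HTTP/1.1" 404 0 "-" '
    '"MSIE 6.0"',
]

for ip, ua in hits_from_range(sample):
    print(ip, ua)
```

Cross-checking the requested URLs against what shows up in search.brave.com would still have to be done by hand.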

lucy24

7:48 pm on Mar 23, 2022 (gmt 0)




:: detour to raw logs ::

All I can say is that

(a) since the beginning of 2021, I have not met anything containing the string [Bb]rave *

(b) I tend to doubt 5.116, since all I see from there is a scant handful of obvious robots (the kind who ask for wp-admin, or who claim to be MSIE 6).

:: further research ::

Aha! [sistrix.com]
"This was made possible with technology developed in Germany. As a result of the turmoil of the Corona crisis. Brave took over Cliqz, a shareholding of Hubert Burda Media, and Mozilla. The new Brave Search engine is based on this."

Now, Cliqz I do know. I must have authorized them at some time, because htaccess says !bad_range (meaning that they come from assorted server farms that are generally home to bad actors). So the question is whether there has been any change in Cliqzbot activity in recent months.

:: further delving into logs, only to find that Cliqzbot has not shown its face since July of 2020 ::

Well, I guess that's a red herring. They may be using Cliqz code, but not its UA. (And why do I know the name sistrix? I don't find it in any of the usual places.)


* Formally (?<!iolanthe-)\b[Bb]rave, as it turns out I have a lone file with the element "brave" in its name.
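
A minimal sketch of how the pattern in the footnote behaves, written with Python's `re` module (the thread does not say which tool was actually used on the logs). The sample strings, including the `BraveBot` token, are invented for illustration.

```python
import re

# Match "Brave"/"brave" at a word boundary, unless it is directly
# preceded by "iolanthe-" (the lone filename that would otherwise match).
BRAVE = re.compile(r"(?<!iolanthe-)\b[Bb]rave")

# Invented sample strings; "BraveBot" is a hypothetical UA token.
samples = [
    "Mozilla/5.0 (compatible; BraveBot/1.0)",
    "GET /music/iolanthe-brave.html HTTP/1.1",
    "GET /brave-new-page.html HTTP/1.1",
    "Mozilla/5.0 (X11; Linux x86_64) Chrome/99.0 Safari/537.36",
]

for s in samples:
    print(bool(BRAVE.search(s)), s)
```

The negative lookbehind works here because "iolanthe-" has a fixed length, which Python's `re` requires for lookbehind assertions.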

brotherhood of LAN

8:10 pm on Mar 23, 2022 (gmt 0)




Thanks lucy24, yes I think that's a definite avenue of enquiry.

I think it's boiling down to two things:
1) Does the Brave search engine have a uniquely identifiable UA? Their staff say they respect robots.txt but have never published a UA, so it seems like a bit of an enigma.
2) Otherwise, they're crawling from somewhere, so where from? Anecdotally, from the 5.116.0.0/16 range.

That's all I know just now but interested to know more.
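
To illustrate why the missing UA matters for robots.txt, here is a rough sketch using Python's standard `urllib.robotparser`. The token `ExampleSearchBot` and the rules are made up; how Brave's crawler actually reads robots.txt is not known from this thread. The point is only that a crawler presenting a generic Chrome UA can be addressed through the `*` group and nothing more specific.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt: one group for a named (hypothetical) bot, one catch-all.
robots_txt = """\
User-agent: ExampleSearchBot
Disallow: /

User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

chrome_ua = "Mozilla/5.0 (X11; Linux x86_64) Chrome/99.0 Safari/537.36"

# A bot with a published token can be shut out entirely by name.
named_bot_allowed = rp.can_fetch("ExampleSearchBot", "/page.html")

# A crawler with a generic browser UA only ever hits the "*" rules.
generic_public = rp.can_fetch(chrome_ua, "/page.html")
generic_private = rp.can_fetch(chrome_ua, "/private/x.html")

print(named_bot_allowed, generic_public, generic_private)
```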

dstiles

9:03 am on Mar 24, 2022 (gmt 0)




I haven't seen anything Brave-ish either, and I can confirm Lucy's (non-)sightings of Cliqzbot. I seem to recall something about them following URLs visited by users of their browser and stuffing those into their SE.

sudo

1:08 am on Mar 26, 2022 (gmt 0)



I suspect that they're somehow piggybacking off the results people get when using Google, Bing, or another search engine instead of theirs. Like Lucy I've yet to come across anything resembling a Brave UA...which leads me to believe that the obvious is true...they don't have one...yet.