
Forum Moderators: goodroi

Google ReCaptcha v3: invisible data slurping shark

     
6:46 pm on Jun 27, 2019 (gmt 0)

Senior Member from CA 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Nov 25, 2003
posts:1336
votes: 429


Once again Google wants to slurp even more of your visitor data, yet again under the guise of a free valuable service...
* Google’s new reCAPTCHA has a dark side [fastcompany.com], 27-June-2019.


The latest version of the bot detector reCaptcha is invisible to users
...
We’ve all tried to log into a website or submit a form only to be stuck clicking boxes of traffic lights or storefronts or bridges in a desperate attempt to finally convince the computer that we’re not actually a bot.
...
But last fall, Google launched a new version of the tool, with the goal of eliminating that annoying user experience entirely.
...
“You have to understand what behavior on the site should be and mimic that well enough to fool us,” he says. “That’s a really hard problem versus the general problem of, ‘Pretend like I’m a human.'” Website administrators then get access to their visitors’ risk scores and can decide how to handle them
...
According to two security researchers who’ve studied reCaptcha, one of the ways that Google determines whether you’re a malicious user or not is whether you already have a Google cookie installed on your browser.
...
To make this risk-score system work accurately, website administrators are supposed to embed reCaptcha v3 code on all of the pages of their website, not just on forms or log-in pages. Then, reCaptcha learns over time how their website’s users typically act, helping the machine learning algorithm underlying it to generate more accurate risk scores. Because reCaptcha v3 is likely to be on every page of a website, if you’re signed into your Google account there’s a chance Google is getting data about every single webpage you go to that is embedded with reCaptcha v3, and there may be no visual indication on the site that it’s happening, beyond a small reCaptcha logo hidden in the corner.

This is a type of bot behaviour detection that a few of us have been working on for several years, and it can work very well. In my case I'm using Redis streams to watch, identify, and act on data in real time while simultaneously saving the data in Postgres for later and ongoing machine learning analysis, the results of which are fed back into the real-time engine.
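As a rough illustration of the watch/identify/act loop described above, here is a minimal Python sketch. It is not the poster's implementation: an in-memory deque stands in for the Redis stream, a plain list stands in for the Postgres table, and the class name, window length, and request budget are all assumptions chosen only for the example.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 10.0  # assumed look-back window, not from the post
MAX_REQUESTS = 20      # assumed per-window request budget, not from the post


class BehaviourMonitor:
    """Hypothetical sliding-window monitor for per-visitor request rates."""

    def __init__(self, window=WINDOW_SECONDS, max_requests=MAX_REQUESTS):
        self.window = window
        self.max_requests = max_requests
        self.recent = defaultdict(deque)  # visitor id -> recent timestamps
        self.archive = []                 # stand-in for long-term storage

    def record(self, visitor_id, timestamp, path):
        """Store the event, then return 'allow' or 'block' for this visitor."""
        # Keep every event for later offline analysis.
        self.archive.append((visitor_id, timestamp, path))

        hits = self.recent[visitor_id]
        hits.append(timestamp)
        # Drop timestamps that have fallen out of the look-back window.
        while hits and timestamp - hits[0] > self.window:
            hits.popleft()

        # Act in real time: too many requests in the window looks automated.
        return "block" if len(hits) > self.max_requests else "allow"
```

A real system would score far more than request rate (navigation order, timing patterns, headers), and the offline analysis would periodically adjust values such as `max_requests`; the sketch only shows the shape of the real-time half.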

I'm quite certain that Google is far better at this than I am; however, I'm able to get the benefit without giving up my visitor data. There have been conversations about this new Google 'feature' for some months now, and it has been brought up that reCaptcha v3 shifts legal responsibility from Google to the site: with reCaptcha v1 and v2, whether one 'failed' or was passed through was Google's decision; with v3 Google just provides a score and the site decides. Fastco noted this as well:

Google did not address any potential privacy problems and insisted that reCaptcha v3 is a matter of corporate responsibility.
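To make the "Google just provides a score and the site decides" point concrete, here is a hedged Python sketch of the site-side decision. It assumes the site has already sent the visitor's token to Google's verification endpoint and parsed the JSON reply into a dict with `success`, `score` (0.0 to 1.0) and `action` fields. The function name, the two thresholds, and the three outcomes are illustrative assumptions, not anything Google prescribes.

```python
def decide(verify_response, expected_action, allow_at=0.7, challenge_at=0.3):
    """Map a parsed reCAPTCHA v3 verification reply to a site-side outcome.

    The thresholds are hypothetical; each site has to pick its own, which is
    exactly where the responsibility moves from Google to the site.
    """
    # A failed verification (bad or expired token) is never trusted.
    if not verify_response.get("success"):
        return "reject"
    # The action name should match what this page asked to be scored.
    if verify_response.get("action") != expected_action:
        return "reject"

    score = float(verify_response.get("score", 0.0))
    if score >= allow_at:
        return "allow"
    if score >= challenge_at:
        return "challenge"  # e.g. ask for email confirmation or moderation
    return "reject"
```

For example, `decide({"success": True, "score": 0.9, "action": "login"}, "login")` returns `"allow"`, while the same reply with a score of 0.1 returns `"reject"`.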


And as usual Google can't get their story straight on what they are doing, will do, or may do with the data collected... Ah, poor Google has so many heads it's hard to keep the PR story straight... The telephone game, aka Chinese whispers, defeats the tech behemoth yet again!

12:43 am on June 28, 2019 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:July 29, 2007
posts:2011
votes: 211


NOBODY should be surprised by this. Google is the company that, when running the Google Maps cars, recorded everyone's wifi traffic, including email and such, whenever it found a wifi network that was not sufficiently protected. Google does NOT currently do anything that doesn't give them access to mass private data; it's what they crave.

When you log out of Google, say out of your email or AdSense account, did you notice that Google passes you through several URLs before you're logged out? Then, naturally, you might type in another URL to go to your next site. Have you ever noticed that in Firefox this triggers an immediate and repetitive reloading of the login box, over and over, until the new page appears? Google is recording the site you go to after leaving their page.

Anyway, as for Captcha, it's not written to be fool-proof, it's written to gather data. Right now if you, for example, write a forum post, hit publish, and then get the Google captcha, you can completely bypass it. In Firefox, if you have Google.com blocked, the captcha still fires, but you can immediately turn off your blocker, reload the page and hit publish again; the captcha doesn't fire and the post is made.

There are groups out there discussing dozens of ways their bot scripts can bypass the captcha, and for years Google hasn't fixed it. This company is quickly losing favor yet still won't knock it off.

[edited by: JS_Harris at 12:56 am (utc) on Jun 28, 2019]

12:52 am on June 28, 2019 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:July 29, 2007
posts:2011
votes: 211


Also, this risk score business being assigned to known humans and not just bots is disturbing. A little like China and their social credit system, imo...