I have seen similar issues with Google visits on sites that should NEVER see a visit, such as sites I use only for testing web-site setups and my own software. These are links that should never be seen by anyone other than me.
That said, I don't really believe stuff like the ipullrank article, that "Chrome is GoogleBot".
But stuff like this is one reason that I decided not to use Chrome, after my initial testing of it. There simply is a limit to how much information I want Google to have.
Here is my theory:
Rather than Google directly "spying" on people, I think we have here another of Google's VERY liberal interpretations of what privacy is, combined with a VERY liberal use of information in Google's various databases.
When you use Chrome, the typical setup is to have its "Under the Hood" features enabled, such as "Use a web-service to resolve navigation", "Use a prediction service to complete searches", and "Enable Phishing and malware protection".
All of these features make very liberal use of various services at Google.
Try loading up a sniffer, such as Network Monitor, while using Chrome. Ignore the initial startup traffic at first (update checks and other junk connections), but then start tracking the connections and data. Chrome has a CONSTANT barrage of connections to various Google IPs, even when you do something simple, like typing in a URL and opening a web page.
Some connections are obvious, such as content checks, Google ad calls if the site has Adsense, Analytics calls, and such. A large portion are encrypted connections (explained, obviously, as a way to "protect your privacy"), which also makes it hard to know exactly what they are doing.
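If you want to see how lopsided the traffic is, one rough way is to export the capture to CSV and count connections per destination. This is only a sketch: the column name "Destination" and the sample rows are my assumptions (the addresses are documentation placeholders, not real Google IPs), so adjust them to whatever your sniffer actually exports.

```python
import csv
import io
from collections import Counter


def tally_destinations(csv_text, column="Destination"):
    """Count how many captured rows went to each destination address.

    The column name is an assumption; use the header your sniffer exports.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader if row.get(column))


# Hypothetical capture export, made up for illustration only.
SAMPLE = """\
Source,Destination,Protocol
192.0.2.10,198.51.100.7,TLS
192.0.2.10,198.51.100.7,TLS
192.0.2.10,203.0.113.5,HTTP
"""

counts = tally_destinations(SAMPLE)
for host, n in counts.most_common():
    print(host, n)
```

Sorting by count makes it easy to pick out the handful of addresses that receive most of the connections, which you can then look up to see who owns them.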
But rather than Chrome spying on the individual user in an obvious, planned way, I think Google does what they have explained many times on the Google support groups when people ask how stuff is found: they get links from many sources, and robots.txt only prevents links from being followed by GoogleBot, not from being found if they are available in other sources.
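The robots.txt point is easy to check with Python's standard urllib.robotparser. A small sketch, where the robots.txt content and the example.test URLs are made up for illustration: the rule only answers "may this agent fetch this URL?" and says nothing about whether the URL is already known from somewhere else.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a private test site (example.test is a placeholder).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt, agent, url):
    """Return True if the given robots.txt lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


blocked = may_fetch(ROBOTS_TXT, "Googlebot", "http://example.test/private/page.html")
allowed = may_fetch(ROBOTS_TXT, "Googlebot", "http://example.test/index.html")
print(blocked, allowed)
```

Note that nothing here hides the /private/ URL itself. A well-behaved crawler simply declines to fetch it, while the address can still turn up in any other data source.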
When Chrome does all its various checks on malware, completions, and much else, all these actions are obviously tracked and logged in Google's databases. If a Google ad is there, the URL is tracked. If you have Google Analytics code loading on the page, that URL is loaded into the Google databases.
You load a URI, and Chrome + Analytics + Adsense send information back to Mamma Google, which does all sorts of checks and logging.
It is my firm belief that Google sees all the links they "catch" through these sources as a way to include the "whole internet" in their databases, and regards this as merely using all available sources.
They extract the links from the databases, they know each one loaded up for you (or adsense/analytics/malware checker/predictor) as a valid link, and they can then pass it into GoogleBot's queues for further follow-up.
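To make the theory concrete, here is roughly what I imagine such a merge step would look like. This is purely a hypothetical sketch of my guess, not a description of anything Google actually runs; the source names, the URLs, and both functions are invented for illustration.

```python
from collections import deque
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to scheme://host/path so trivial variants count as one link."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc.lower()}{parts.path or '/'}"


def enqueue_discovered(sources, already_known):
    """Merge URLs seen by several (hypothetical) logging sources into one crawl queue.

    Only URLs not already known are queued, each tagged with the source that
    reported it first.
    """
    queue = deque()
    seen = set(already_known)
    for source_name, urls in sources.items():
        for url in urls:
            key = normalize(url)
            if key not in seen:
                seen.add(key)
                queue.append((key, source_name))
    return queue


# Made-up log contents; example.test is a placeholder domain.
sources = {
    "malware_check": ["http://Example.test/private/setup.html"],
    "analytics": ["http://example.test/private/setup.html", "http://example.test/new"],
}
queue = enqueue_discovered(sources, already_known={"http://example.test/"})
print(list(queue))
```

The point of the sketch is that no single source has to "spy" on anyone: each one just logs a URL for its own purpose, and the merge is what turns an unlinked test page into something a crawler visits.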
If Gmail does these kinds of content checks, the link is suddenly logged in the Google databases as well. (This could happen both when sending an email with a link and on checks at the receiver's side, if the receiver uses Gmail.)
Not overt spying, provided that they do not log your identity with the links, but still a somewhat grey way of getting information in my book.
The linked articles that question why else Google would invest in getting into the browser market are likely incorrect in claiming that "Chrome is GoogleBot", but I do believe that Google got into that market because our browsers would otherwise be a HUGE untapped source of information for them.
Also, have y'all disabled your Google Web history yet in Google accounts? That is another potential source for Google database merges to find a lot of links that would otherwise not be found anywhere else.