| 2:39 pm on Oct 8, 2010 (gmt 0)|
|The complaint alleges Google shares with third parties users’ search queries, including those that contain personal information |
Well, they do.
I can do a search on truck bed liners and the next day log in to my gmail account and see an abnormal amount of pick-up truck related spam. There was none before. Now it takes up a quarter of my inbox.
I don't believe much in coincidence. It's cause and effect. Proving it will be another story.
| 3:05 pm on Oct 8, 2010 (gmt 0)|
He is complaining about the passing on of search query data to websites and SEOs.
I am sure most people here would not want him to succeed in forcing Google to hide search queries in referrers.
I am not sure how that contains personal data either.
@wyweb. Sounds odd, and I have never noticed anything like that. It could be your ISP, or one of the sites you visited from Google, if any of them have your email address.
| 3:19 pm on Oct 8, 2010 (gmt 0)|
This is a ludicrous complaint and will go nowhere. Referral data is hardly a Google invention.
| 3:39 pm on Oct 8, 2010 (gmt 0)|
It's not a ludicrous complaint; however, it's unlikely to succeed.
I'm sure there are people that would prefer not to send this information and most are probably unaware that it is sent. Therefore, I think it would be reasonable to offer the option not to send this information. This could easily be achieved by processing search requests using the POST method instead of GET.
I know that webmasters won't like this idea, but that's life.
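To illustrate the GET/POST point, here is a minimal Python sketch. The host `search.example` and the parameter name `q` are assumptions made for illustration, not any real search engine's interface. With GET the query is part of the URL, which is what a browser would later pass on as the referrer; with POST the query travels in the request body and the URL stays bare.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical search endpoint and parameter name, for illustration only.
SEARCH_URL = "http://search.example/search"


def build_get(query: str) -> Request:
    """GET: the query is encoded into the URL itself."""
    return Request(SEARCH_URL + "?" + urlencode({"q": query}), method="GET")


def build_post(query: str) -> Request:
    """POST: the query is sent in the request body, not the URL."""
    body = urlencode({"q": query}).encode("ascii")
    return Request(SEARCH_URL, data=body, method="POST")


get_req = build_get("truck bed liners")
post_req = build_post("truck bed liners")

# The results-page URL is what a browser would pass on as the referrer.
print(get_req.full_url)   # query visible in the URL
print(post_req.full_url)  # no query in the URL
```

Nothing is sent over the network here; the sketch only builds the two requests so the URLs can be compared.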
| 3:52 pm on Oct 8, 2010 (gmt 0)|
It's pretty useless information right now anyway. It doesn't matter what's being searched for when targeting keywords; all that matters is using the keywords Google thinks are related, or else you're not making the most of your SEO time. Keywords that seem related to you may not be most important to Google. That's if you believe in the notion that sites that cover keywords in full instead of in passing receive a better "authority" bump.
| 4:26 pm on Oct 8, 2010 (gmt 0)|
If he's talking about the HTTP_REFERER header, that's nothing to do with Google, that's just how HTTP works and the data will be passed by the browser, not by Google.
If he's talking about linking queries to a Google account and using them for targeted advertising, like wyweb suggests, then that's a different matter.
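As a rough sketch of what the receiving site can do with that header, the Python below pulls the search terms out of a referrer URL. The example URL, the host `search.example`, and the parameter name `q` are assumptions for illustration; real search engines may name their parameters differently.

```python
from urllib.parse import urlsplit, parse_qs


def search_terms_from_referrer(referrer: str, param: str = "q") -> list:
    """Return the search terms found in a referrer URL, or [] if none."""
    query = urlsplit(referrer).query
    values = parse_qs(query).get(param, [])
    # parse_qs already decodes '+' and %XX escapes; split on whitespace.
    return values[0].split() if values else []


# Hypothetical value of the Referer request header (HTTP_REFERER in CGI).
example = "http://search.example/search?q=truck+bed+liners&hl=en"
print(search_terms_from_referrer(example))
```

The site never asks the search engine for anything; it only reads a header that the visitor's browser attached to the request.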
| 4:51 pm on Oct 8, 2010 (gmt 0)|
It is a ludicrous complaint - if you consider the referrer header to be an invasion of privacy, then, as that header is sent by the browser and not the website, the complaint should be aimed at Microsoft, Mozilla et al. (As I understand it, the complaint is talking about Google Search, not Google Chrome.)
Using POST would stop the ability of users to bookmark searches, so there is a significant downside to its adoption by Google or any other search engine. Should Google have to break its site to cover up for a supposed privacy issue caused by a third party (the browser)?
| 5:19 pm on Oct 8, 2010 (gmt 0)|
Using POST would stop searches being bookmarked, but only if the user ticked the box that activates the option.
Since it is technically easy to implement, Google would be unwise to blame browsers - if I were the judge and they tried that I would consider them to be in contempt of court. Also it's not exactly difficult to explain this stuff to a judge - rocket science it ain't.
| 5:58 pm on Oct 8, 2010 (gmt 0)|
The referrer header IS an invasion of privacy. I love it as much as anyone, but it's the most ridiculous concept ever. When I go car shopping and I visit my local guy, he doesn't know that I just came from his competitor. On the web, sharing of that data is automatic.
Why should I know what page you were looking at before you came to my site? It's none of my business.
It's not even immediately clear why technical people would've designed this into the web in the first place.
| 6:37 pm on Oct 8, 2010 (gmt 0)|
> It's not [...] clear why technical people would've designed this into the web in the first place.
Perhaps so that incorrect links can be traced to their source and corrected.
| 7:45 pm on Oct 8, 2010 (gmt 0)|
It's pretty much been established since the dawn of humanity: most people in power abuse their power. I don't trust Google.
| 11:28 pm on Oct 8, 2010 (gmt 0)|
With a complaint this hare-brained it's obvious the author knows nothing about the protocol whatsoever. It really sickens me to see people who don't know where the blame actually lies wrongfully jump on the wronged party here, just because it's popular to Google-bash. I'm no fanboy of any company but at least have the common decency to put the blame where it actually exists this time: in the browser, in YOUR HANDS to control.
|The referrer header IS an invasion of privacy |
How can something your browser broadcasts be an invasion?
An invasion is when someone is encroaching upon your rights which is hardly the case when your browser willingly broadcasts that information.
Simply stop broadcasting it.
It's your browser, you can either block it with a firewall, an add-on, or an option setting.
Besides, this has been going on since the beginning of the internet, nothing new, nothing about Google, or Yahoo, or Bing, or any SE for that matter, it's all about the BROWSER.
| 3:49 am on Oct 9, 2010 (gmt 0)|
|It's not even immediately clear why technical people would've designed this into the web in the first place. |
There was a (short) time the web was only about linking knowledge together. The referrer makes it possible to discover content related to yours.
| 12:43 pm on Oct 9, 2010 (gmt 0)|
|Simply stop broadcasting it. |
It's not anywhere near that simple, and you know it. Nobody knows how to do it, most people aren't even aware that it's being done.
| 4:02 pm on Oct 9, 2010 (gmt 0)|
|How can something your browser broadcasts be an invasion? |
Let us suppose that the first search engines (including Google) used POST rather than GET for some arbitrary reason. Let us further suppose that Google decided to switch to GET. Without a doubt, everyone here would be up in arms claiming this was just another sign of Google's contempt for privacy - and maybe you would agree with them.
Since it's easy, it is entirely reasonable that search engines offer the option to hide referrer data. If it was difficult, then the arguments would be different, but that's not the case - it's trivially simple.
| 1:13 pm on Oct 10, 2010 (gmt 0)|
|Mr. Soghoian’s complaint centers on the way the Internet handles links that users click on to surf. When a link is clicked, the address where the user came from is transmitted to the linked site via something called a “referrer header.” In the case of search queries, this address includes the entire text of the search, which may contain users’ personal information if, say, they search for their own name |
That is a downright argument. All the site linked to knows is that someone searched for a name, not that someone searched for their own name.
Also, if search engines should block, it follows that all websites should.
@wheel, there has never been any real demand for referrer blocking even among the people who know about it. The most popular Firefox referrer blocker extension gets 4,000 downloads a week, whereas no-script (also a somewhat geeky blocker) gets 263,000.
| 8:11 pm on Oct 10, 2010 (gmt 0)|
|Since it's easy, it is entirely reasonable that search engines offer the option to hide referrer data. |
That still doesn't give you privacy on the rest of the internet. Trying to pass the blame to individual websites when the entire internet does the same thing is silly. The genie is out of the bottle so the only way to universally fix the problem is to put the simple privacy option back in the hands of the users, and not waste time tilting at search engines or Google bashing just because it's popular.
Directories pass a lot of information as well because of the explicit URLs, regardless of using GET or POST, such as this link from DMOZ:
Directories also have search so clicking any link from most directories after a search will reveal your private search terms, even on DMOZ, BOTW, BUSINESS.COM, thousands more:
So on and so forth...
Maybe some sites use POST, my directory uses POST, but it still doesn't matter if you clicked through from a page with an explicit URL describing the content as the above example provided.
Even popular blog software like WORDPRESS shows the query in the URL, and there are hundreds of thousands of WORDPRESS blogs out there that will expose your search terms.
Why is it so hard to get across that the query string privacy issue is so large that the simple solution exists SOLELY in the browser?
Railing against all the sites that are sources of this so-called "privacy issue" is a silly waste of time. Sure you'll get Google or Bing to cave to pressure while thousands, maybe tens of thousands, maybe even millions of sites will continue to gleefully violate your privacy. Of course that won't matter because big bad Google will have solved their problem and now, without a real solution, everyone will happily bury their heads in the sand with the privacy issue (NOT!) solved.
The REAL QUESTION that people should be asking, and demanding an easy-to-access solution for, is why don't most BROWSERS give an easy way to stop them passing the referrer?
When all the browsers have a simple checkbox to disable HTTP_REFERER, only then do you have a real privacy solution to this particular problem.
Simplest solution, universal privacy, done.
|there has never been any real demand for referrer blocking even among the people who know about it. |
That's because they haven't seen what can be done with it yet and have no clue how much data those 3rd party universal tracking cookies from ad agencies can learn about an individual.
You can learn a lot from ANY referrer, not just ones with search terms, as I pointed out above.
| 8:04 am on Oct 11, 2010 (gmt 0)|
There is a vast difference between transmitting the page that the user arrived from (the original intention) and transmitting a list of keywords that the user entered.
Do you read every URL of every page you view - I don't think so.
Do you manually type in every search you use - I do think so.
The referrer header was never intended to transmit information entered by the user - please admit that much.
Incidentally, it would be technically easy to adjust browsers so that they can bookmark pages created using POST, so even that argument is not very strong. Also, I am not a Google-basher - this should be apparent from the extremely moderate language and statements that I have made.
| 9:31 am on Oct 11, 2010 (gmt 0)|
|it would be technically easy to adjust browsers so that they can bookmark pages created using POST |
in order for the browser to do this it would have to bookmark the request rather than the response.
| 10:28 pm on Oct 11, 2010 (gmt 0)|
I'm not sure what you mean. If I was asked to solve the problem of bookmarking pages created using POST, I would start by adjusting the bookmark editor and add a field for the POST data. After that it would simply be a case of ensuring it was filled in correctly and used correctly.
If I had written a browser myself, I would say this would amount to a morning's work but would expect to fit in elevenses.
In the afternoon, I might think about limiting the feature to certain sites i.e. a list of search engines, but make the list user-editable.
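The bookmark idea described above could be sketched roughly as follows in Python. The `Bookmark` class and its `post_data` field are hypothetical and do not reflect how any real browser stores bookmarks; the sketch also assumes the browser captured the form fields at the moment the form was submitted.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request


@dataclass
class Bookmark:
    """A bookmark that can optionally replay a form submission."""
    title: str
    url: str
    post_data: Optional[dict] = None  # hypothetical extra field

    def to_request(self) -> Request:
        """Build the request a browser would send when opening the bookmark."""
        if self.post_data is None:
            return Request(self.url, method="GET")
        body = urlencode(self.post_data).encode("ascii")
        return Request(self.url, data=body, method="POST")


# Hypothetical saved search; host and field name are illustrative only.
saved = Bookmark("liners search", "http://search.example/search",
                 post_data={"q": "truck bed liners"})
req = saved.to_request()
print(req.get_method(), req.full_url, req.data)
```

One design caveat: replaying a POST is only safe for requests without side effects, such as a search, which is one reason to limit the feature to a user-editable list of sites.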
| 12:20 am on Oct 12, 2010 (gmt 0)|
|There is a vast difference between transmitting the page that the user arrived from (the original intention) and transmitting a list of keywords that the user entered. |
Not when the URL is SEO friendly like my example as it divulges lots of clues about what the user was doing without any search keywords. By collecting a bunch of URLs you can easily build a profile about the user. Therefore any transmission of the referrer, with or without keywords, is a violation of privacy if you really want privacy, that's all there is to it.
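A minimal Python sketch of that kind of profile building, assuming descriptive URL paths. The referrer URLs below are made up for illustration and the word splitting is deliberately crude; this is not a description of what any actual tracking company runs.

```python
import re
from collections import Counter
from urllib.parse import urlsplit


def url_clues(url: str) -> list:
    """Split a URL path into lowercase words, dropping a file extension."""
    path = urlsplit(url).path
    path = re.sub(r"\.[A-Za-z0-9]+$", "", path)  # strip e.g. ".html"
    return [w.lower() for w in re.split(r"[^A-Za-z]+", path) if w]


# Hypothetical referrers collected for one visitor; no search terms involved.
referrers = [
    "http://directory.example/Health/Conditions/Back-Pain/",
    "http://blog.example/2010/10/back-pain-exercises.html",
]
profile = Counter(w for r in referrers for w in url_clues(r))
print(profile.most_common(2))
```

Words that recur across several referrers rise to the top of the count, which is how a handful of URLs can hint at an interest without a single search keyword being passed.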
| 12:38 am on Oct 12, 2010 (gmt 0)|
|If I was asked to solve the problem of bookmarking pages created using POST, I would start by adjusting the bookmark editor and add a field for the POST data. |
the POST data does not exist on the page created by the post, it exists in the HTTP Request sent by the page containing the form.
therefore the only two places that have the POST data are the requesting/referring page at the moment the form is submitted and the server side script that is referred to by the action parameter of the form.
| 9:47 am on Oct 12, 2010 (gmt 0)|
|the POST data does not exist on the page created by the post |
This is getting silly - try hitting the refresh button whilst viewing a page resulting from POST. Depending what browser you are using you will probably be asked whether you wish to resend that data. In other words, browsers typically remember the POST data that was used to create the current page, therefore it is a trivial matter to save that data as part of a bookmark.
However, even if current browsers did not cache the POST data for the purpose of page refreshes, it would be trivial to do so and thereby enable bookmarking.
A piece of advice - when someone tells you that solving a problem is trivial, instead of assuming they are stupid or ignorant, try to figure out how you would solve that problem - you might discover that it really is trivial after all.
Taking your example, it is entirely possible that a user could arrive at your page after searching for a small subset of the keywords stuffed into the URL, for example, Personal Family Relationships. Also, there could be several other keywords in the original search that only appear in the text not the url. Therefore, it does not follow that the breach of privacy (if there is such a breach) is as great as that of a search engine which supplies the absolute, precise list of keywords used with none missing and no extras.
You're over-reaching with your argument here!
There is actually a serious problem here with respect to UK law. Given that the IP address is also available and this can be used to identify the user (absolute certainty is not required under UK law) then under certain circumstances, the search information could be considered private which then takes us into the realms of the Data Protection Act. For instance, it is easy to imagine that someone with cancer might search for certain specific keywords and some of those might be pretty obscure, the sort only used by a doctor to a patient (i.e. there would be a high probability that the user actually had cancer).
Now, since there is no obvious way to contact someone using only their IP address, such information might be of low value, but we do know that advertisers are always looking for new ways to promote goods and it is entirely possible that a list of IP addresses of people researching cancer would have some value even if the list had a short life because IP addresses change.
So, following this argument, it is possible that an agency other than Google could stuff subtle adverts onto pages promoting dubious drugs to vulnerable people. We all know that what I have described is technically possible even if it hasn't actually happened yet.
| 2:49 pm on Oct 12, 2010 (gmt 0)|
|You're over-reaching with your argument here! |
Call it whatever you want but it's currently being used in practice, by real tracking companies, not just conjecture. Not only that, if a non-SEO page name comes along like "newpage1.html" their bot grabs the page and looks for clues in the title, meta, h1 or content, maybe even hits SEO page names as well. The point is nobody could observe where you've been in any shape or form if it's not being promoted by the browser.
Does your car go down the street with a big sign "JUST BEEN TO A STRIP CLUB!"?
Your browser does depending on how you left that site!
Now tie this to 3rd party tracking cookies and IP tracking, and I can get a really good profile of someone, and suddenly specific ads are in your face... assuming my motives are tracking for advertising.
This kind of tracking is already happening, with or without keywords, and obviously without your knowledge!
Whether you deem it a threat is up to you.
I'd be happy with an OFF SWITCH to make it stop.
| 3:25 pm on Oct 12, 2010 (gmt 0)|
But, no matter how you cut it, you are comparing facts with guesswork, and no matter how clever the code is that is doing the guessing, it won't be as accurate as having the information handed over on a silver platter.
You are right, there is a potential for breach of privacy by use of referrer data from regular websites, however, simply because this might exist (or does exist) it does not let search engines off the hook. No matter how you look at it, by using POST, they could improve privacy of users. Personally, I think the case is marginal, but it should not be dismissed out of hand.
| 4:31 pm on Oct 12, 2010 (gmt 0)|
OK, so the SE's POST, what about those 10Ks of directories? 500Ks of WordPress sites? Wikis, tikis, all doing the same thing. You'll talk Google and Bing into making the change, which means all of our analytics go to hell in a handbasket.
Fix the browser, that's the real game changer.
| 4:58 pm on Oct 12, 2010 (gmt 0)|
Yes, your analytics will be fried - I alluded to this in my first post. However, if by "fixing the browser" you mean switching off referrer data altogether, this would be even worse - you would not know what site a user arrived from, never mind what page.
So, my solution of fixing the search engines (by using POST) is better than "fixing the browser" from a webmaster point of view.
| 5:16 pm on Oct 12, 2010 (gmt 0)|
|So, my solution of fixing the search engines (by using POST) is better than "fixing the browser" from a webmaster point of view. |
Which doesn't completely fix the privacy aspect, the only private referrer is no referrer.
And only SEs using POST doesn't fix directories, WP, etc. that do the same thing.
| 6:04 pm on Oct 12, 2010 (gmt 0)|
Well, if Google were to lose this case (or even pick up enough bad publicity) others would have to look at their practices as well. Otherwise, I think my suggestion is a reasonable compromise. However, if people want total referrer privacy, it's easy enough in Opera, and possible in Firefox, but I'm not sure about Internet Explorer.