
Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

This 54 message thread spans 2 pages (this is page 2).
Chrome & Analytics Data Use... and Google conspiracy theories
TheMadScientist




msg:4535595
 3:46 am on Jan 13, 2013 (gmt 0)

Let's see if I can end the 'they're lying to us conspiracy theory arguments' or at least refute them wholly to those who can 'get' what I'm saying...

If you run a search engine the size of Google, with the goals of 'determining the one right answer' for people and 'organizing the world's information', you Would Not use Chrome and Analytics data directly in the algo's you wrote.

Why? Because the web is representative of 'the whole', not Chrome (even though it's widely used) and not Analytics (even though it's widely used), so what would happen if you used that data directly is you would not have large enough data samples from Every Site to actually rank Every Site and organize Every Site.

The way you could and likely would use the data is to compare the sites, time on them, visitor habits, etc. to the algo's you wrote and applied to the web as a whole, because if your goals were really to provide people with 'one right answer' and 'organize the world's information' for them, everything you do algorithmically has to apply to the Whole of the web, not the (even though widely used) limited data sets available via Chrome and Analytics.

IOW: You would use them to 'check your work' against reality and see if you're 'hitting the right sites' when people block them, or 'promoting the right sites' according to the ones people visit the most frequently, but you would Not base your rankings directly on their data, because even though there's a ton of it, neither 'scales' to Every Page on Every Site and Every Visitor, but your algorithms have to.

 

TheMadScientist




msg:4535771
 10:26 pm on Jan 13, 2013 (gmt 0)

Try this again, because I think I can say it better:

Look at the behavior signals in the results.

Look at the behavior of searchers after they left the results on sites where you have access to the data.

Put the two together to find the behavior signals in the results that indicate 'positive off site experience' via external sources.

Write an algo that increases the rankings of 'positive off site experience signals' based on 'visitor activity in the results', because then it applies to the entire index and every visitor, not just sites/pages or visitors you have information from...

###

Sure, you look at the 'external behavior' of visitors when you have access, but you use it to find the behavior of visitors in the results indicating the 'end result' was positive and you write an algo for it, because then it applies to the entire index and every visitor, not only sites or visitors you have access to 'extra information' on.
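To make the steps above a bit more concrete, here is a minimal Python sketch of the idea, not anything Google is known to run. The signal names (`no_return_to_results`, `quick_return_to_results`), the sample data and both function names are made up for illustration. It learns, from the small set of visits where off-site behavior is known, how often each in-results behavior goes with a positive off-site experience, and then scores any result from in-results behavior alone.

```python
from collections import defaultdict

def learn_signal_weights(labeled_visits):
    """labeled_visits: (serp_signal, off_site_positive) pairs, available only
    for visits where off-site data exists. Returns, per SERP signal, the
    fraction of those visits that ended in a positive off-site experience."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for signal, positive in labeled_visits:
        totals[signal] += 1
        if positive:
            positives[signal] += 1
    return {s: positives[s] / totals[s] for s in totals}

def score_result(serp_signals, weights, default=0.5):
    """Average the learned weights over the SERP signals seen for one result.
    Needs no off-site data, so it can be applied to every result."""
    if not serp_signals:
        return default
    return sum(weights.get(s, default) for s in serp_signals) / len(serp_signals)

# Hypothetical labeled subset (the visits with 'extra information').
labeled = [
    ("no_return_to_results", True),
    ("no_return_to_results", True),
    ("no_return_to_results", False),
    ("quick_return_to_results", False),
    ("quick_return_to_results", False),
    ("quick_return_to_results", True),
    ("quick_return_to_results", False),
]
weights = learn_signal_weights(labeled)

# Two results with no off-site data at all, scored from SERP behavior only.
page_a = score_result(["no_return_to_results", "no_return_to_results"], weights)
page_b = score_result(["quick_return_to_results", "no_return_to_results"], weights)
print(weights, page_a, page_b)
```

The point of the split is that `learn_signal_weights` only ever sees the limited data set, while `score_result` works for any page and any visitor.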

[edited by: TheMadScientist at 10:44 pm (utc) on Jan 13, 2013]

claaarky




msg:4535772
 10:38 pm on Jan 13, 2013 (gmt 0)

That still means you're trying to predict user behaviour based on a predetermined set of positive behaviour signals, which in turn means you need to accurately know what every user is thinking every day, which is impossible.

Behaviour can be affected by weather, good or bad news about the economy and so on. When behaviour changes it doesn't necessarily change for all sites in the same way either. You just can't predict or guess how user behaviour will change from day to day. There are too many factors affecting people's lives.

claaarky




msg:4535775
 10:45 pm on Jan 13, 2013 (gmt 0)

Sorry, my last post responded to your previous post TMS.

TheMadScientist




msg:4535776
 10:48 pm on Jan 13, 2013 (gmt 0)

You look at the resultset behavior of your visitors for those you have external behavior data on and then you write rules to refine your results based on the findings so it applies to the entire index and all visitors ... When your system does it automatically it's called 'machine learning' ... Go figure.

These have already been linked, but I think they're worth a repeat:

Ray Kurzweil Joins Google In Full-Time Engineering Director Role; Will Focus On Machine Learning, Language Processing [techcrunch.com]

Large scale machine learning systems and methods [patft.uspto.gov]

It doesn't matter if any of us think they can or if we think it's possible or not, they're going to do it and already figured out how, likely quite a while before they announced their wanting to become a 'knowledge engine'.

NP Claaarky ... It helped me 'draw a conclusion' in this one ... [I'm still a bit 'shocked' it's so simple to see now (for me anyway) what they're doing and where they're going] ... BTW, thanks for the questions and 'forcing me to figure out how to say what I've been thinking', I appreciate it :)

TheMadScientist




msg:4535798
 2:07 am on Jan 14, 2013 (gmt 0)

I guess it's probably a good time to 'spill the beans' and let people know my thoughts in this thread and the Business Survival, With Or Without Google Organic Traffic [webmasterworld.com] (I definitely recommend it if you've read this one but haven't made it through the other one yet, because the direction Google's going doesn't look too promising for many sites as 'personalization refinement' gets better.) stemmed from...

The Zombie Traffic from Google and Traffic Shaping/Throttling - Analysis [webmasterworld.com] thread, because when I read through it and realized they're not updating the algo as often and as drastically as traffic is switching on some sites, there are members who saw the traffic patterns described occasionally, there are members who 'got out of it' and there's no one who's got deep enough pockets in there to worry about 'extracting a buck from' ... I wondered why someone would write an algo that 'auto switches' some sites (but definitely not all) between resultsets automatically and why that switching 'ramps up' around the time of updates, and why other sites would see it occasionally for short periods of time, and why sometimes it keeps traffic almost exactly the same between different resultsets (visitor groups) ... The only thing that made any sense to me is the algo is trying to 'refine results and categorize sites' based on Google's visitor behavior, not links, not on page factors (those don't change based on 'split testing visitor types sent'), so visitor behavior had to be the factor and there were (are) some 'it wasn't sure what to do with based on other factors, so it tested behavior' ... Anyway, that thought 'started the chain' that led to the information in the two threads I've posted in recently that deal directly with Google moving (having moved?) in the direction of machine learning (much sooner than later).

I think there was way more info available about how Google actually works in the Zombie thread than most people probably noticed, because to see it I had to 'step back' and think 'Why, how, for what reason? The algo's switching results around by itself on these sites and stopping by itself on some of them and keeping traffic almost exactly the same between resultsets on some ... What's it looking for?', so Huge Thanks to all the 'Crazy Zombie Chasers' who contributed, because those contributions led to some meaningful discussions and forward-looking conclusions I hope quite a few people will benefit from in the long-run.

bluntforce




msg:4535829
 6:28 am on Jan 14, 2013 (gmt 0)

I believe most of the "Zombie" thread reports/interpretations were mis-interpreted bot traffic not specific to Google.

The mashers/scrapers/analyzers can easily distort what could be interpreted as traffic, and they'll also change up IPs and User Agents just to make it difficult to slow them down.

If your traffic doesn't convert, is it really traffic, or just someone else looking to make a buck off your effort?

TheMadScientist




msg:4535830
 6:38 am on Jan 14, 2013 (gmt 0)

bluntforce wrote:
I believe most of the "Zombie" thread reports/interpretations were mis-interpreted bot traffic not specific to Google.

The mobile and country specific determinations were enough to lead to the conclusions I've posted in the two threads I cited ... Worrying about 'exactly what type of visitor the traffic was from for every specific site' when it wasn't reported, rather than looking at the bigger picture of why it's happening is part of the problem I see with most people being able to figure anything out from it ... Sometimes you have to look without the assumptions and 'just go with what's reported' then try to 'figure things out from there' to see the 'bigger picture' and as I noted in the thread, some people missed it, because they were too busy arguing rather than actually investigating and trying to find answers ... Basically, I went with the 'facts presented' rather than working with an assumption, which is what your view is ... My approach led to some very insightful conclusions and meaningful discussions.

bluntforce




msg:4535835
 6:59 am on Jan 14, 2013 (gmt 0)

"My approach led to some totally insightful conclusions and meaningful discussions"
Were there documentable results?

TheMadScientist




msg:4535836
 7:01 am on Jan 14, 2013 (gmt 0)

BTW: There's not really a likely or reasonable way a bot could 'switch on and off' visits and just happen to oppose converting traffic to keep overall visits static on an hourly (or any other) basis, which means it's very reasonable to conclude your assumption of 'bot traffic that's misinterpreted' is most likely invalidated by the information presented in the thread, but you'd have to read the thread with an 'open mind' to reason through that 'little bit' of information.

[edited by: Robert_Charlton at 8:41 am (utc) on Jan 14, 2013]

bluntforce




msg:4535844
 7:23 am on Jan 14, 2013 (gmt 0)

As always:
I've argued that traffic to a given website should come from a reasonable mix of sources.
Overall traffic remaining on a consistent level from that reasonable mix of sources is understandable. When site conversions drop, either on an hourly basis or on a daily basis, the traffic normally leading to conversions needs to be segmented and analyzed.
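As a rough illustration of that kind of segmentation, here is a small Python sketch. The source names and visit counts are invented, and `conversion_by_source` is a hypothetical helper rather than part of any analytics product. It compares two periods with the same total visits and shows which source's conversion rate actually moved.

```python
from collections import defaultdict

def conversion_by_source(visits):
    """visits: iterable of (source, converted) pairs.
    Returns the conversion rate for each traffic source."""
    stats = defaultdict(lambda: [0, 0])  # source -> [visit count, conversions]
    for source, converted in visits:
        stats[source][0] += 1
        stats[source][1] += int(converted)
    return {src: conv / total for src, (total, conv) in stats.items()}

# Hypothetical data: 20 visits in each period, so overall traffic looks flat.
before = ([("google", True)] * 3 + [("google", False)] * 7
          + [("direct", True)] * 2 + [("direct", False)] * 8)
after = ([("google", False)] * 10
         + [("direct", True)] * 2 + [("direct", False)] * 8)

rates_before = conversion_by_source(before)
rates_after = conversion_by_source(after)
for source in rates_before:
    print(source, rates_before[source], "->", rates_after[source])
```

In this made-up data the totals match, yet only the `google` segment stopped converting, which is the kind of difference a site-wide total would hide.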

From the zombie thread, I recall most participants complaining about traffic drops, but it wasn't Google specific, so really could not be considered.

Theory is fine.
Documentable results are what turn theory into something actionable. Other than that, it's just words.

superclown2




msg:4535921
 1:33 pm on Jan 14, 2013 (gmt 0)

I had a number of sites using a Flash clone which, I discovered, stopped the pages from loading in Chrome although they worked perfectly with all other browsers. Despite this all these sites had pretty good rankings in the G SERPs.

The logical part of my brain (which isn't always right) tells me that if G used Chrome for ranking they would be nowhere. Therefore they don't. QED.

claaarky




msg:4535925
 1:51 pm on Jan 14, 2013 (gmt 0)

Superclown, did those sites load partially with some links/options clickable or was it a complete white screen with nothing clickable at all?

How long were those sites in that condition?

scooterdude




msg:4535926
 2:28 pm on Jan 14, 2013 (gmt 0)

After the USA FTC's love letter to Google and any other interested parties

Does anyone think the great G needs:

Independent webmasters goodwill ?
Friendly posters on sites like www ?
Webwide Consensus on what quality content is ?
Acceptance of 3rd party veto over what they can or cannot do with their investments in services offered freely to the public?

Across all the webmasters forums I frequent, there has been a massive drop in experienced posters, but a consistent influx of new accounts and I'd reckon that up to 95% of the new accounts are genuine newcomers, the rest being the usual mix of sock glove accounts, image makers, re branded posters :)

The end of this era of the web is nigh :)
We have no need of mincing words,

a) Google has a continuing need for us to use Chrome, Android and Analytics

b) They continue to finance these products which they can probably never make a direct trading income from

c) They are a quoted company responsible to shareholders

d) In light of a, b, c it behoves us as 3rd parties to presume that they have a purpose for said expenditure, in line with well understood human imperatives, making a buck might well be one, and data mining, or knowledge management, has been known to be profitable to different entities at different times, in different places

aristotle




msg:4536354
 2:11 am on Jan 16, 2013 (gmt 0)


Reportedly the main reason that Google created Chrome originally was to be able to collect detailed data on user behavior. Unlike Analytics, Chrome can provide user metrics on any website that Chrome users ever visit, which clearly includes all the websites that get much traffic at all. It's true that for obscure sites that don't get much traffic, it could take a while to collect a usable sample. But for websites that get significant traffic, which are the sites that really matter for most searches, Chrome can collect a usable sample in a short time. Google would be crazy to ignore such a gold mine of data. As I said, reportedly the main reason they created Chrome in the first place was to be able to collect such data.

Someone mentioned the possibility of artificial manipulation. Actually, Chrome data would be much harder to manipulate than the well-known factors that SEOs have been manipulating for years. The only exception might be sites that hardly get any traffic at all, but they will be mostly left out anyway.

I agree that there could be small biases in the data. But I doubt if they're anywhere near as bad as some of the other biases that already influence the SERPs. For example, websites that express minority views on controversial social and political issues obviously won't attract as many backlinks as sites that represent majority opinion, and therefore will be unfairly demoted by the algorithm.

So to sum up, If Google isn't using user metrics data as part of their rankings process, they're making a sad mistake, and they will eventually be overtaken and then left behind by a search company that does.

bluntforce




msg:4536431
 6:50 am on Jan 16, 2013 (gmt 0)

aristotle,

Perhaps you can help me out.
I don't see anything on:
[google.com...]
that states Google is collecting detailed user data.

Either I've missed the statements where they are gathering data, or you are promoting misinformation.
It's also possible they are collecting information without regard to their privacy policy. I'd think someone would find a reason to jump on that.

indyank




msg:4536439
 7:17 am on Jan 16, 2013 (gmt 0)

Google may or may not have said so anywhere, but put yourself in Google's shoes and ask, "What is the use case for Chrome?" Then try to list the answers if you know them, pls.

superclown2




msg:4536473
 9:15 am on Jan 16, 2013 (gmt 0)

claaarky wrote:
Superclown, did those sites load partially with some links/options clickable or was it a complete white screen with nothing clickable at all?

How long were those sites in that condition?


They loaded apart from the Flash content but none of the links worked. Most of them had this content for 1 year+.

I'd never checked them in this browser before because I'd avoided Chrome, like I avoid all Google products. Lesson learned.

Robert Charlton




msg:4536480
 9:30 am on Jan 16, 2013 (gmt 0)

Google has repeatedly stated that it is not using Chrome or Toolbar data in the organic search algorithm. For extended discussion about what that might mean, see this thread...

Matt Cutts: Organic Algo Does Not Use Any Chrome Data
Aug 23, 2012
http://www.webmasterworld.com/google/4487777.htm [webmasterworld.com]

I know that many would like to catch Google with its hand in the cookie jar (pun intended) with regard to Chrome, but I do believe that Google is telling the truth about data from the browser itself.

Most likely, as I suggest in the Chrome Data discussion, they are using cookie tracking, along with a range of other signals as described in Brett's Panda Metric thread (which I reference in the Chrome discussion and am repeating the link here)....

Panda Metric : Google Usage of User Engagement Metrics
April 21, 2011
http://www.webmasterworld.com/google/4302140.htm [webmasterworld.com]

While Brett speculates about the Google toolbar and Chrome browser, he presents strong cases for multiple other signals. In both threads cited above, aristotle strongly leans toward the Chrome browser. I'm leaning much more toward a combination of signals not native to Chrome, and particularly with cookie-based tracking. Google also has location nailed without browser data, either by ISP data for desktop connections, or geo-tracking on mobile devices.

Another thread here worth checking...

Personalized Search Now Default
December, 2009
http://www.webmasterworld.com/google/4037372.htm [webmasterworld.com]

There's much to support that user engagement data need not come from Chrome. I agree with TMS's suggestion that a combination of multiple other sources is more likely to give Google a representative picture of searcher behavior than either Chrome or Analytics would.

indyank




msg:4536490
 9:57 am on Jan 16, 2013 (gmt 0)

Yes, I agree with Robert on the cookies stuff or Google's ability to get data through other advanced means.

Take an imaginary case like this. AdWords uses Chrome data or data collected through other means to serve relevant personalized ads to users. Google's search algorithms pick up clues from these ads served on their SERPs (Search Engine Result Pages). In this case, Google is right in saying they don't use Chrome data, as it is AdWords which uses the data directly while their search algorithms get signals from the ads served on their SERPs.

So, yes they might be telling the truth when they state they don't use Chrome data. Google is very clever in stating things diplomatically and in a manner that suggests what they say is always true. :)

scooterdude




msg:4536498
 10:41 am on Jan 16, 2013 (gmt 0)

Somebody help me please, which company was it again that paid a few million in fines for somehow bypassing privacy settings?

Oh yes, there was another company collecting WiFi data unbeknownst to itself, hmm

claaarky




msg:4536500
 11:14 am on Jan 16, 2013 (gmt 0)

The important point really is whether or not user engagement metrics are being used directly or indirectly in ranking calculations. I say 100% "yes" because it explains how Google are able to achieve what they are doing and what they want to do in the future.

I personally don't care where that data comes from but many people don't even believe Google is utilising data about how real people actually use websites and the first question they ask to undermine that idea is "where do they get the data from then?".

Chrome is the obvious answer because it's so easy to collect the data that way, and was specifically created for the purpose so it's always a strong candidate in discussions. Google have said they don't use Chrome data in the algo but the question in my mind is does THEIR definition of the algo include Panda. If not, then they may have chosen their words carefully and could actually be using Chrome data to calculate quality scores which are then fed into the main algo at Panda refresh time. Unless someone specifically asks Google whether Chrome data is used by Panda, that point will never be cleared up for me.

There's also the fact that Bing have openly admitted in the past to using user metrics obtained from Internet Explorer in their algo (when on the same stage as Google who kept tight lipped on the subject). Many believe Bing's results are superior to Google's and IE is still the dominant browser (more data makes for more accurate/meaningful data). There's quite a bit of smoke around the idea of using browser data.

When it comes to privacy, is using anonymous browser data a breach of privacy? I mean, Google Analytics collects website usage data and Real Time shows you where individual users are in the world, how they came to your site and what page they are currently on. It doesn't tell you WHO they are though, so using browser data could be fine as long as it's not personally identifiable.

Check Chrome's terms of use and you'll find things are worded ambiguously. They say they collect various stats but they don't say what (only examples are given).

Cookies can be blocked, so that's not reliable.

Trying to 'guess' the quality of a site and how engaging people will find it based on a robot crawling the site looking for various predetermined signs is a laughable idea in my view. That is an unfathomable task and would be so influenced by Google's own judgement of quality that I don't see that as an option that can even seriously be considered.

Collecting data on the real behaviour of real humans is the only way I can see Google could so accurately nail quality or hope to produce more accurate/relevant results. You can't fool all the people all the time, that's why employing user metrics is such an effective way of weeding out the rubbish.

The only reason for wanting to know where that data comes from is to prove to those who still have their heads in the sand that, since Panda arrived, the way people react to your site has THE biggest influence on rankings now.

aristotle




msg:4536527
 1:38 pm on Jan 16, 2013 (gmt 0)

bluntforce wrote:
aristotle,

Perhaps you can help me out.
I don't see anything on:
[google.com...]
that states Google is collecting detailed user data.

Either I've missed the statements where they are gathering data, or you are promoting misinformation.
It's also possible they are collecting information without regard to their privacy policy. I'd think someone would find a reason to jump on that.

Well Google has said, or not said, a lot of things. What I'm saying is that they SHOULD be collecting and using user metrics data. In fact, I think it should be the biggest "factor" in the algorithm.

Google must have spent quite a lot of time and money to develop Chrome, and I doubt they would have done so without envisioning a purpose for it.

As for any privacy policies, I'm sure that someone at Google can think of a way to circumvent them. For example, maybe all the data could be aggregated into a database that doesn't include anybody's personal information.

diberry




msg:4536584
 4:15 pm on Jan 16, 2013 (gmt 0)

Google just can't be using Chrome data to determine the serps:

Chrome users will be more likely to fit into certain demographics, just like IE, Firefox, Opera etc users might be more likely to fit into different demographics. You'd be skewing the data based on that fact for sure. If you could collect data from a completely random third of Internet users, the aggregate data would be far more representative in my opinion.


Exactly. If you had complete access to the surfing habits of all Democrats in the US, that would be a huge sample group, but it would definitely skew toward the young, toward women and minorities, toward the less affluent, etc.
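The skew argument can be shown with a quick Python simulation. Everything here is an assumption made up for the example: a "browser X" used by 30% of people, who are satisfied 80% of the time, against 50% for everyone else. It compares a very large sample drawn only from browser X users with a much smaller random sample of everybody.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population of (uses_browser_x, satisfied) pairs.
population = []
for _ in range(100_000):
    uses_x = rng.random() < 0.30
    satisfied = rng.random() < (0.80 if uses_x else 0.50)
    population.append((uses_x, satisfied))

def rate(people):
    """Fraction of people in the list who were satisfied."""
    return sum(1 for _, satisfied in people if satisfied) / len(people)

true_rate = rate(population)
browser_only = [p for p in population if p[0]]  # huge sample, one demographic
random_sample = rng.sample(population, 1_000)   # small sample, whole population

biased_error = abs(rate(browser_only) - true_rate)
random_error = abs(rate(random_sample) - true_rate)
print(len(browser_only), biased_error)
print(len(random_sample), random_error)
```

Under these assumptions the browser-only sample is many times larger but lands much further from the population-wide rate than the small random sample does, so sample size alone doesn't fix the skew.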

But Chrome data certainly could tell Google a lot about whether anyone at all is finding what they're looking for. Maybe not on every topic, maybe not every type of searcher, but it would definitely create a huge pile of usable data. And using this would NOT contradict their public statements at all.

It's similar with Analytics - since it's not on every site and the data could be compromised by paid traffic schemes and so on, they can't rely on it for much. But it could show broad patterns that help them test the algos.

(Where I think Analytics data could come in even more handy, however, is in tweaking Adwords to work better for advertisers. As far as I know, there's no ethical reason for Google not to use it this way.)

And TMS' classification of the Zombie thread is bang on. There was plenty of data provided about specific Google traffic shifts. Mobile users, or at least users with tiny screens, seemed to have something to do with it. The data collection was just getting good when the thread was shut down. There was definitely something happening, and it was Google specific, and perhaps if more study had been allowed we'd have gotten past the theories and found a concrete explanation.

bluntforce




msg:4536775
 6:50 am on Jan 17, 2013 (gmt 0)

@aristotle,

Google collects plenty of personal data and they might want Chrome data, but it's one of those lines it doesn't make sense to cross. At least in my opinion.

Drifting into the generic concepts presented in many of the Google threads, it seems a lot of people complain about reductions in SERPs and conversions. There isn't much doubt the internet is continuing to change.

Today, I happened across a regionally focused site that appears to be monetizing with AdSense and cross promoting other sites they own. Meanwhile, I've been establishing well curated information in different regions similar to what they provide. I suspect I will eat their lunch.

I didn't hunt them down or target them, they just happened to be focusing on an area that I'm currently looking at in a bit more depth.

I believe every page I control runs that same risk. There's someone coming along who's going to bring some significant focused energy to that page's niche.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved