Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

This 45 message thread spans 2 pages.
What user metrics does Google use to determine rank?
diberry




msg:4505616
 8:36 pm on Oct 8, 2012 (gmt 0)

We've talked a lot around here about Google using user metrics to rank webpages. It's certainly an approach that would make sense, but it also opens a few questions. Assuming Google is being truthful when they say they don't use Analytics data to rank web pages:

--What data could they be using to gather user metrics? Adsense? Chrome? Cookies from the Google search page to see how long searchers stay on a SERP before clicking back?
--Exactly what metrics could they be gathering from their sources, and which metrics would they not be able to access?
--Once they gather these metrics, how are they interpreting them? For example, can they tell the difference between a bounce that happens because a visitor got all they wanted from a site and a bounce that happens because the searcher didn't like the page?
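To make that last question concrete, here is a minimal sketch of how a click-back might be classified. This is entirely hypothetical — the event fields, thresholds, and labels are invented, and nothing here reflects Google's actual pipeline — but it shows how dwell time plus what the searcher did next could separate a "satisfied" bounce from a pogo-stick:

```python
from dataclasses import dataclass

# Hypothetical click-back record; none of these fields are confirmed signals.
@dataclass
class Click:
    url: str
    dwell_seconds: float   # time until the user returned to the SERP
    clicked_another: bool  # did they then pick a different result?

def classify_bounce(click: Click, short=10, long=120) -> str:
    """Label a click-back using dwell time plus follow-up behaviour."""
    if click.dwell_seconds < short and click.clicked_another:
        return "dissatisfied"      # classic pogo-stick back to the SERP
    if click.dwell_seconds >= long or not click.clicked_another:
        return "likely satisfied"  # read a while, or the search ended here
    return "ambiguous"

print(classify_bounce(Click("example.com/page", 4.0, True)))     # dissatisfied
print(classify_bounce(Click("example.com/page", 300.0, False)))  # likely satisfied
```

The thresholds (10 and 120 seconds) are arbitrary example values; the point is only that a quick return followed by another click reads very differently from a long stay.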

 

tedster




msg:4505694
 2:03 am on Oct 9, 2012 (gmt 0)

Assuming Google is being truthful when they say they don't use Analytics data to rank web pages

They also have said they don't use Chrome data. I'm personally rather stymied by the issue at this point.

They could be using click-stream data that some ISP or other sells them, or maybe that they gather from free wi-fi that they provide. I don't know because I've never seen any privacy statements related to those services. They could be using DoubleClick cookies somehow, I guess.

However it certainly is quite a question - even THE Panda question for those impacted by Panda, I'd say. In particular, I like to contemplate the questions that Amit Singhal and Matt Cutts mentioned when Panda first launched:

Cutts: There was an engineer who came up with a rigorous set of questions, everything from. "Do you consider this site to be authoritative? Would it be okay if this was in a magazine? Does this site have excessive ads?" Questions along those lines.

Singhal: And based on that, we basically formed some definition of what could be considered low quality. In addition, we launched the Chrome Site Blocker [allowing users to specify sites they wanted blocked from their search results] earlier, and we didn't use that data in this change. However, we compared and it was 84 percent overlap [between sites downloaded by the Chrome blocker and downgraded by the update]. So that said that we were in the right direction.

Wired.com: But how do you implement that algorithmically?

Cutts: I think you look for signals that recreate that same intuition, that same experience that you have as an engineer and that users have.

[wired.com...]


You may also find it helpful to revisit our original Panda thread here [webmasterworld.com]

indyank




msg:4505727
 3:36 am on Oct 9, 2012 (gmt 0)

They could be using click-stream data that some ISP or other sells them, or maybe that they gather from free wi-fi that they provide. I don't know because I've never seen any privacy statements related to those services. They could be using DoubleClick cookies somehow, I guess.


That might be true.

I think I shared a link to a research paper here several months ago that describes how they could compute user metrics like bounce rate directly from the content! If what was stated in that research paper is true, they could very well be arriving at all the other relevant metrics directly from the content! Then they could be using all the indirect influencers (signals) to validate the metrics they arrived at.

LostOne




msg:4505781
 7:01 am on Oct 9, 2012 (gmt 0)

Tedster and Others:

I'm very curious about smaller sites that were affected by Panda, particularly those that don't have the branding power of the big brands. In hearing more reports of Panda recoveries, are we looking mostly at sites that have offline visibility and branding? Or can we safely say it's a broader spectrum of recoveries?

An example would be smaller online-only sites. Naturally, the definition of "smaller" is up for debate. Occasionally I hear "we recovered, or a client recovered, and their traffic is better than before Panda." I realize it's never going to be the same, but who are these sites? Do the smaller sites actually have a future?

Thoughts?

Sgt_Kickaxe




msg:4505788
 7:57 am on Oct 9, 2012 (gmt 0)

My thought is that smaller sites have less leeway in making mistakes. Things such as horrid markup, a poor link graph, a high bounce rate with low interaction and *gulp* too many affiliate links likely hurt a small site to a larger degree than a more established and well known site.

When you're just starting out, the little things matter more, imo, because you haven't built up any positives (especially incoming links) yet to counter-balance the negatives. The good news is that newer sites likely have fewer off-site issues to resolve.

deadsea




msg:4505869
 10:38 am on Oct 9, 2012 (gmt 0)

I'm convinced that they use bounce-back rate, i.e. the user clicks back to the SERPs quickly and clicks another site instead, or refines their search.

We know they have the technology to track this. They can even differentiate between users who open multiple tabs and users who use the back button. I've seen them put a notice in the SERPs under sites that I have bounced back from, saying in effect "do you want to report/block this site?"

I've heard that this is a "noisy" signal. But I'm sure they use it in the extreme cases where they have enough data and that the rate is well out of the ordinary.

We started collecting data on users interacting with a page in any way. We knew whether they scrolled to the end of the article, clicked through to other pages, clicked on ads, moved a map, played a video, etc. When users didn't do any of these things, we assumed they used the back button. We found a huge correlation between this metric and the rankings of the pages for their targeted keywords.

We also found a huge correlation between the amount of content on the page and bounce-back rate. When there was minimal content (just a product name and a bunch of "be the first to..." prompts), the bounce-back rate could be 90%. When we had a full complement of content (reviews, prices, places to buy, photos, videos, professional review links), the bounce-back rate could be as low as 15%.

We concluded that either Google had a very sophisticated algorithm to measure the amount of content on a page, or that they were doing a very straightforward measure of bounce-back and using that heavily to rank web pages for queries.
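That correlation check can be reproduced in miniature. The sketch below uses made-up numbers and a bare-bones Spearman rank correlation (ties not handled) purely to show the shape of the analysis — it is not anyone's actual data:

```python
def spearman_rho(xs, ys):
    # Spearman rank correlation without SciPy: correlate the ranks.
    # Assumes no tied values, which holds for the toy data below.
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0.0] * len(vals)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Invented pages: interaction rate vs. ranking position (lower = better).
interaction = [0.85, 0.60, 0.40, 0.15]
position    = [1, 3, 5, 9]
print(spearman_rho(interaction, position))  # close to -1.0: more interaction, better rank
```

A strong negative rho here means pages with more interaction rank higher, which is the pattern deadsea reports.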

atlrus




msg:4505876
 11:14 am on Oct 9, 2012 (gmt 0)

Assuming Google is being truthful when they say they don't use Analytics data to rank web pages


Assuming that gives Google much more credit than they deserve. Google hasn't done anything out of the goodness of their heart, and Analytics is just another tool to make Google's life easier, not yours. Do you honestly believe that Google is sitting on piles and piles of this golden data and would not use it?!?

The most credit I can give them is to assume that Analytics may not affect your website's ranking directly, but they at the very least use the data to create "blueprints" of good vs bad websites.

On a funnier note:

Would it be okay if this was in a magazine? Does this site have excessive ads?


^^^ is this real life? :)

claaarky




msg:4505885
 11:42 am on Oct 9, 2012 (gmt 0)

I think there are two separate issues here:
1) Panda
2) The main algo

I remember reading Matt Cutts' statement about Chrome data not being used in the main algo and thinking he didn't say anything about Panda.

The way I see it, Panda produces a site rating ranging from -100% to +100%, calculated from the user metrics of your site compared to others in your niche. Sites can be promoted or demoted by it, depending on quality. So the main algo does its normal thing and works out your standard ranking position (based on relevance, click-through rate, bounce, etc.), then the Panda rating boosts or demotes that position by your site rating, pushing high quality sites up and poor quality sites down.
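To be clear, this two-stage model is speculation on my part, but as arithmetic it is simple. A toy version, with the -100% to +100% rating expressed as -1.0 to +1.0 (both function names and the multiplicative form are invented for illustration):

```python
# Toy version of a speculated two-stage ranking: a base score from the main
# algo, scaled by a site-level quality rating in [-1.0, +1.0].
# Nothing here is a confirmed Google mechanism.
def panda_adjusted_score(base_score: float, site_rating: float) -> float:
    assert -1.0 <= site_rating <= 1.0, "rating must be in [-1, +1]"
    return base_score * (1.0 + site_rating)

print(panda_adjusted_score(100.0, 0.25))  # boosted: 125.0
print(panda_adjusted_score(100.0, -0.5))  # demoted: 50.0
```

Under such a scheme a statement like "Chrome data isn't used in the main algo" could be technically true while the site-rating stage still consumed it.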

If that were the case, then Matt would be technically right in saying they don't use Chrome data in the main algo, but I suspect they do indirectly via Panda.

If not Chrome, then it will be something that tells them exit rate, pageviews, and probably time on page at a minimum. Exit rate tells you so much about a page, and a site overall, that when compared with other similar sites it becomes a reliable indicator of which site people like most. It's THE 'human intuition' metric.

The other thing that points me back to the browser is Google's move into mobile. What if you need data from a browser for Panda, and you see more and more traffic moving to mobile devices where Safari is prominent? What if the number of Safari users went to 70%... where would your quality-based ranking system be then?

It would be interesting to see MC's response if someone asked him whether Panda used Chrome data.

LostOne




msg:4505895
 12:13 pm on Oct 9, 2012 (gmt 0)

which is calculated based on the user metrics of your site compared to others in your niche


Hmmm. I wonder if this niche would also include an ecommerce site and an informational site (me) in the same industry?

I've been on both sides.

Ecommerce (2004-2008): same site, which saw 7-8 page views per visitor.

Information, same site:

(2002-2004) page views around 4.5
(Present) page views around 2.2

Incidentally, my bounce rate is now at 75%, but as mentioned in previous threads, time on page is high.

claaarky




msg:4505911
 1:25 pm on Oct 9, 2012 (gmt 0)

LostOne, this is slightly off topic for this thread, but I think it depends on what the biggest players in your niche are doing. In my niche, nobody does information and ecommerce together unless the information pages encourage visitors to the ecommerce pages.

You have a high bounce rate which suggests you're not successfully driving visitors to your money pages. If those pages also receive a lot of Google traffic, they could be your problem.

jimbeetle




msg:4505963
 3:17 pm on Oct 9, 2012 (gmt 0)

We have the thread Ted started a couple of weeks ago that deals with the same subject:

From a Former Google Research Scientist [webmasterworld.com]

diberry




msg:4505965
 3:21 pm on Oct 9, 2012 (gmt 0)

We started collecting data on users interacting with a page in any way. We knew whether they scrolled to the end of the article, clicked through to other pages, clicked on ads, moved a map, played a video, etc. When users didn't do any of these things, we assumed they used the back button. We found a huge correlation between this metric and the rankings of the pages for their targeted keywords.

We also found a huge correlation between the amount of content on the page and bounce-back rate. When there was minimal content (just a product name and a bunch of "be the first to..." prompts), the bounce-back rate could be 90%. When we had a full complement of content (reviews, prices, places to buy, photos, videos, professional review links), the bounce-back rate could be as low as 15%.

We concluded that either Google had a very sophisticated algorithm to measure the amount of content on a page, or that they were doing a very straightforward measure of bounce-back and using that heavily to rank web pages for queries.


Great analysis! That's really useful information - thank you for sharing.

I think there are two separate issues here:
1) Panda
2) The main algo


And Penguin, too. For all we know, both the zoo animals could be using a data source (or sources) that's not made available to the main algo.

Hmmm. I wonder if this niche would also include an ecommerce site and an informational site(me) in the same industry?


That's just the sort of thing I'd love to figure out. If Google does compare sites it groups into niches, how does it define the niches? Possibly not at all the way we do, or even the way searchers seem to.

BTW, do we actually have any indications that Google is comparing sites within niches? It makes total sense, but I'm just wondering if there's any way we can back up this theory with data or something Google has said.

tedster




msg:4505984
 3:57 pm on Oct 9, 2012 (gmt 0)

do we actually have any indications that Google is comparing sites within niches?

There have been some comments from Google that different types of sites would be judged differently - and that seems to be what happens in all other areas of the algorithm anyway, so it is only to be expected in whatever use they make of user metrics.

I'll be on the lookout for an exact quote I can reference about this.

Zivush




msg:4506003
 4:43 pm on Oct 9, 2012 (gmt 0)

Two 'direct' metrics that are easy for any search engine to assess are:
1. SERP CTR (click-through rate from the SERPs themselves), which can be seen in WMT.
2. Dwell time (how long it takes a user to return to a SERP after clicking on a result).
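Both metrics fall straight out of ordinary click logs, no on-site tracking required. A toy computation — the log format and event names here are invented for illustration, not anything a search engine has documented:

```python
# Invented SERP event log: (query, url, event, timestamp_seconds).
log = [
    ("widgets", "a.com", "impression", 0),
    ("widgets", "a.com", "click", 2),
    ("widgets", "a.com", "return_to_serp", 8),  # 6s dwell: a quick bounce-back
    ("widgets", "b.com", "impression", 0),
    ("widgets", "b.com", "click", 10),          # never returned: search ended here
    ("widgets", "c.com", "impression", 0),      # shown but never clicked
]

def serp_ctr(log, url):
    """Clicks divided by impressions for one result."""
    imps = sum(1 for _, u, e, _ in log if u == url and e == "impression")
    clicks = sum(1 for _, u, e, _ in log if u == url and e == "click")
    return clicks / imps if imps else 0.0

def dwell_time(log, url):
    """Seconds between clicking a result and returning to the SERP, or None."""
    click = next((t for _, u, e, t in log if u == url and e == "click"), None)
    back = next((t for _, u, e, t in log if u == url and e == "return_to_serp"), None)
    return back - click if click is not None and back is not None else None

print(serp_ctr(log, "a.com"))    # 1.0
print(dwell_time(log, "a.com"))  # 6
print(dwell_time(log, "b.com"))  # None: no return observed
```

Note the asymmetry: a missing return event is ambiguous — the user may have been satisfied, or simply abandoned the search.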

Nostalgic Dave




msg:4506015
 5:41 pm on Oct 9, 2012 (gmt 0)

@Zivush... If Google were to use these 'direct' metrics that you mention, then it seems it would take a lot of data to get a reasonably accurate idea of a page's quality. If pages receive very little traffic, the metrics could be easily skewed and unreliable. Would they then not use the data? Or give a page a false positive (or negative) boost in ranking?

scooterdude




msg:4506017
 5:47 pm on Oct 9, 2012 (gmt 0)

I think the post above neatly illustrates some of the issues you (I) have with the current situation in search, if you are not an established site.

The mechanisms set up to promote the best of the heavily trafficked sites weigh very heavily against low-traffic sites.

coachm




msg:4506019
 5:59 pm on Oct 9, 2012 (gmt 0)

I'm convinced that they use bounce-back rate. IE, the user clicks back to the SERPs quickly and clicks another site instead, or refines their search.


I have no idea if they are doing that. It would certainly account for the terrible search results. Bigtime. I wonder why people don't talk about the issue that ALL user metrics measure the effectiveness of the REFERER. If the referral sends the visitor to a page that properly matches and represents the site, the metrics will be better than if it doesn't.

It's that simple. User metrics for ranking measure GOOGLE as much as they measure site quality.

That, and the switch to domain level rankings rather than page level, has trashed the search engines, if that's what they are doing.

deadsea




msg:4506023
 6:10 pm on Oct 9, 2012 (gmt 0)

I'm pretty sure they use click through rate and bounce back rate on a per-term basis. Unhappy users for one term shouldn't hurt the page's ranking for other terms.

Although Panda appears to me to have site level user satisfaction metrics weighted heavily into it. Google seems to be saying "Users don't appear to be satisfied on this site no matter what terms we rank them for, let's flush this site down the toilet."
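A per-term aggregation like the one described above could be as simple as keying the counts by (page, term), so that one badly matched query doesn't contaminate a page's other queries. A hypothetical sketch with made-up data:

```python
from collections import defaultdict

# Invented click records: (page, term, bounced_back).
clicks = [
    ("site.com/guide", "blue widgets", True),
    ("site.com/guide", "blue widgets", True),
    ("site.com/guide", "widget repair", False),
    ("site.com/guide", "widget repair", False),
    ("site.com/guide", "widget repair", True),
]

def bounce_rate_per_term(clicks):
    """Bounce-back rate keyed by (page, term) rather than by page alone."""
    totals, bounces = defaultdict(int), defaultdict(int)
    for page, term, bounced in clicks:
        key = (page, term)
        totals[key] += 1
        if bounced:
            bounces[key] += 1
    return {k: bounces[k] / totals[k] for k in totals}

for key, rate in bounce_rate_per_term(clicks).items():
    print(key, round(rate, 2))
# ('site.com/guide', 'blue widgets') 1.0
# ('site.com/guide', 'widget repair') 0.33
```

The same page scores terribly for one term and reasonably for another, which is exactly why a page-level average would mislead.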

Ralph_Slate




msg:4506026
 6:16 pm on Oct 9, 2012 (gmt 0)

Google seems to be mimicking the way big box stores stock their shelves - they mine their sales data and prune any items that don't get a certain share of the sales. Sometimes this means that the big box store carries just one brand of merchandise in a particular category.

The only way you can get featured is to pay for shelf space.

Zivush




msg:4506032
 6:30 pm on Oct 9, 2012 (gmt 0)

@Nostalgic Dave
How can we know?
Practically, as a site owner, if you had the choice to improve your site's performance at the expense of reducing the site's revenues (fewer ads), would you go for it?
My gut feeling is yes. As a long-term strategy, it has to pay off.

jimbeetle




msg:4506125
 8:07 pm on Oct 9, 2012 (gmt 0)

I'm pretty sure they use click through rate and bounce back rate on a per-term basis.

Over the past few days that's the only way I've come to believe G can use bounce rate. That and segmenting industries and sites into different buckets.

And here's where I think the "fresh boost" factor really plays a role: surface a new page then measure clicks and bounces.

diberry




msg:4506198
 10:56 pm on Oct 9, 2012 (gmt 0)

I wonder why people don't talk about the issue that ALL user metrics measure the effectiveness of the REFERER. If the referral sends the visitor to a page and properly matches and represents the site, the metrics will be better than if the referer doesn't.


This is a really important point. Any user metric is really going to be measuring three things: how well the searcher phrased the query, how well Google matched it, and how good the resulting webpage is as a result. I don't see how any user metric could reveal which of those three things (or combination of any or all) is the problem in an unsuccessful search.

Google is smart enough to think of all this, so I wonder how they try to account for any of it.

tedster




msg:4506277
 2:50 am on Oct 10, 2012 (gmt 0)

When you aggregate a lot of data, patterns emerge that are not at first glance obvious just by describing what you're going to look at. For example, when billions of Tweets are mined, spam accounts can stand out just by their content. You'd never guess that from thinking about a small sample, however.

Any user metric is really going to be measuring three things: how well the searcher phrased the query, how well Google matched it, and how good the resulting webpage is as a result.

Yes, if Google somehow mismatches the query and the web page, that can spell trouble for the site involved - and yes, we do see "some" of that pattern. I don't think that happens intentionally, however, even though some have suggested that idea.

How well the user phrased the query? That's going to be a wash over millions of data points - maybe even at a lower level. And Google Suggestions tends to confine that issue a good bit, as well.

How good the web page is among other possible results? Measuring that is clearly Google's goal. When you go down the long tail, they are not always doing well, at least so far. But it's still a lot better than I would have thought possible through machine learning.

indyank




msg:4506293
 3:30 am on Oct 10, 2012 (gmt 0)

OK, here is the link to the research paper by D. Sculley, Robert Malkin and Sugato Basu from Google and Roberto J. Bayardo from MIT.

[eecs.tufts.edu ]

It is about predicting bounce rates in sponsored search advertisements but the same could be easily extended to predicting bounce rates for search result pages. There could be something similar for other user metrics as well.

[edited by: indyank at 4:08 am (utc) on Oct 10, 2012]

indyank




msg:4506297
 3:49 am on Oct 10, 2012 (gmt 0)

I am not saying that they are using those predictive methods for arriving at user metrics, but you never know.

However, the research paper is a good read, as it shows how convinced they are that user metrics like bounce rate are good tools for judging user satisfaction.

It is also an interesting read on how mean bounce rates vary by language and by particular keyword.

"mean bounce rates vary significantly by particular keyword. Navigational queries, such as those
containing specific business names, result in very low bounce
rates. Commercial terms, such as books and flights, also
have low bounce rates. Entertainment oriented terms such
as games and chat exhibited much higher bounce rates.
In general, there is a rough inverse relationship between
keyword popularity and mean bounce rate for that keyword.
This may be because the greater competition for these more
popular keywords creates a need for these competing adver-
tisers to achieve higher standards of quality."

I think I am repeating what I posted here more than a year ago, but since we are again trying to find answers on what user metrics Google might be using and how they could be collecting them, I thought posting it again would be helpful.

viral




msg:4506304
 4:21 am on Oct 10, 2012 (gmt 0)

We know they use time on site for some things. Have a look at this Search Engine Land article, where "time on site" is used to generate different results when you go back to Google: [searchengineland.com...]

If they are using time on site (i.e. bounce rate) for that, who knows what else they are using it for.

claaarky




msg:4506332
 7:24 am on Oct 10, 2012 (gmt 0)

Exit rate does reveal when a page is receiving traffic that is either poorly targeted or poorly handled when the visitor arrives. It's how I've identified my high-traffic problem pages for Panda.

Looking at the keywords generating that traffic told me I had been targeting the wrong terms (my page was well optimised for my target phrases but didn't actually deliver the goods for visitors).

diberry




msg:4506469
 3:04 pm on Oct 10, 2012 (gmt 0)

Also it is an interesting read on how mean bounce rates vary by language and by particular keyword.


That's very interesting. Definitely makes it sound like they are aware that a "good" bounce rate in one niche might be bad in another, and take that into account. Which is good news, since the only way webmasters can lower bounce/exit rates on a quick answer type site would be to make it harder for visitors to find what they want.

Exit rate does reveal when a page is receiving traffic that is either poorly targeted or poorly handled when the visitor arrives. It's how I've identified my high traffic problem pages for Panda.


But, as with bounce, a "good" exit rate is going to vary a little by type of query. For example, I'm often happy with WikiAnswers' quick answers, but I always bounce right back out whether I found what I wanted or not. Why don't I bookmark it, thus providing it a metric that would balance out my high bounces? Because it always turns up at the top of the SERPs if it has the answer I need. I just rely on finding it in there. Which is another aspect of the whole feedback loop we were talking about earlier.

That said, I agree with you that exit rate is a VERY important stat for webmasters to look at. On most sites, it indicates visitors who were, at best, not overly wowed by your page. Sometimes this is due to a reason beyond your control - I've found Stumbleupon, for example, creates high exit rates on good pages. But most of the time, exit rate does indicate there's something you could be doing better.

deadsea




msg:4506536
 3:49 pm on Oct 10, 2012 (gmt 0)

It's very important to make a distinction between single-page-view "bounce rate" as measured by Google Analytics, and "did the user find what they were looking for on a single page".

It often makes sense to instrument extra tracking so that single-page user behavior like the following doesn't count as a bounce:

1) An article: The user scrolls to the end of the article (indicating that they have likely read it)

2) Interactive / AJAX content: the user views one page but interacts with the game, tool, map, or calculator in some measurable way.

3) External links / Ads: the user clicks off-site to find what they want rather than returning to the SERPs.

The goal is to satisfy the user and prevent them from returning to Google to go somewhere else. The following "improvements" for bounce rate are counter-productive:

1) Splitting an article over two pages so that users have to hit two pages (and not get counted as a bounce by analytics) to read the whole thing. This just frustrates some users, who then use the back button. Instead, try to measure time on page or scrolling to the bottom of the article.

2) Making content more server-side interactive instead of using javascript (again, so the user racks up more page views to accomplish the same task). For example, converting a calculator from being powered by javascript to a form submit where the answer is computed server-side.

3) Removing links/ads from the page. If users click on them, they should stay. Instead of removing them, track clicks on them.

I find Google Analytics very frustrating with its primitive notion of bounce rate. I have a site where I try to satisfy the user in one page view through interactive content. On this site, a higher Google Analytics bounce rate is better because it indicates that users are landing on the correct page.

I worked on a big website with tons of content and 7 pages per visit. Parts of the marketing department were fighting because one part was trying to reduce bounce rate and the other was trying to increase ad click-through. When you measure ad clicks as a "bounce", it doesn't serve your business correctly.
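The adjusted definition described above can be stated in a few lines. This is just a sketch — the session structure and event names are invented — but it shows a single-page visit counting as a bounce only when no interaction event fired:

```python
# Invented session records: one-page visits plus any interaction events
# (scroll-to-end, ad click, video play, ...) recorded on that page.
sessions = [
    {"pageviews": 1, "events": []},                   # true bounce
    {"pageviews": 1, "events": ["scrolled_to_end"]},  # read the article
    {"pageviews": 1, "events": ["ad_click"]},         # left via an ad
    {"pageviews": 3, "events": []},                   # multi-page visit
]

def interaction_adjusted_bounce_rate(sessions):
    """A visit is a bounce only if it was one page AND no interaction fired."""
    bounces = sum(1 for s in sessions
                  if s["pageviews"] == 1 and not s["events"])
    return bounces / len(sessions)

print(interaction_adjusted_bounce_rate(sessions))  # 0.25, vs a naive 0.75 single-page rate
```

Three of the four sessions are single-page, so the naive bounce rate would be 75%; counting interactions as engagement drops it to 25%.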

tedster




msg:4506718
 3:40 am on Oct 11, 2012 (gmt 0)

To do good analysis on your own bounce rate, you really do need to do something to filter out a bit of the noisiness in this metric - at least in its raw state. Google also knows this. The evidence? They even published a technique you can use with Google Analytics to stop counting visits with a long time-on-page as bounces, with a threshold you set to suit your own analysis.

See Tracking Adjusted Bounce Rate In Google Analytics [analytics.blogspot.com]
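The linked technique fires a timer event after N seconds so that long single-page visits stop counting as bounces. The effect can be simulated offline over session records — the record format is invented here, and the 30-second threshold is just an example value you would tune:

```python
# Invented session records: page count plus seconds spent on the landing page.
sessions = [
    {"pageviews": 1, "time_on_page": 5},    # quick single-page exit
    {"pageviews": 1, "time_on_page": 240},  # long read: arguably not a bounce
    {"pageviews": 2, "time_on_page": 40},   # multi-page visit
]

def raw_bounce_rate(sessions):
    """Classic definition: any single-page visit is a bounce."""
    return sum(s["pageviews"] == 1 for s in sessions) / len(sessions)

def adjusted_bounce_rate(sessions, threshold=30):
    """Adjusted: single-page visits over the time threshold are not bounces."""
    bounces = sum(1 for s in sessions
                  if s["pageviews"] == 1 and s["time_on_page"] < threshold)
    return bounces / len(sessions)

print(round(raw_bounce_rate(sessions), 2))       # 0.67
print(round(adjusted_bounce_rate(sessions), 2))  # 0.33
```

Same sessions, same site — the adjusted figure simply stops penalizing the long read, which is the noise filtering the post above is talking about.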

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved