|Hypothesis on how Google measures "quality"|
Been thinking a lot about how Google works out what quality sites are (as we all have). Also tying back to a lot of the other advice from people who have recovered from Panda and their strategies in doing so.
A lot of people claim that improving your metrics (bounce rate, time on site, etc.) is a big part of recovery. Of course, this leads to the usual accusations of Google using Analytics, which I just don't buy into. For one, bounce rates in Analytics are entirely subject to how you have Analytics set up, so these could be quite flattering even if your actual engagement isn't that good. And then there are plenty of sites that still don't have Analytics installed. It just doesn't make sense.
So my theory is that Google is in fact using personalized search to make judgements on quality.
1. New visitor arrives at your site for the first time. Your site is added to their Google search history. So far, no judgement has been made.
2. That same visitor does a search and Google bumps up your site a bit in the results. Suppose the visitor looks at the results and clicks on one further down the list, ignoring yours. What can Google learn from that user action? Apparently your site wasn't good enough to make the user want to go there a second time.
Multiply that scenario by lots of searchers and you have a pretty reliable way of judging user-satisfaction in my opinion.
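To make the "multiply by lots of searchers" step concrete, here is a minimal Python sketch. It is only an illustration of the hypothesis, not anything Google is known to do: the log format, the `skip_rates` function, and the example domains are all made up for this post.

```python
from collections import defaultdict

def skip_rates(impressions):
    """Hypothetical aggregation of the 'skipped a boosted result' signal.

    impressions: iterable of (site, boosted_position, clicked_position),
    where clicked_position is None if the searcher clicked nothing.
    """
    shown = defaultdict(int)
    skipped = defaultdict(int)
    for site, boosted_pos, clicked_pos in impressions:
        shown[site] += 1
        # A "skip" means the searcher clicked a result ranked below the boosted one.
        if clicked_pos is not None and clicked_pos > boosted_pos:
            skipped[site] += 1
    return {site: skipped[site] / shown[site] for site in shown}

# Made-up log: each row is one personalised search by a returning visitor.
log = [
    ("example-a.com", 1, 3),
    ("example-a.com", 2, 2),
    ("example-a.com", 1, 4),
    ("example-b.com", 2, 2),
    ("example-b.com", 1, 1),
]
rates = skip_rates(log)
```

One searcher skipping a site means nothing, but a site whose skip rate stays high across many returning visitors would stand out.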
Now you might say that the searcher just happened to land on one of your poor pages on an unrelated term and Google shouldn't have sent them there in the first place. To which Google might say - why have these poor pages in the first place? And that feeds into the whole idea that you need to ensure you don't have too many thin and low quality pages, because clearly that will not leave a long-lasting positive impression on the visitor.
It also means that some other fairly hard-to-measure things like design and ux actually do end up having an effect on search positions. Poor design = poor first impression = bad news in search.
So there you have it, a fairly simple way of measuring "quality" or more accurately, "user satisfaction".
Now if only I could think of a good way to test this theory :) I guess the only real option is to just knuckle down and bump up visitor satisfaction and see if it improves search results at the same time!
It's not just personal search.
Remember their browser phones home. So does Android, and the toolbar sends something back.
If a brand new spam site can rank for coveted search terms like insurance in a short window of time, I believe that indicates that there is still way more emphasis on links than anything else... which is why the Penguin update has been and will continue to be exploited by negative SEO and spammers.
|Martin Ice Web|
rango, there is one point that does not match your scenario. People tend to click on the first 3 results (>90% of all clicks). This would mean that the first three sites get stronger and stronger, no matter whether they are relevant for the user.
And if one user searches for red apples and one for red tomatoes, both get to the same site. One ignores it, one does not? It all depends on search intent. That is what G* tries to guess.
MC said they get better signals from the big brands. I tend to think that these are signals that are not easy/simple to get.
This signal must be very strong, because if you have it then you are a winner, and if you don't have it you are a black hat.
Right after the first Panda and Penguin these brands showed up, and they get stronger with every update.
It is amazing to see that one big brand in my niche shows up for every related query on page #1. This brand has a subdomain with the same widget but without prices, and it links to the domain. Both the domain and the subdomain are ranking very well.
I think we have to look outside search/user metrics to find this signal.
-established before ecom
|For one, bounce rates in Analytics are entirely subject to how you have Analytics set up, so these could be quite flattering even if your actual engagement isn't that good. |
Totally agree with this. Bounce rate is a very noisy metric -- and can be changed.
Bounce rate is a measure of interaction on the page. You can lower your bounce rate on a page by firing event triggers, such as when the user scrolls down to the bottom of the page. Because the event tracking is deemed another interaction, GA does not read it as a bounce anymore.
We implemented this on our sites, where event tracking fires when a user finishes reading the article and reaches the bottom of the page. From a normal bounce rate of 76%, we went down to 42% just because of the event trigger. For us, the conversion rate for users who arrive on the site and complete reading a paper was what mattered, even if there is only a single pageview. So setting up bounce rate this way was OK with us.
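A rough Python sketch of why the number moves so much. This is a simplified model of the bounce calculation, not GA's actual code: the session format and the `bounce_rate` function are assumptions, and the sample sessions are invented (they are not our real 76%/42% data).

```python
def bounce_rate(sessions, count_events=True):
    """Simplified model: a session bounces if it has at most one interaction.

    sessions: list of dicts with 'pageviews' and 'events' counts (hypothetical format).
    count_events: whether tracked events count as interactions.
    """
    bounces = 0
    for s in sessions:
        interactions = s["pageviews"] + (s["events"] if count_events else 0)
        if interactions <= 1:
            bounces += 1
    return bounces / len(sessions)

# Invented sample: two readers reached the bottom of the page and fired a scroll event.
sessions = [
    {"pageviews": 1, "events": 0},
    {"pageviews": 1, "events": 1},
    {"pageviews": 1, "events": 1},
    {"pageviews": 3, "events": 0},
]
without_events = bounce_rate(sessions, count_events=False)
with_events = bounce_rate(sessions, count_events=True)
```

Same visitors, same behaviour, and the reported bounce rate drops from 75% to 25% in this toy data purely because of how tracking is configured.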
|If a brand new spam site can rank... |
Isn't that just a temporary boost?
This discussion seems related to TrustRank, which improves on PageRank by starting with a seed set of trusted sites and then calculating a score for every other site, with a high score indicating a likelihood of being spam-free. So getting links from sites located in high trust buckets can gain you entry into those trusted buckets yourself.
But then there is also a concept called Topical TrustRank wherein the seed sets are segregated into topical buckets from which the trust rank is then calculated. This supposedly results in a substantial improvement in weeding out spam sites. What this implies for us is that obtaining inbound links from topically related sites that are in high trust sets is important.
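For anyone who wants to see the seed idea in action, here is a minimal Python sketch of TrustRank-style propagation (essentially PageRank where the "teleport" step only ever lands on the trusted seeds). It is a toy under stated assumptions: the `trust_rank` function, the damping value, and the four example sites are made up, and pages with no outbound links simply leak their trust. A topical variant would, presumably, run the same loop once per topical seed set.

```python
def trust_rank(links, seeds, damping=0.85, iterations=50):
    """Toy seed-biased PageRank.

    links: dict mapping a page to the list of pages it links to.
    seeds: set of hand-picked trusted pages.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    # Initial trust is split evenly across the seed set; everything else starts at 0.
    seed_share = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(seed_share)
    for _ in range(iterations):
        # Each round, a fixed slice of trust is re-injected at the seeds only.
        nxt = {p: (1 - damping) * seed_share[p] for p in pages}
        for page, targets in links.items():
            if not targets:
                continue
            # A page passes its trust on, divided among the pages it links to.
            share = damping * trust[page] / len(targets)
            for t in targets:
                nxt[t] += share
        trust = nxt
    return trust

# Invented link graph: a trusted pair and an isolated spam pair.
links = {
    "trusted.example": ["good.example"],
    "good.example": ["trusted.example"],
    "spam.example": ["spam2.example"],
    "spam2.example": ["spam.example"],
}
scores = trust_rank(links, seeds={"trusted.example"})
```

The spam pair link to each other all day long but never receive a link from the trusted side, so their trust stays at zero.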
But then there are algorithms that calculate statistically typical link patterns. These algorithms are meant to identify pages with abnormal sets of inbound links. So now "quality" takes on the added nuance of relating to natural freely-given citations. The goal becomes to either mimic the natural pattern of inbound links or actually build a set of natural inbound links.
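As a very rough illustration of what "statistically typical" could mean, here is a Python sketch that flags sites whose link profile sits far from the norm. Everything specific here is an assumption for the sake of example: the feature (share of inbound links with exact-match anchor text), the z-score threshold, the `flag_abnormal` function, and the data. Real systems would presumably look at many features at once.

```python
from statistics import mean, stdev

def flag_abnormal(profiles, threshold=2.0):
    """Flag sites whose value is more than `threshold` standard deviations from the mean.

    profiles: dict mapping site -> share of inbound links using exact-match
    anchor text (a hypothetical feature chosen for illustration).
    """
    values = list(profiles.values())
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []
    return sorted(s for s, v in profiles.items() if abs(v - mu) / sigma > threshold)

# Invented data: seven ordinary-looking profiles and one heavily optimised one.
profiles = {
    "a.example": 0.05, "b.example": 0.08, "c.example": 0.10, "d.example": 0.07,
    "e.example": 0.12, "f.example": 0.06, "g.example": 0.09, "h.example": 0.90,
}
flagged = flag_abnormal(profiles)
```

Only the site with 90% exact-match anchors gets flagged here, which is the sense in which mimicking the natural pattern (or actually earning natural links) becomes the goal.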
But then that kind of statistical analysis might include social signals. So "quality" might now mean a statistically typical quantity of social/sharing signals. However, won't these signals also need to be filtered?
|...and you have a pretty reliable way of judging user-satisfaction... |
User satisfaction of what? The hypothesis focuses on measuring the "quality" of the site. But user satisfaction also applies to Google's ability to understand the query. If you type Fishing Flies into the search box and Google returns websites about flying fish, is the problem with the quality of the websites or the quality of the algorithm?
After the SERPs have been delivered, user satisfaction is a measurement of the quality of Google's algorithm. User satisfaction is not necessarily a measurement of the "quality" of a website. In the SERPs, I believe it's more a judgement on Google's ability.
There's a world of classification, statistical analysis of inbound links and outbound links, identifying non-spam sites and sites likely to be non-spam; by the time you get to the SERPs, it's less a matter of quality of the site and more about the usefulness of the results.
I'm not saying that spam sites don't slip through. They do. I'm simply suggesting that you consider the possibility that "user satisfaction" with the SERPs applies to a broader set of circumstances than just the "quality" of a site.
|consider the possibility that "user satisfaction" with the SERPs applies to a broader set of circumstances than just the "quality" of a site. |
Good point - Still the question remains; how is that broader user experience determined and then injected back in as a ranking factor; man or machine?
Despite the growing popularity of the "user metrics" movement (the theory that Google keeps track of how many times someone hits the back button after clicking on my site), I still see rows and rows of Googlers sitting in front of machines running queries and grading what is returned.
|I still see rows and rows of Googlers sitting in front of machines running queries and grading what is returned. |
That could explain why some searches yield a mixture of pages from big-name sites (Wikipedia, Amazon, TripAdvisor, etc.) and an almost random sampling of smaller sites on page 1. The biggies get evaluated (and fall into the testers' comfort zones), while only a minority of smaller sites (even the good ones) are lucky enough to get plucked from the virtual hat for grading. (Mind you, this wouldn't explain why blatant spam sites rank high for some searches.)
Still, let's be realistic: The "rows and rows of Googlers sitting in front of machines running queries and grading what is returned" approach isn't scalable. We know that, and--more important--Google knows that.
Google has always said that human evaluators are used for QC checks of the algorithm, not for deciding or directly influencing what ranks and what doesn't. That assertion is a lot more reasonable (and, to me, a lot more convincing) than the notion that Google's data centers are filled with little elves whose collective judgment determines whether an Amazon page outranks John Doe's page in a search on "buy green widgets."
|notion that Google's data centers are filled with little elves whose collective judgment determines whether an Amazon page outranks John Doe's page in a search on "buy green widgets." |
Thank you for the much needed laugh!
Besides, we all know the Google elves spend all their time watching us and reporting to the NSA whether we should get presents or coal in our stockings, right?
|...how is that broader user experience determined and then injected back in as a ranking factor; |
The discussion is about how Google measures quality. To make an analogy, an actor is put through rigorous auditions before being hired. Then they are put through rehearsals prior to making it to the stage. The substance of the work happens BEFORE the actor reaches the stage. If you want to have a meaningful discussion about quality, then you have to go way back before the actor reaches the stage.
The title of this discussion is about quality. Much of the action happens BEFORE the SERPs are displayed. If you want to discuss what happens AFTER the SERPs are displayed, the injection back into the algo is likely to be as feedback about the parts that happen before the SERPs get displayed, but not necessarily as feedback about the quality of the site itself.
1. The methods designed to select for trusted non-spam sites failed.
2. The methods designed to weed out untrusted spam sites failed.
3. A query was improperly answered.
Those are all issues with the algo that happened before a user backs out of a site because of dissatisfaction. So if that gets injected back into the algo, it's going to get revised either at the inclusion point, at the point of exclusion, at the user intent level, or at many points together in order to scale the solution.
Do SERPs bounce around? Yes. But I don't think the algo is targeting one site at a time like the gods on Olympus. That's too simple.
[edited by: martinibuster at 11:34 pm (utc) on Jul 11, 2013]
What Google can measure is how much time people spend on your website, what they do there, and whether or not they return to the SERPs. From that we can assume the quality of the website.
I visited a site earlier where I looked at just one page and then for only about 15 seconds.
It told me exactly what I wanted to know.
What does Google infer about "quality" from that 15-second one-page visit?
This is why they launched the Consumer Surveys for Publishers
Google observes the metrics of your best pages when algorithmically adjusting your site.
|If you type Fishing Flies into the search box and Google returns websites about flying fish, is the problem with the quality of the websites or the quality of the algorithm? |
Yes, I do agree this is a problem. And it possibly also is why so many Panda recovery strategies are focussed on weeding out non-relevant pages and ensuring only the good content is indexed (and for the correct terms). Google has undoubtedly gone from a "we'll sort it all out" approach to a "let the webmaster do the work" approach. And this actually ties in just fine with that for them.
|I visited a site earlier where I looked at just one page and then for only about 15 seconds. It told me exactly what I wanted to know. What does Google infer about "quality" from that 15-second one-page visit? |
I don't think you understood my points at all. My theory accounts for this, because in your situation Google would not infer anything about quality. Yet. But when that domain is boosted up the results later - maybe up to 2nd spot - and you decide to skip over it and click on result no.3. Well then Google can infer something. Not just on your actions of course. But on the actions of thousands of people it adds up.
I tend not to think that Google makes any decision based on bounces alone, because it's just far too messy a thing to use.
|rango, there is one point that does not match your scenario. People tend to click on the first 3 results (>90% of all clicks). This would mean that the first three sites get stronger and stronger, no matter whether they are relevant for the user. |
Actually, this matches fine with my theory. Say you have 3 results displayed to the user and result no.1 is a personalised result. If the viewer skips that site and goes to result no.2 then this is a solid indication (particularly if repeated more than once) that result no.1 is not worthy and the user is choosing not to visit it because they don't like the site. So I don't really see the problem. This is really only affecting those sites that are boosted because of personalised queries - not the ones that are in 1,2,3 with no boosting going on.
|What does Google infer about "quality" from that 15-second one-page visit? |
Google infers that its Knowledge Thingummy* needs more work, because anything that can be answered in a fifteen-second visit can be answered by the search engine itself without ever leaving the SERP.
* Uhm. Senior Moment here.
|But when that domain is boosted up the results later - maybe up to 2nd spot - and you decide to skip over it and click on result no.3. |
Clickthroughs are influenced by titles, metadescription, and even a lack of metadescription. If a result gets skipped the problem might be with those factors or it could simply be that position 1 and 2 have more compelling titles.
I use certain strategies to increase my clickthrough rates in the SERPs where I'm number three. In those cases the clickthroughs are partially because my titles and descriptions are better than my competitors'.
I don't deny that testing goes on. But I don't think it's measuring the quality of the site because there are too many other reasons for the results. Too many to draw accurate conclusions.
Here's another way to look at it (and I agree with Martinibuster). When a product flops, it could be:
--It wasn't good, and the few people who bought it said so to their friends
--It wasn't packaged well
--It wasn't placed in the right stores
--It wasn't displayed well in the stores
--It was released in the wrong season, people just weren't ready for it
--A competing product came out at the same time and blew it away at
And so on. There are tons of possibilities, and combinations of possibilities, that could have led to loads of people not even giving the product a chance. The iPhone could have flopped if it had been developed by a company that didn't know how to market it and didn't have Apple's reputation, and whether you like it or not, it is a quality product that has thrilled a lot of people.
So no, I don't think Google even imagines they can determine the "quality" of your website. Their goal is to find the best result for a given user on a given query. Think of Google as a sort of extended personal shopper. A personal shopper goes and buys you several of a thing (say, an outfit) and presents them to you at home. You pick the ones you want to keep and they return the rest. If you're logged in, presumably Google is recording your responses to the various "outfits" to learn more micro stuff about your preferences. When they present you with an outfit that's your favorite color, and it fits, but you still reject it for another, they have to guess: was it the belt? The stripes? The shape? And they can only REALLY even begin to do this if you're logged in and letting them record everything.