The Problem with Independent Authority Metrics and Google SEO

         

martinibuster

4:15 pm on Oct 30, 2015 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



This article describes a concern with relying on metrics that purport to measure authority. The industry increasingly relies on them, but there may be compelling reasons not to. Here are some thoughts for your consideration.

Correlation is not a ranking factor
In a nutshell, third party Authority metrics are created with data that correlates with sites that tend to rank well. However, these correlated signals aren't necessarily the factors that made those sites rank well.

For example, let's examine a hypothetical situation of sites that tend to rank well and also have large numbers of links and Facebook likes. That's a correlation. The Facebook likes didn't cause a site to rank well. The Facebook likes and the large number of links aren't necessarily related to each other. It may simply be that a commercial site that has a large number of links and is SEO optimized tends to rank well and often also has a lot of FB likes (because it is SEO optimized).
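To make the hypothetical concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration: the function names, the noise levels, and the toy model itself, in which a hidden "SEO effort" variable drives both links and likes while the ranking score depends on links only. It has nothing to do with how any real search engine scores pages.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_sites=5000, seed=0):
    """Toy model (assumed): SEO effort is a hidden common cause of links and likes."""
    rng = random.Random(seed)
    likes, scores = [], []
    for _ in range(n_sites):
        effort = rng.gauss(0, 1)               # hidden common cause
        links = effort + rng.gauss(0, 0.5)     # effort earns links
        like = effort + rng.gauss(0, 0.5)      # effort also earns likes
        score = links + rng.gauss(0, 0.5)      # likes are NOT an input to the score
        likes.append(like)
        scores.append(score)

    # "Intervention": hand out the same like counts at random, breaking the
    # tie to effort. If likes caused rankings, a correlation would survive.
    randomized_likes = likes[:]
    rng.shuffle(randomized_likes)

    return pearson(likes, scores), pearson(randomized_likes, scores)


observed, randomized = simulate()
print(f"likes vs. score, as observed:      {observed:.2f}")
print(f"likes vs. score, likes randomized: {randomized:.2f}")
```

In this toy model the observed correlation is strong even though likes never enter the score, and it collapses toward zero once likes are assigned at random. A correlation study sees only the first number.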

Can you see how correlations can be a rabbit hole of speculation? What's particularly troubling is that for every article published about correlated ranking symptoms there are virtually no citations of scientific research or filed patents to show that the claims are potentially true. One can form opinions and conspiracy theories about a wide range of potential ranking factors, but if you can't cite any research to show that this has even been researched, much less implemented at Google, then it's just air coming out of someone's mouth.

Let's return to the hypothetical situation. If we were to deduce that the high rankings were at least partially due to the high number of links, then we would have a strong chance of being correct. The reason is that there is substantial scientific documentation that search algorithms depend at least partially on inbound links for ranking websites. That's more than just a simple correlation because there is a scientific citation to back up the assertion.

Unfortunately, many of the ranking factor correlations one reads about do not feature such citations.

Questionable metrics used to calculate Authority Metrics
Some of the factors that go into calculating Authority Metrics are questionable. For example, some use a version of Seed Set Based TrustRank, but that methodology has been scientifically proven to be on shaky ground. [thesempost.com] This methodology was shown to be flawed many years ago, shortly after the Seed Set Based TrustRank method was proposed in a scientific research paper. Why do companies continue to use a method that was shown to be flawed? Are you wise to trust a metric that uses a compromised methodology?

Correlation does not imply causation
An issue of concern is that many popular metrics are measuring correlations. This is a profound flaw. Correlations are a notoriously poor statistical signal for understanding an event. A better method would be to simply understand the science of Information Retrieval. It's taught at universities. There is no need to wander in the dark like the folk tale of the blind men touching an elephant, theorizing that the elephant is like a tree or a snake, depending on what part they are touching. Third party metrics that measure correlations are similar.

Matt Cutts, senior Google engineer [searchengineland.com]
Moz published a story today named Amazing Correlation Between Google +1s and Higher Search Rankings, to which Matt Cutts responded in a Hacker News thread, saying, “correlation != causation.”


Gary Illyes via Twitter [twitter.com]
Gary's being humorous here. However it can be seen as a reflection of his frustration with SEO's tendency to seize on false correlations.
@methode I went to NYC, weather turned bad. Came to Las Vegas, started to rain and it's cold. Causation or correlation?


Wikipedia [en.wikipedia.org]
Correlation does not imply causation is a phrase used in statistics to emphasize that a correlation between two variables does not necessarily imply that one causes the other... For example, in a widely studied case, numerous epidemiological studies showed that women taking combined hormone replacement therapy (HRT) also had a lower-than-average incidence of coronary heart disease (CHD), leading doctors to propose that HRT was protective against CHD. But randomized controlled trials showed that HRT caused a small but statistically significant increase in risk of CHD. Re-analysis of the data from the epidemiological studies showed that women undertaking HRT were more likely to be from higher socio-economic groups (ABC1), with better-than-average diet and exercise regimens. The use of HRT and decreased incidence of coronary heart disease were coincident effects of a common cause (i.e. the benefits associated with a higher socioeconomic status), rather than a direct cause and effect, as had been supposed.

martinibuster

5:48 am on Oct 31, 2015 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Just to clarify, there are many high quality tools that provide data that are important for understanding the quality of a website. Majestic's Topical Trust Flow [blog.majestic.com] is a fantastic picture of the relevance quality of the links pointing to a website. It can quickly give you a good idea of how topically relevant a site is.

Another tool that I am happy with is LinkResearchTools [linkresearchtools.com]. I have used the full suite of tools and in my opinion it's one of the most comprehensive I've ever used. Even the basic version offers useful data about inbound links, including meaningful statistics on linking patterns, both spammy and natural.

I like that these tools provide useful information about topical relevance and inbound linking patterns, data that can be used to help make an informed decision.

Maybe it's just me, but I would rather see the data and reach my own conclusion than rely on hidden metrics and a secret methodology, some of which might be based on flawed correlations. I just need the data. I don't need for anyone to make the decision for me. Do you?

Walt Hartwell

6:35 am on Oct 31, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



I don't know that I've ever seen a bot from linkresearchtools.com

So their data interpretations, topical relevance and inbound linking patterns come from where?

Wilburforce

10:21 am on Oct 31, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



hidden metrics and a secret methodology, some of which might be based on flawed correlations


That looks like a pretty good summary of the way Google works these days.

Analytical tools address questions we cannot answer directly.

If you want to know how fast somebody can run, the best way to find out is to get them to run as fast as they can and measure it.

However, it is often difficult or impossible to use direct methods like this to answer questions, and where only indirect data - whatever the tools measure - are available, statistical analysis can be used to quantify the inaccuracy of the data as a predictor of the thing you want to know. Statistics can be summarised as an aid to decision making in uncertainty, or as quantifying what we don't know. If all you know is shoe size, you don't know who the fastest swimmer is, but it is more likely - not certain - to be someone with a larger shoe size.

In many ways, whether the relationship in any particular correlation is causal or not is something of a red herring: the correlation can still be useful as a predictor of known inaccuracy when no other information is available.
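The shoe size example can be sketched as a small simulation. The model is an assumption for illustration only: a hidden variable (called height here) feeds both shoe size and swimming speed, and the function name and noise levels are made up. Given two swimmers, the guess is always that the one with the bigger shoes is faster.

```python
import random


def shoe_size_guess_accuracy(n_pairs=10000, seed=1):
    """Share of pairs where 'bigger shoes' correctly picks the faster swimmer."""
    rng = random.Random(seed)

    def person():
        height = rng.gauss(0, 1)            # hidden common cause (assumed)
        shoe = height + rng.gauss(0, 1)     # shoe size tracks height, noisily
        speed = height + rng.gauss(0, 1)    # so does speed; shoe size is not an input
        return shoe, speed

    correct = 0
    for _ in range(n_pairs):
        shoe_a, speed_a = person()
        shoe_b, speed_b = person()
        guess_a_faster = shoe_a > shoe_b    # the only information we allow ourselves
        a_is_faster = speed_a > speed_b
        correct += guess_a_faster == a_is_faster
    return correct / n_pairs


accuracy = shoe_size_guess_accuracy()
print(f"bigger-shoe guess is right in {accuracy:.0%} of pairs")
```

Under these assumptions the guess beats a coin flip but is still wrong in a large share of pairs, which is the "known inaccuracy" in question: useful when nothing better is available, and no substitute for timing the swimmers.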

A common mistake, however, is not getting the statistics or causality wrong, but the language: researchers often take a word in common use as an operational definition, and then confuse their operational definition with the common understanding of the word. Authority tools DO NOT MEASURE AUTHORITY, either as Google uses the word, or as anyone else does. They measure a set of criteria and arrive at a combined score that they give the name "authority".

How useful they are is obviously debatable, but most will do better than chance as predictors, so are useful if chance is the only predictor available. Most of us here, I hope, can do a little better than chance, so that limits their usefulness.

For myself, I am quite happy to see how my site compares with my sector's competitors on any measure (although less enthusiastic if I have to pay for it), but very reluctant to make sweeping changes to my own site - or go on a wild backlink-gathering mission - just because one site or another fares better according to any single tool. It doesn't matter how much "authority" my site has if it is not on page 1. If it isn't, all any tool can tell me is what might be wrong with it.

aristotle

12:57 pm on Oct 31, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Just insert Google's Author Tag into the head section of all your pages. This will give your site instant authority. :)

FranticFish

1:19 pm on Oct 31, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



So their data interpretations, topical relevance and inbound linking patterns come from where?

@Walt Hartwell - I asked them that myself when I was shopping around for a link data provider. They wouldn't tell me.

Walt Hartwell

5:54 am on Nov 2, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



Years ago I was analyzing real estate data and attempting to dial in the root sources of the information. All very secretive and nobody was going to divulge what their data source was. Release dates were what eventually determined where the original source was.

I was hoping for a starting point.

Apologies to martinibuster for an unintended thread hijack.

martinibuster

12:43 pm on Nov 2, 2015 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



If you want to know their data sources, just Google:
linkresearchtools data sources.
Everything you need to know is there.

There are no bots. They have something like 24 backlink sources. They then apply their software to slice and dice the backlink information any way you need it, by keyword, TLD, etc. They also present the information with useful data points. And that, for me, is useful. I suspect not everyone understands how useful some of the data LRT provides really is. Even in their entry level plan there's enough data in there to form useful observations about qualities of a site. It's all about the data.

Authority Metrics are another matter.

Nutterum

1:47 pm on Nov 2, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



For something _that_ expensive, I don't see much value. Ahrefs does pretty much the same for less.

martinibuster

2:20 pm on Nov 2, 2015 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Well yeah. It's all about the data. I'm not promoting LRT or Majestic as the only data providers worth using. I'm pointing out that it's about the data, not necessarily a third party authority metric. Fairly often it's happened that someone casually referenced DA, PageRank, or Alexa rank during the course of a conversation about strategy, and when I call attention to the shortcomings of those metrics I always get the "yeah, I understand, but it's convenient" apology. But it's not a matter of convenience if you're dragging those metrics into a conversation about strategy and making strategic business decisions based on those metrics.

PageRank, which originated out of Google itself, was the ultimate metric. But even that metric was flawed and absolutely 100% unreliable for basing strategic decisions on. It was purposely created by Google to be unreliable for SEO purposes. Yet people kept basing important business decisions off that metric anyway. Now we have third party authority metrics, some of which apparently aren't even based on solid information retrieval science but rather what people on blogs believe.

I've read the explanations of what the metrics are based on and there are no scientific citations. At best there are references to Bill Slawski's blog, which is in my opinion the most authoritative source of information about search engine patents. But if someone is providing authority metrics, shouldn't that be based on actual science as taught in universities today? So really, you want to put your faith in those metrics?

I put my faith in data.

I appreciate how Wilburforce stated it (thanks for taking the time to write that post! :) )

Authority tools DO NOT MEASURE AUTHORITY, either as Google uses the word, or as anyone else does. They measure a set of criteria and arrive at a combined score that they give the name "authority".

How useful they are is obviously debatable, but most will do better than chance as predictors, so are useful if chance is the only predictor available. Most of us here, I hope, can do a little better than chance, so that limits their usefulness.

FranticFish

2:32 pm on Nov 2, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I'm pointing out that it's about the data, not necessarily a third party authority metric

For me too, which is why the source of that data was and is important to me. LRT currently divulge 5 sources on their 'about' page. Back in 2013, when I was looking and enquiring, they didn't mention one.

Nutterum

9:39 am on Nov 3, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



Yeah but the entire world is based on un-authoritative authority aggregators. I am coming from the financial world where every god-damn-tool for predictions and modeling was based off the main three models invented in the late 70s. Even in the entertainment industry you have "views" that are authority generated. The TV equivalent of kissmetrics and mixpanel. It's all about the "relevant sample size" in today's business world. And digital marketing is no different.

Are they useful as predictors and influencers? - YES.
Should we use such authority tools with a grain of salt? - YES.
Are they the most correct way to get a broad view of the vertical we are in? - In the (not provided) era, do we have any other choice?