


Google's Knowledge Graph: Knowledge-Based Trust: Estimating the Trustworthiness of Web Sources

     
3:25 pm on Mar 3, 2015 (gmt 0)

Administrator from GB 

WebmasterWorld Administrator engine is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month Best Post Of The Month

joined:May 9, 2000
posts:26109
votes: 943


When Google's Knowledge Graph first appeared in the SERPs, many webmasters and marketers cried foul. In some respects they were justified: loss of SERP real estate, and greater emphasis on what Google assesses as "the answer." It wasn't to everyone's liking, but it's the future of knowledge search, so we'd better get used to it.

We're constantly having discussions, both positive and negative, about link development, and about how Google uses links to establish credibility.

While many marketers stuck to the premise of building links in their thousands, it's become even clearer that link quality far outweighs link volume, and savvy marketers have been on this route for many years. Webmasters are also helping Google raise its quality threshold through the disavow tool: if a webmaster agrees to disavow a link, it confirms Google's assessment. In rare instances, I've disavowed links that were not highlighted by Google but, upon assessment, were questionable.

The possibility of the SERPs shifting from being significantly link-based to placing greater emphasis on trustworthiness might be closer than some think.

Here's a paper on "Knowledge-Based Trust: Estimating the Trustworthiness of Web Sources"
The quality of web sources has been traditionally evaluated using exogenous signals such as the hyperlink structure of the graph. We propose a new approach that relies on endogenous signals, namely, the correctness of factual information provided by the source. A source that has few false facts is considered to be trustworthy. The facts are automatically extracted from each source by information extraction methods commonly used to construct knowledge bases. We propose a way to distinguish errors made in the extraction process from factual errors in the web source per se, by using joint inference in a novel multi-layer probabilistic model. We call the trustworthiness score we computed Knowledge-Based Trust (KBT). On synthetic data, we show that our method can reliably compute the true trustworthiness levels of the sources. We then apply it to a database of 2.8B facts extracted from the web, and thereby estimate the trustworthiness of 119M webpages. Manual evaluation of a subset of the results confirms the effectiveness of the method.
Knowledge-Based Trust: Estimating the Trustworthiness of Web Sources [arxiv.org]
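In rough terms, the single-layer version of the model alternates between two estimates: how likely each extracted value is to be true given the current source accuracies, and how accurate each source is given the current value probabilities. Below is a minimal toy sketch of that loop in Python. It's my own simplification for illustration, not Google's code, and it ignores the extraction layer, priors, and scaling tricks the paper adds on top:

```python
from collections import defaultdict

def knowledge_based_trust(claims, iterations=20):
    """Toy single-layer KBT-style estimate (illustrative only).

    claims: (source, data_item, value) triples, e.g.
            ("siteA.com", ("widget-x", "weight_kg"), "1.2").
    Returns a dict of source -> accuracy in (0, 1): the average
    estimated probability that the values the source provides are true.
    """
    accuracy = defaultdict(lambda: 0.8)               # initial trust for every source
    by_item = defaultdict(lambda: defaultdict(list))  # data_item -> value -> sources
    for source, item, value in claims:
        by_item[item][value].append(source)

    for _ in range(iterations):
        # Step 1: probability each value is the true one for its data item,
        # via a vote weighted by the current source accuracies.
        value_prob = {}
        for item, values in by_item.items():
            scores = {v: sum(accuracy[s] for s in srcs) for v, srcs in values.items()}
            total = sum(scores.values()) or 1.0
            for v, score in scores.items():
                value_prob[(item, v)] = score / total

        # Step 2: a source's accuracy is the average probability that
        # its provided values are true.
        provided = defaultdict(list)
        for source, item, value in claims:
            provided[source].append(value_prob[(item, value)])
        accuracy = defaultdict(lambda: 0.8,
                               {s: sum(p) / len(p) for s, p in provided.items()})

    return dict(accuracy)

# Tiny example: the source asserting the minority value ends up least trusted.
claims = [
    ("siteA.com", ("widget-x", "weight_kg"), "1.2"),
    ("siteB.com", ("widget-x", "weight_kg"), "1.2"),
    ("spammy.example", ("widget-x", "weight_kg"), "9.9"),
]
print(knowledge_based_trust(claims))
```

The point to notice is that nothing in this loop looks at links at all; trust comes purely from agreement on facts, which is exactly the endogenous-versus-exogenous distinction the abstract draws.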


Here's the full PDF: Knowledge-Based Trust: Estimating the Trustworthiness of Web Sources [arxiv.org]. It really is worth reading and understanding, and I'm happy to discuss this topic.

I doubt links will ever disappear entirely from Google's algorithm, but I'm pretty certain they'll receive less and less emphasis as the Knowledge Graph gets better and moves into other sectors of the SERPs.

There's a great discussion on "How would you know if links were no longer important to Google? [webmasterworld.com]"
By establishing the efficacy and value of links, and their importance to Google, we're starting to be able to identify the impact of any Knowledge-Graph based trust.

Many have talked about TrustRank and Knowledge-Based Trust for some time, but surely this, in whatever form you wish to describe it, is a way to measure and ultimately assess a site, and then a page.

We've seen Google's Panda filter hit "thin" sites. How about taking the results of that filter, improving and honing it, and creating a database of the top authorities, then using that to assess whether a link is credible or trusted? It's not unknown for search engines to rely on a trusted source of some sort: Looksmart, Inktomi, and even DMOZ played a part in adding some form of trust to the various search engine databases. The idea was that if you got into that trusted database you weren't a crash-and-burn site. Those systems failed, for all kinds of different reasons, and not entirely as a result of the Internet and web going through their growing pains. The web is still going through those growing pains, and I'm sure there will be continued and ongoing experimentation to provide better quality search and to deliver what the user needs. Remember, Google wants what the user needs, not the webmaster ranking number 1 in the SERPs.
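As an aside, the "database of top authorities feeding link credibility" idea described above is close to the published TrustRank algorithm (Gyöngyi, Garcia-Molina & Pedersen, 2004): hand-review a small seed set of trusted pages, then propagate trust along outlinks with a damping factor. Here's a minimal sketch of that propagation, assuming a simple outlink map; it illustrates the published paper, not any Google implementation:

```python
def trustrank(outlinks, seed_pages, damping=0.85, iterations=30):
    """Seed-based trust propagation in the spirit of the published
    TrustRank paper (Gyongyi et al., 2004) -- an illustrative sketch,
    not Google's implementation.

    outlinks:   dict of page -> list of pages it links to
    seed_pages: hand-reviewed trusted pages (e.g. vetted .edu pages)
    Returns a trust score per page; a link from a high-trust page could
    then be weighted as more credible than one from an untrusted page.
    """
    pages = set(outlinks) | {p for targets in outlinks.values() for p in targets}
    # All trust originates at the seed set.
    seed = {p: (1.0 / len(seed_pages) if p in seed_pages else 0.0) for p in pages}
    trust = dict(seed)

    for _ in range(iterations):
        new_trust = {p: (1.0 - damping) * seed[p] for p in pages}
        for page, targets in outlinks.items():
            if not targets:
                continue                                  # dangling pages simply leak trust here
            share = damping * trust[page] / len(targets)  # trust splits across outlinks
            for target in targets:
                new_trust[target] += share
        trust = new_trust
    return trust

# Tiny example: trust decays the further a page sits from the vetted seed.
links = {
    "university.edu/overview": ["smallsite.com/guide"],
    "smallsite.com/guide": ["unknownsite.biz/page"],
    "unknownsite.biz/page": [],
}
print(trustrank(links, seed_pages={"university.edu/overview"}))
```

The page and site names above are made up for the example; the design choice that matters is that trust only ever enters the system through the hand-picked seed set, which is why the quality of that seed database is the whole game.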

We're also about to see Google initiate its "Smartphone" update, which rolls out on April 21 [webmasterworld.com]. It's being tested right now, with labels being applied to smartphone SERPs. If your site appears in the smartphone SERPs with the wrong kind of label, you're going to feel a much greater impact after that date. Now is the time to be working on fixing it.

We're just reaching a new phase in search, imho: new ranking signals such as knowledge-based trust, the expansion of Google's Knowledge Graph into other sectors, and the diversification of desktop and smartphone SERPs, which until now have been very similar. And don't forget local: local is going to continue to become more important, imho.
4:01 pm on Mar 3, 2015 (gmt 0)

Preferred Member

5+ Year Member Top Contributors Of The Month

joined:May 24, 2012
posts:648
votes: 2


If Google can't reliably identify trustworthy links, why would we think they can reliably identify trustworthy content in some other way?

This also has a potential downside in further pushing the envelope on the "Filter Bubble". There's a great TED talk on that by Eli Pariser.
4:14 pm on Mar 3, 2015 (gmt 0)

Administrator from GB 

WebmasterWorld Administrator engine is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month Best Post Of The Month

joined:May 9, 2000
posts:26109
votes: 943


This one, rish3? [ted.com...]

I remember watching that video.
4:29 pm on Mar 3, 2015 (gmt 0)

Senior Member

WebmasterWorld Senior Member editorialguy is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:June 28, 2013
posts:3421
votes: 747


If Google can't reliably identify trustworthy links, why would we think they can reliably identify trustworthy content in some other way?


Google is pretty good at identifying trustworthy links, I suspect. The greater challenge is in identifying untrustworthy links, but they've been getting better at that, too.

More to the point, why would anyone spend time or money on trying to fool search engines into thinking that bad content is good? Wouldn't it be more cost-effective (and more effective, period) to simply produce content that passes the "trustworthiness" test?
4:30 pm on Mar 3, 2015 (gmt 0)

Preferred Member

5+ Year Member Top Contributors Of The Month

joined:May 24, 2012
posts:648
votes: 2


@engine This one, rish3?


Yes, that's the one. He goes pretty deep on the personalization aspect creating "filter bubbles". But, at a higher level, his point that algorithmic filtering has potentially massive issues applies. In this case, it would be sort of the opposite direction...filtering out potentially good information from everyone.

There's more than one problem with this "verified facts rank better" approach, but the most obvious is that it could suppress new information that might refute the current consensus of an existing "fact".

I'm sure Google has thought about this, and has some ideas on working around it, but I am skeptical. They've spent a LOT of time on figuring out how to tell what a trustworthy link is, but they are still getting gamed, in a big way...right now.
4:33 pm on Mar 3, 2015 (gmt 0)

Preferred Member

5+ Year Member Top Contributors Of The Month

joined:May 24, 2012
posts:648
votes: 2


More to the point, why would anyone spend time or money on trying to fool search engines into thinking that bad content is good?


That's not the risk. The risk is that good information gets a false-positive "not a fact" label, and is suppressed, at a grand scale.
4:33 pm on Mar 3, 2015 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member brotherhood_of_lan is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 30, 2002
posts:4998
votes: 47


I haven't read all the details (I came across it on New Scientist a few days ago). My first impression is that it shows their text parsing algos must be advanced enough to differentiate statements of fact from opinion. Bookmarked it for reading later.

The filter bubble idea is interesting. Seems like an evolving social graph alongside the knowledge graph would help us find our frame of reference. It'd be good to know if I've been thinking too much like person X lately and allow me to choose to shift my frame of reference to person Y.
4:42 pm on Mar 3, 2015 (gmt 0)

Preferred Member

5+ Year Member Top Contributors Of The Month

joined:May 24, 2012
posts:648
votes: 2


I'll just leave this here...a good example of the current state of the "Knowledge Graph", as well as G's ideas of the best sites to have in the #1 and #2 organic spots.

This is *after* they've hired a boatload of medical professionals to clean up the health SERPs and add better info.

[imgur.com...]

Are there examples where they have done a good job? Sure. But, if this is your starting point, it seems a bit early to put your trust into some machine learning fact finder.
4:55 pm on Mar 3, 2015 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Sept 7, 2006
posts: 1102
votes: 122


1. How will it deal with a page that discusses incorrect explanations of a phenomenon as part of a correct and fully-informed overview?

2. What determines the definitive version of accuracy?

3. How does it evaluate irony?
6:39 pm on Mar 3, 2015 (gmt 0)

Preferred Member

10+ Year Member Top Contributors Of The Month

joined:June 19, 2005
posts: 369
votes: 18


Yeah, I see there are far too many contextual problems for this to be of use any time soon.
7:15 pm on Mar 3, 2015 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month

joined:Sept 16, 2009
posts:1085
votes: 80


I read an interesting study recently (maybe here) about what constitutes a 'good' article headline in terms of SEO and CTR. The consensus was KISS: no 'cliff-hangers'; just sum up your article. Don't ask a question. Don't play devil's advocate.

That would seem to be appropriate here. Most people don't get your pun or non-sequitur, and Google probably won't either.

How does it evaluate irony?

How do idiots? They don't. You have to pitch to the lowest common denominator.
7:50 pm on Mar 3, 2015 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Sept 7, 2006
posts: 1102
votes: 122


Don't play devil's advocate.


I'm not suggesting playing devil's advocate, but some explanations are incomplete without dealing with popular misconceptions.

I think the article itself betrays the belief that statistical models arrive at the truth, when in fact all they do is quantify our uncertainty. Look at the various headings and sub-headings: "Estimating the truth using a single-layer model", "Estimating the latent variables", "Estimating extraction correctness"; while under "Source Quality" we find: "we estimate the accuracy of a source by computing the average probability of its provided values being true" (not, we should note, by the fact that it is definitively correct or an international standard).
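For reference, the estimate being quoted can be written out explicitly. This is my simplified reading of the single-layer formulation, with my own notation rather than the paper's exact symbols:

```latex
% A source w provides a set of values V(w); each value v has an estimated
% probability P(v is true), inferred jointly across all sources.
\[
  A(w) \;=\; \frac{1}{|V(w)|} \sum_{v \in V(w)} P(v \text{ is true})
\]
% i.e. the source's accuracy is the average probability that its provided
% values are true -- a quantified estimate of uncertainty, as argued above,
% not a definitive verdict on correctness.
```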

It is all about estimating and computing probability, but if aeroplanes were 95% likely not to crash, none of us would get on one.

Don't expect results to improve or mankind's knowledge to advance if this takes off.
8:44 pm on Mar 3, 2015 (gmt 0)

New User

joined:Feb 15, 2015
posts:22
votes: 0


In case you didn't get what this means, I will impart a direct explanation:

Groupthink.

'Trustworthy' means the establishment,

and 'commonly accepted' means groupthink.

Indeed, that would suppress any contrarian, uncomfortable truth.

Let me give a blunt example:

If such a method had been implemented circa 2003, all Google searches would have brought up resources yelling that Iraq had weapons of mass destruction.

----------------------

Groupthink and crowdsourced moderation concepts work very well to an extent, but after a certain point they become an enforcement of groupthink.

As can be seen in online communities like Reddit and Slashdot, dissident or contrarian voices are eventually suppressed or buried.
9:38 pm on Mar 3, 2015 (gmt 0)

Senior Member from FR 

WebmasterWorld Senior Member leosghost is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Feb 15, 2004
posts:7139
votes: 412


For "Dissent" there is always "adwords"..
2:49 am on Mar 4, 2015 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member Top Contributors Of The Month

joined:Nov 2, 2014
posts:683
votes: 344


the expansion of Google's Knowledge Graph into other sectors

Those who think products will be spared from Google assimilation are not looking far enough into the future and planning for the worst. There are many "factual" aspects to a product (its weight, color, dimensions, etc.). Since Google knows no boundaries, we would do best to patent anything we create that is patentable, or claim copyright where we can. Since Google is so deeply embedded in Washington politics, and has a number of previous employees in high-level government posts, I don't see any legislation, or even a desire by politicians and regulators, to maintain a fair marketplace. We must take this action upon ourselves by protecting what it is that we create.
2:57 am on Mar 4, 2015 (gmt 0)

Senior Member

WebmasterWorld Senior Member editorialguy is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:June 28, 2013
posts:3421
votes: 747


Two things to keep in mind:

1) It's just a paper.

2) Even if the concepts described in the paper were applied to search, there's no reason to assume they'd be applied to all searches.
11:50 am on Mar 4, 2015 (gmt 0)

Administrator from GB 

WebmasterWorld Administrator engine is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month Best Post Of The Month

joined:May 9, 2000
posts:26109
votes: 943


It may well just be a paper, but it's a proven concept. I see no reason why proven concepts cannot be tested and rolled out.
12:28 pm on Mar 4, 2015 (gmt 0)

Preferred Member

Top Contributors Of The Month

joined:Sept 12, 2014
posts:384
votes: 68


What happens if the trustworthiness filter reveals the untrustworthiness of the big brands? Multiple research papers have shown that the brands in my vertical have 35-40% wrong info.
4:17 pm on Mar 4, 2015 (gmt 0)

Administrator from GB 

WebmasterWorld Administrator engine is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month Best Post Of The Month

joined:May 9, 2000
posts:26109
votes: 943


@toidi That's the problem with the Internet in general: it's too easy to spread misinformation.
The Borg Misinformation Collective [webmasterworld.com]

The accuracy of the facts on a web page can only be proven through expertise and knowledge of the true facts. Defining the true facts is the key, and that does not necessarily include a brand as the source.

With the correct checks and balances, the true facts can be found. It comes back to that point about trust, and the suggestion that TrustRank might just be the topic we should focus on.

None of this eliminates the other factors in a web page's ranking; it just adds another factor.
8:43 pm on Mar 4, 2015 (gmt 0)

New User

joined:Feb 15, 2015
posts:22
votes: 0


".......... can only be proven through expertise and knowledge of the true facts........."

That is how a lot of information was kept away from the public by the establishment, ranging from the truth about the Vietnam War to Agent Orange, and from Operation Mockingbird to the risks of smoking.

Once you give the custodianship of truth to 'experts', private interests and the establishment immediately start 'shaping' public opinion.
11:16 am on Mar 5, 2015 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2004
posts:1987
votes: 73


@Engine
With all due respect, regarding: "Remember, Google wants what the user needs, not the webmaster ranking number 1 in the SERPs."

My opinion is that that would be incorrect, Sir. That is plain Kool-Aid.

There are a lot of interesting theories posted here and everywhere on calculating the user's intent when they search. None of that scales up with what Google wants to earn, or does earn, at this point. It does not make sense for this commercial enterprise to give actual facts when providing/displaying fractal (https://www.google.com/search?q=fractal) information makes so much profit.

I respect that scientist who is working at Google, and that one, and the other one too. But at the same time, realize that the knowledge (call it what it is) of the masses has long been shaped by whoever had the bigger stick, the bigger house of worship, or the bigger pocket. Now Google (just because we are conversing on this subject) shapes it with a bigger data set. Personalization of search results. Really? How is that fair?

Three ads, three sentences that came from heaven knows where, and a SERP of eight big brand names with more ads on the right and a fancy button, all over the screen. Am I supposed to shape my opinion on life based on that?

Knowledge is a very big word. Throwing a spin on it is endless.
1:01 pm on Mar 5, 2015 (gmt 0)

Senior Member

WebmasterWorld Senior Member aristotle is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 4, 2008
posts:3593
votes: 340


Maybe Google has realized that enormous amounts of money are being spent to flood the web with lies and misinformation.
3:04 pm on Mar 5, 2015 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month

joined:Feb 3, 2014
posts:1358
votes: 464


How can we trust Google's judgement of trustworthiness when we no longer trust Google?
3:25 pm on Mar 5, 2015 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2004
posts:1987
votes: 73


^^^ Too late for that; they shot themselves in the foot when they let the MFA sites in the door.

I searched for a few sentences, in quotes, that I wrote 12 years ago. They were my thoughts based on reading nine books, actual factual books written by university professors who have spent their lives researching the subject.

Result: 67 sites of scraped content, mixed and merged with quotes and fluff scraped from other sites, all in the same paragraphs, having nothing to do with the subject, just similar words. And these are the ones that are still in the index today. The list changes on a monthly basis.

This month Google added a .GURU domain to the list (ha-ha). When searching on the main subject there is a snippet lifted from a site that was on the web for about six years, an ecom site that went under four years ago, garbage promotional fluff.

The subject is a specific area of paleontology. The professor's site (hosted on .edu; the lady is alive and kicking) is nowhere to be found.

Knowledge my... ecom site.
4:14 pm on Mar 5, 2015 (gmt 0)

Administrator from GB 

WebmasterWorld Administrator engine is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month Best Post Of The Month

joined:May 9, 2000
posts:26109
votes: 943


Taking tin foil hats off, and just getting back to the topic...

TrustRank: the establishment and measurement of trusted sources to provide ranking signals.
How would Google go about deciding if a page, or site, is a trusted source? Would a .edu be a trusted source? Would Wikipedia be a trusted source? How would that be verified?
That seems to me to be the real challenge.
8:28 pm on Mar 5, 2015 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2004
posts:1987
votes: 73


I don't have an answer to those questions.

All I know is that pages on four EDUs (real authority sites) link to a section of my ecom site where additional info can be found, because I did additional research. Meanwhile Amazon, HSN, Overstock, Facebook, eBay, Pinterest and Etsy (which filed for an IPO yesterday, BTW; we all know what that means and where that is going) outrank me and a dozen other sites on the topic. I am talking about the topic of widgets, not the widgets themselves.

I can't put the two words together in this situation, Trust and Rank, because there is a $ in between them according to the Google SERPs.
2:55 pm on Mar 6, 2015 (gmt 0)

Administrator from GB 

WebmasterWorld Administrator engine is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month Best Post Of The Month

joined:May 9, 2000
posts:26109
votes: 943


I would have thought that the weighting of links vs authority is going to swing towards authority. How that would be done is the issue. Clearly, .edu has, for the most part, been respected, and is likely to be a good start.

Yes, there is the issue of money involved, but it's the ads, primarily, that generate the revenue for Google, so I can see that remaining separate.

blend27, are you suggesting sponsorship, or big business influencing the SERPs by paying money? Who are they paying? Are you suggesting those businesses are paying through their AdWords and would have influence over the SERPs? Or something else?

Trust, for a site, has to be won, and is easily destroyed.
3:11 pm on Mar 6, 2015 (gmt 0)

Preferred Member

5+ Year Member Top Contributors Of The Month

joined:May 24, 2012
posts:648
votes: 2


Taking tin foil hats off, and just getting back to the topic...

Are you suggesting those businesses are paying through their AdWords and would have influence over the SERPs?

If you really wanted to steer the topic in the direction you asked for in the first quote, why then immediately start baiting with provocative questions?

We can all guess why things are the way they are, but none of us really knows. My guess is that brand domination is there because it's the lazy way to reduce the impact of spam. Except they missed a few loopholes. For example, there's tons of spam in the SERPs from Facebook now. It's a trusted, high-authority domain, but anyone can post garbage there.
3:28 pm on Mar 6, 2015 (gmt 0)

Senior Member

WebmasterWorld Senior Member editorialguy is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:June 28, 2013
posts:3421
votes: 747


I would have thought that the weighting of links vs authority is going to take a swing towards authority.


That's been happening for some time, to judge from the results that I see for informational queries. A couple of years ago, it seemed that megasites like Wikipedia and TripAdvisor (which are "authorities" only for the broadest of topics) dominated Google's top rankings. Today, specialty sites are doing much better: For the queries that I watch, the megasites often rank behind smaller sites (including mom-and-pop sites) that have expert, in-depth content on specific topics.

Matt Cutts predicted this was coming back in April of 2014. Here's a link to a Search Engine Watch article that includes a Matt Cutts video and includes quotes for those who don't have the time or inclination to watch the video:

[searchenginewatch.com...]
6:00 pm on Mar 6, 2015 (gmt 0)

Senior Member

WebmasterWorld Senior Member aristotle is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 4, 2008
posts:3593
votes: 340


Don't forget that Google has been interested in authority and trust for a long time. It was one of the main points they talked about when they launched Knol. Later they also talked a lot about it when they created their author tag.