Forum Moderators: Robert Charlton & goodroi


Google's Amit Singhal Introduces Knowledge Graph


engine

5:38 pm on May 16, 2012 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Google's Amit Singhal Introduces Knowledge Graph [googleblog.blogspot.co.uk]
Search is a lot about discovery—the basic human need to learn and broaden your horizons. But searching still requires a lot of hard work by you, the user. So today I’m really excited to launch the Knowledge Graph, which will help you discover new information quickly and easily.
The Knowledge Graph enables you to search for things, people or places that Google knows about—landmarks, celebrities, cities, sports teams, buildings, geographical features, movies, celestial objects, works of art and more—and instantly get information that’s relevant to your query. This is a critical first step towards building the next generation of search, which taps into the collective intelligence of the web and understands the world a bit more like people do.
We’ve begun to gradually roll out this view of the Knowledge Graph to U.S. English users.

incrediBILL

7:34 am on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



ps: I sympathise with those wikipedia authors who make selfless voluntary contributions,


<rant>
Yeah, I really sympathize with them. I've argued with people about first-hand information I have from working at some of the companies, or on some of the products, documented in Wikipedia - I tried to correct misinformation and got shot down when I was actually there!

It's particularly frustrating when someone tells me I'm wrong and I was the designer and/or development manager of the product. Fine, distort history, lie like a rug, I don't care as I can document the truth elsewhere and outrank their page of lies.
</rant>

That's why I don't have much faith in Google's new strategy either as perception tends to outrank truth and the masses just happily bleat and follow the herd.

Rockyou

7:41 am on May 17, 2012 (gmt 0)

10+ Year Member



@indyank, Rule no.1 Don't follow what Google says.
Rule no.2 Stick to rule no.1 always.

Show me one commercial website that doesn't game Google one way or the other. Everyone wants traffic. Or prove to me that Google doesn't game you to get those advertising dollars. So gaming is natural. Accept it. Let the better brains win the battle.

zeus

9:00 am on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Marshall - that was also my thinking: how will this hurt us? Some years ago, when Google came out with News, the reaction was "hmmm, maybe that's interesting". Now one just gets scared, and the day when more and more webmasters block Google totally comes closer.

The day will come, could even be this year, when webmasters come together - it would have to be over 1 million site owners - and set a date on which they block Google for 3-7 days in protest.

Harry

9:16 am on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



And of course, the fact that you specifically need to enable the semantic Web in order for Google to "scrape" your content in the first place continues to be ignored post after post by people who prefer to panic and worry about something that is not happening.

Again, no semantics on your site, no scraping. Bots can't understand the information on your site unless you add the metadata for them to understand what your content is about. Don't add the semantics, and Google will not borrow information from you.

I'm disappointed at how people here, who should know better, are acting just like Joe Public and perpetuating misinformation...

zeus

9:18 am on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Harry, I did not know that - I have not seen it mentioned here.

Harry

9:30 am on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



@zeus Google bots cannot understand what your information and keywords are unless you give them the semantics to understand what your content is about.

If you're writing about a lake you visited in northern Minnesota, you need to add, for example, tags that say your article is about a location in the state of Minnesota. Humans understand the data when they see it. Bots do not; to them it's just a bunch of words. If you don't tell them that lake XYZ is a #lake:freshwater in #county:XYZ #state:Minnesota, #country:USA, #latitude:XYZ #longitude:XYZ, discovered by #Person:XYZ in the #year:1837 and named after this other #Person:XYZ, Google bots will not understand.

This is not exactly how the markup is written, but I'm trying to make a point. You need to specifically put these tags in your content for Google to understand and then use the information. Don't put the tags in, and Google will not borrow from you.
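To make Harry's point concrete, here is a rough sketch of what such markup can look like using schema.org microdata. The lake name, coordinates, and property choices are invented for illustration, and the hash-tag notation above is not the real syntax; check the schema.org vocabulary for the types actually supported.

```html
<!-- Illustrative schema.org microdata: tells a machine that this page
     describes a freshwater lake at a specific location. -->
<div itemscope itemtype="http://schema.org/LakeBodyOfWater">
  <h1 itemprop="name">Lake XYZ</h1>
  <p>A freshwater lake in
    <span itemprop="address" itemscope itemtype="http://schema.org/PostalAddress">
      <span itemprop="addressRegion">Minnesota</span>,
      <span itemprop="addressCountry">USA</span></span>.
  </p>
  <span itemprop="geo" itemscope itemtype="http://schema.org/GeoCoordinates">
    <meta itemprop="latitude" content="47.25" />
    <meta itemprop="longitude" content="-94.20" />
  </span>
</div>
```

With or without tags like these, the visible words are the same to a human reader; the markup exists purely for machine consumption, which is the point being made.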

nickreynolds

9:41 am on May 17, 2012 (gmt 0)

10+ Year Member Top Contributors Of The Month



I looked at those screenshots. You know what, disregarding for a moment Google's motives and their modus operandi re scraped content, I think people will love it! OK I'm not sure where the ads will go and it moves away from being a search engine page, but I think lots of people will like it. They search "Taj Mahal" and that one page becomes the hub for all their subsequent browsing on the subject.
Plus, it's clever but you see how Google are introducing more and more graphics into search now - pictures of authors, screenshots of video, mini maps, stars for ratings etc. They know images will hold people on the page much longer and cause them to come back.
Personally I think it's going to work for them.

davedm

10:11 am on May 17, 2012 (gmt 0)

10+ Year Member



Regardless of whether this is good for users or not, this is bad news for publishers, content creators and webmasters... and ultimately that is bad news for anyone who uses the internet to learn, because the sources of expertise will dry up as Google puts them out of business.

Google continues to suck data off our sites. If you are marking your data up with schema.org, you are making it machine readable, and Google WILL take it and use it on their own properties ("Wherever we can get our hands on structured data, we add it" - Amit Singhal).

Google needs to learn that just because something is "so cool", technically possible, and even good for users, it cannot ignore the moral/legal issues of stealing other people's content.

It also annoys me that this is being labelled "knowledge". It is not; it is re-assembled data. Knowledge is something held by experts in a subject, not something you can reproduce with a machine.

anand84

10:17 am on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This is not exactly how the markup is written, but I'm trying to make a point. You need to specifically put these tags in your content for Google to understand and then use the information. Don't put the tags in, and Google will not borrow from you.


In the current landscape, all things being equal, Google promises to give more prominence to the original content provider with the most trust. But in the new Knowledge Graph scenario, you may be the most trusted site in your field, and yet if even one smaller site with fewer organic links than you (and possibly one that copies from you) has its metadata available, Google will still be able to show the Knowledge Graph. Ultimately, the point is: unless all webmasters magically unite to remove metadata (which is next to impossible), the impact of this on websites does not shrink, whether or not you enable the tags.

Angonasec

10:56 am on May 17, 2012 (gmt 0)



Yet another irritating Google "innovation" to be blocked from accessing our site.

We meta nocached them years ago, blocked imagebot, Preview, prefetch, and Google Producer etc... from their birth, and now this vile creation.
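For anyone wanting the same kind of opt-outs, the per-page directives involved look roughly like this - a sketch based on the robots meta directives Google documents, so double-check current support before relying on them:

```html
<!-- Per-page opt-outs from Google result features (illustrative):
     noarchive    - no "Cached" link for this page
     nosnippet    - no text snippet under the result
     noimageindex - do not index images found on this page -->
<meta name="googlebot" content="noarchive, nosnippet, noimageindex" />
```

Blocking specific crawlers such as Googlebot-Image entirely is done in robots.txt instead, with a `User-agent: Googlebot-Image` / `Disallow: /` pair.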

It will come as a relief when we can finally block all access to G because of their constant abuse. That point is not far away.

nomis5

11:43 am on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



As far as tags are concerned, just use your head a little. Having the tags gives you pics and stars in the serps - nice.

Accidentally leaving a few crucial tags out makes the whole article useless to a data harvester. But the joy is that the data harvester doesn't realise that and includes the data as valid.

If enough people do that type of thing the eventual database will be useless. And even if it's just you doing it, at least you have given nothing useful away.

rlange

2:02 pm on May 17, 2012 (gmt 0)

10+ Year Member



I have to say... the paranoia in this thread is damn near oppressive. If I can make an honest request, it would be: If the only thing you have to contribute is venom and the-end-is-nigh style ravings, spare us. I can maybe understand the frustration this unknown future causes, but please save the virtual tears and fist-pounding for your own pillow.

I don't know how many of you watched Star Trek, but of those who did, how many honestly gave one second's thought to the issue of sources and attribution when the computer came back with an answer to a question?

mcreedy wrote:
so google is telling us if you search for Taj Mahal it will know if you were looking for the local restaurant or the place in India.

I don't think so. The Knowledge Graph (KG) doesn't sound like a step forward in deciphering user intent. It just "knows" that there are multiple, distinct entities called [taj mahal] and "knows" the relationship of data associated with each of those entities.

With the Matt Groening example, a normal search for [matt groening's siblings] would ultimately lead you to the information, but that's only because those words or their synonyms appear on a particular page. With Google's KG, it "knows" that the Matt Groening entity has a sibling named Lisa.

I don't know that this, itself, significantly changes search from the user's perspective, but on the backend it's a huge thing. On the frontend, it could just provide interesting data-at-a-glance, but, then again, it could also provide much more focused search results.

incrediBILL wrote:
Just watch the Tonight shows recurring "Jay Walking" skit and you'll see what I mean.

I'm a huge cynic, but that skit is highly deceptive. It's the inverse of an inspection line at a factory; they discard all the good answers and package the defective ones. It's entertainment, not a survey.

jmccormac wrote:
Google Knowledge Graph = Yahoo 1990s Portal 2.0

I'm unclear as to what led you to this conclusion.

g1smd wrote:
Since Google never "forgets" anything, misinformation will accumulate quicker than truth.

Indeed, misinformation will become truth, "because Google says so".

And this is different from the current situation... how?

I'd say it's at least a bit less likely with the KG (for certain things, anyway). For instance, Google is pulling data from the CIA World Factbook, which, as I understand it, is generally accepted to be a very accurate collection of data.

Harry wrote:
And of course, the fact that you specifically need to enable the semantic Web in order for Google to "scrape" your content in the first place continues to be ignored post after post by people who prefer to panic and worry about something that is not happening.

For some, this thread seems to be their own, personal Two Minutes Hate [en.wikipedia.org].

--
Ryan

incrediBILL

2:31 pm on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month




FYI, Taj Mahal is also a brand of beer and the name of a musician.

I'm a huge cynic, but that skit is highly deceptive. It's the inverse of an inspection line at a factory; they discard all the good answers and package the defective ones. It's entertainment, not a survey.


@rlange I agree they edit it for entertainment purposes, but you obviously don't talk to enough average people about what you and I would consider general knowledge. For instance, I heard some really wacky stuff coming out of people's mouths at a picnic last weekend and they thought they were right. I'm hoping Google doesn't think they're right too or we're all going to take a wild ride buried in misinformation.

londrum

2:42 pm on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



the fact that you specifically need to enable the semantic Web in order for Google to "scrape" your content in the first place continues to be ignored post after post by people preferring to panic and worry about something that is not happening.

Again, no semantics on your site, no scraping.


That might be literally true, but it doesn't take into account that other people will likely provide the information instead.

E.g. if your site contains a biography of Matt Groening (to use the example from Google's blog), you'd want to prevent Google from including the information on the first page of the SERPs, because it will probably dent your traffic. But it's a popular subject; he's a famous guy. Other people are bound to provide the same information on their sites, so Google is going to get hold of the information anyway - regardless of what you choose to do on your own site.

The only way webmasters can stop Google getting hold of the information is if they ALL block it - every single one. Which is obviously never going to happen.

Rasputin

2:46 pm on May 17, 2012 (gmt 0)

10+ Year Member



@rlange paranoia - perhaps, time will tell.

But the fact remains that if any of us scraped basic data from loads of other sites, stitched it together, and called it a 100-million-page website - then asked on the Google forums why we didn't rank at the top of page 1 in Google for pretty much everything - we would be a laughing stock. We would be told we didn't meet any of the "quality guidelines" issued last year (do you offer an alternative point of view or something new, etc.), had no quality incoming links, had no right to expect to make income from other people's work, should expect DMCA filings against us, etc.

So when copyrighted work is used by someone else to do exactly that, just because they are in a position to take advantage, you can see why some people become paranoid... or at least a bit cynical.

Harry

3:03 pm on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



@Rasputin, you can't copyright facts like the depth of a lake or the date a movie comes out in theaters. If that's your business model, change it fast.

Wikipedia probably does far more scraping than the pure semantic Web suggested by Google will. In a wiki, opinions are copied and debates about topics are included. The Knowledge Graph is based on simple facts and data, not interpretation of information or opinions, unlike the human-scraped wikis that take from the best experts in a field and then play editor when, as @incrediBILL wrote, they interpret facts based on internal politics.

The Web was always intended to be semantic. All Google is doing is what Yahoo should have done years ago: push Web 3.0. Facebook and Amazon are doing it too. You can bet that Microsoft is working on it as well.

Just because, as a webmaster/publisher, you got used to one business model for 10-15 years during a transitory phase of the Web, it doesn't mean that the entire Web adventure and its logical next step should cease to exist to suit your particular needs.

londrum

3:14 pm on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



you can't copyright facts like the depth of a lake, the date a movie comes out in theaters

Actually, you can, because I fell foul of that once. You are not allowed to reproduce the football fixture list in the UK without paying a fee; the date of each game and who's playing is under copyright. (It sounds dumb, but it's true!)

smithaa02

3:40 pm on May 17, 2012 (gmt 0)

10+ Year Member



Think it adds screen clutter...where will they put it? Right now that competes with ads...don't think the higher-ups will like that.

indyank

3:51 pm on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Think it adds screen clutter...where will they put it? Right now that competes with ads...don't think the higher-ups will like that.


It is surely going to increase their ad clicks even if they place the ads below this so-called KG. It would take users' focus away from the main SERP results, and they would then automatically notice the ads below the KG.

indyank

3:59 pm on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The Web was always intended to be semantic


Add this to it as well: "The Web was always intended to be social." Anything that suits your convenience, isn't it? As long as you are allowed to be monopolistic, you can keep saying and doing anything you want.

mhansen

4:31 pm on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It will come as a relief when we can finally block all access to G because of their constant abuse. That point is not far away.


Just ask the book publishers how that works... When enough people block Googlebot to the point that it starts affecting their depth of knowledge on the web, they will simply ignore the directive.

Google will index our content whether we like it or not, based on the value to their users and the overall importance of knowledge being shared.

Again, just ask book publishers and copyright holders. nom nom... "All Your Content Are Belong to Us" nom nom...

[en.wikipedia.org...]

netmeg

4:46 pm on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Well I haven't even SEEN it yet myself, so I will reserve judgement.

randle

5:54 pm on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



How do we know which facts are most likely to be needed for each item? For that, we go back to our users and study in aggregate what they’ve been asking Google about each item.


Will it improve the user experience? For Joe surfer it probably will.

But at the end of the day, at its core, it's Google's (and Facebook's, and all the rest's) continued effort to increase their ability to tailor the surrounding ads you're exposed to, via compiled data, which increases click-through rates and ultimately revenue.

graeme_p

6:21 pm on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



you are not allowed to reproduce the football fixture list in the UK without paying a fee. the date of each game and who's playing is under copyright. (it sounds dumb, but its true!)


It is stupid - but it might be the sort of thing that trips Google up over this.

DuckDuckGo (and I think Bing) has experimented with Wikipedia snippets and similar on the SERPS, as has Google before. Only DuckDuckGo seems to have made it a major feature.

I am not too worried, because Google has now buried my site so far down the page that I only get visitors searching for more in-depth or better-written stuff than the pages further up the SERPs anyway.

gmb21

6:45 pm on May 17, 2012 (gmt 0)

10+ Year Member



Oh dear! I just searched for the name of the country I live in, and the image of the country's flag (in the right-hand "knowledge" column) links to a site of ill repute!

backdraft7

7:19 pm on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The only opportunity I see here is to get into the macaroni & cheese sales business, because a LOT of webmasters are gonna be eating it on a daily basis from now on.

Sgt_Kickaxe

7:42 pm on May 17, 2012 (gmt 0)



Interesting - when looking up Taj Mahal from Canada right now, the template is the same as in the U.S., but the Knowledge Graph mashup is not there; there is nothing in the sidebar. I find it interesting because no ads are being displayed in its place either - not in the sidebar, nor even above result #1.

Is the world about to learn that you cannot monetize some things? Or is Google simply dropping the ball on their ad display code? Or ... ?

dstiles

7:43 pm on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



mhansen
> Google will index our content whether we like or not,

It is very easy to block Google from reading your pages completely. They may come in on a non-Google broadband IP with a "real" browser, but they are not going to scrape much that way - not from millions of sites and billions of pages. And as soon as they are detected - not difficult - their IP can also be blocked, along with the whole /8 if necessary.

If webmasters do not know how to do that, then it's not surprising there are so many complaints about blackhat site scraping.
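The IP-range blocking described above can be sketched in a few lines of Python using the standard `ipaddress` module. The CIDR ranges below are placeholders for illustration, not Google's actual netblocks; in practice the check would sit in your server or firewall layer rather than application code.

```python
import ipaddress

# Hypothetical blocklist: ranges you have decided to deny.
# The /8 mirrors the "block the whole /8 if necessary" idea above.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("66.0.0.0/8"),      # placeholder /8 range
    ipaddress.ip_network("203.0.113.0/24"),  # RFC 5737 documentation range
]

def is_blocked(ip: str) -> bool:
    """Return True if the visitor IP falls inside any blocked range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED_NETWORKS)

print(is_blocked("66.102.0.1"))    # True: inside the /8
print(is_blocked("198.51.100.7"))  # False: not in any listed range
```

The same containment test is what tools like iptables or an Apache `Deny from 66.0.0.0/8` rule perform at a lower level.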

gehrlekrona

10:46 pm on May 17, 2012 (gmt 0)

10+ Year Member



@backdraft7, funny but true. I don't know if this "upgrade" has anything to do with all the mess in the other threads, but I am guessing that everything is connected over at the Plex.
They have been working on this and calling it Panda and Penguin just to misinform people, so we will keep trying to understand what is going on - which, of course, we never will.
The reason I went to Google in the first place was their "clean" home page. There was a place where you searched for things, and then you got the result. All the other search engines had news and God knows what on the home page, and it was all cluttered. Google had nothing, and it loaded fast. Now they are putting more crap on their site - maybe not the first page, but on other pages. Some people might like it, most likely the ones who like GOOG's suggestions because they are lazy(?) enough not to think for themselves and would rather be served suggestions. So is "Demagoog" the nice one?

jmccormac

4:38 am on May 18, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



My guess is that Google is getting very scared of Facebook and is trying hard to take the wind out of its sails with this "knowledge graph" guff. After all, the "social graph" concept is closely tied to Facebook, and with Facebook's launch, Google is yesterday's news as far as the news cycle for the next month is concerned.

Regards...jmcc
This 116 message thread spans 4 pages: 116