Well that's cute, but they didn't find MY Bacon Number (which is 2)
Hopefully someone with a Bacon calculator website won't pipe up and complain that Google is stealing their traffic. (But it wouldn't surprise me)
But how many of these associations are actually "known" in the first place? And it's getting deep into personal data to discover many of the other connections...
So this is what Google has become...a bacon calculator? I love bacon, but this explains why the current serps suck. Good to see they've spent their time improving user experience by developing another useless feature.
Isn't this actually a demonstration of how Google is responding to queries using information gathered from web sites without linking to those sites?
|Good to see they've spent their time improving user experience by developing another useless feature. |
I would warn you against dismissing this feature. Yes, the baconator doesn't have any real world applications, but little things like these provide insight into the direction that Google is moving.
Obviously, they felt that this was really cool. And if a bunch of search geeks think something is cool, they're not about to stop here.
Bacon: coming to a niche near you in 2013.
|Yes, the baconator doesn't have any real world applications, but little things like these provide insight into the direction that Google is moving. |
Really? like the fact that they use "computers" to sort their results?
Seems like just another non-search related game the engineers came up with to fill time...maybe next they'll show you all the people in the world who play "werewolf".
What I'd really love to see is a demonstration of their ability to reliably improve user experience.
|Bacon: coming to a niche near you in 2013. |
Cue Homer Simpson: "Mmm... Bacon!"
|Obviously, they felt that this was really cool. |
I would say, rather, they thought this was *easy* - a place to start, play, experiment and demonstrate, a place to test an algorithm because it's relatively easy to find corroboration of results, a place to play because the universe of actors is small (relatively) and their connections few (relatively) so the computing horsepower needed is not that great.
Personally, I think the long-term effect of pursuing this - tweaked algo, more CPUs, vastly larger datasets - is that we start seeing not parlor games transferred to the net like a mere computer solitaire, but questions answered that were hitherto unanswerable or extremely difficult to answer.
I could see two possibilities.
#1 - this is like natural language processing - a problem that AI people thought they would have cracked in a couple of decades but the data set is so large, varied and unfiltered that no matter how much we throw at the problem, we remain a long ways from a computer that understands. That said, I am impressed by the progress we've made over the last thirty years and the efforts at the progress have had a significant impact on our lives.
#2 - this is like chess - a problem that chess fans thought a computer couldn't crack for a very long time, perhaps not ever, but which ultimately has simple rules and a "small" data set (massive compared to connections to Kevin Bacon, small compared to permutations in natural language). Computer solitaire with "hints" and "cheats" was easy to build, but ultimately it's just a variation of a chess routine that can beat a grand master. And while I suppose we've learned a lot about computing from this, I'm not sure how much I've seen it spill over into our lives.
It may or may not be Google that does this, but once information becomes truly linked and a computer can discover new information (i.e. separation from Bacon based on a database of movie titles and actors names, but in vastly more complex data sets), the ability to get answers to questions that could barely even be asked is huge.
For some types of questions, it changes everything and the Baconator is to this what the 200 kilopixel camera was to digital photography - a toy of little value to anyone who cares about photography... but a game changer too.
|Seems like just another non-search related game the engineers came up with to fill time... |
That's sort of how I see it, but I would attribute it to their having to spend the dividend money they aren't paying out to investors, funneling it into research and development instead -- to keep bodies looking busy and on staff in case they need them for something productive someday. Stimulus funding of sorts :)
don't google give their staff a day off every week to work on their own projects? i think i read that somewhere.
now we know what they do with it -- they waste it making baconometers
two years from now they will probably have one of these for us as well. every time we go for a job interview they'll run the baconometer and work out that we are 2 steps away from a mafia don or a famous serial killer. and that will be our job chances gone.
i've just noticed that there is actually a dedicated website about this bacon number thing -- the oracle of bacon. if you look at the credits page you'll see that it's been around since 1999 and a lot of work has gone into it -- it looks like an exercise in database crunching.
i found the year 1999 a bit hard to believe, but i've just had a look at the site on the waybackmachine internet archive place, and it was indeed up and running in the year 2000. it had a form which you could enter the name of any actor and get the bacon number up.
so when people talk about all these new google innovations and "questions being answered that were hitherto unanswerable or extremely difficult to answer" i think they should look at the original site. google haven't done anything that hasn't already been around for a decade. it seems that they just basically came along and nicked the idea, lock stock and barrel.
[edited by: londrum at 8:01 pm (utc) on Sep 14, 2012]
My bacon number is 3 slices with eggs over easy, please.
This is a watershed moment- It will change my life entirely!
Trying to become an "answer engine", a la Wolfram Alpha.
What better way to beat WA than being the best-damn-bacon-number-calculator out there!
|I would say, rather, they thought this was *easy* - a place to start, play, experiment and demonstrate... |
This little game might eventually apply directly to analysis of social networks. In looking at social connectivity on the web, Google is obviously looking for the most efficient and scalable ways to determine context, much the same as it's been doing in text search for many years, and more recently in image search. Here Google is starting loosely with the context of people who've acted in movies.
In image search, eg, it should be noted that, once context is limited, it's much easier to interpret data. See my comment on this thread (from July 2, 2012, roughly 6 weeks after the Knowledge Graph was introduced)...
Google Makes "Smarter Best Guesses" On Image Search
|...once subject/location/context is identified, image searching becomes significantly more efficient and nuanced. It becomes much more like site: search (pun intended here) than like, say, a global search of organic results. |
Note, btw, that for the "Bacon Number", a NY Magazine article [nymag.com...] reports, per an email from Google, that "Google's Bacon-number search doesn't utilize IMDb" as another Bacon Number website does, so its results are currently less complete.
I'm thinking that Google is here "excluding" IMDb as basis for a test (assuming that's actually possible), perhaps to learn how to reproduce IMDb's already existing set of data by other means.
where's the revelation here? This is what has been killing the serps for years. rather than search for what i'm typing they are looking for more and more deep relationships and watering the serps down with those. This is deeply ingrained in the google psyche and has killed the youtube search too. It seems ALL about relationships now, it may be clever but it's not search.
Search is about relationships. If search is not about relationships, what is it about?
|brotherhood of LAN|
Very powerful idea with lots of scope to it. Google purchased Freebase [freebase.com] a while back and I think with their (downloadable) dataset anyone can put this algorithm into action for themselves.
The basic gist of the data is a hierarchical tree of entities and attributes, a bit like Wordnet [wordnet.princeton.edu] where you could understand that Clint Eastwood & Kevin Bacon are both people, and they're both actors, but it's also going to tell you what films they've been in too.
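For anyone who wants to try the idea on a downloaded dataset, here is a minimal sketch of the usual textbook approach: treat actors as nodes, shared films as links, and run a breadth-first search out from the starting actor. This is only an illustration, not Google's or anyone else's actual method. The function name bacon_path and the CASTS table are made up for the example; the two films are the Grace Kelly ones quoted elsewhere in this thread.

```python
from collections import deque

# Toy cast lists (film -> actors). A real run would load these from a
# dataset dump; this tiny table is an assumption for illustration only.
CASTS = {
    "Mogambo": ["Grace Kelly", "Donald Sinden"],
    "Balto": ["Donald Sinden", "Kevin Bacon"],
}


def bacon_path(casts, start, target="Kevin Bacon"):
    """Return the shortest chain of (actor, film, co_star) steps, or None."""
    # Invert the table: actor -> list of films they appeared in.
    films_of = {}
    for film, actors in casts.items():
        for actor in actors:
            films_of.setdefault(actor, []).append(film)

    if start not in films_of or target not in films_of:
        return None

    # Breadth-first search: the first time we reach an actor is via a
    # shortest chain, so we remember how we got there.
    previous = {start: None}
    queue = deque([start])
    while queue:
        actor = queue.popleft()
        if actor == target:
            break
        for film in films_of[actor]:
            for co_star in casts[film]:
                if co_star not in previous:
                    previous[co_star] = (actor, film)
                    queue.append(co_star)

    if target not in previous:
        return None

    # Walk back from the target to rebuild the chain in order.
    path = []
    node = target
    while previous[node] is not None:
        prev_actor, film = previous[node]
        path.append((prev_actor, film, node))
        node = prev_actor
    path.reverse()
    return path


path = bacon_path(CASTS, "Grace Kelly")
print("Bacon number:", len(path))
for actor, film, co_star in path:
    print(f"{actor} and {co_star} appeared in {film}")
```

The length of the returned chain is the Bacon number. Nothing in the search is specific to actors, so the same few lines would work on any "who appears with whom" table.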
I could see this being applied to authoritative authors of an industry. If you have a lower Bacon number relative to Brett Tabke, Danny Sullivan, or Matt Cutts, then perhaps your articles should be placed in prominent search engine result for relevant keywords... relative to the lower bacon numbers.
Good one BOL. The idea is similar to or even an advanced version of Author Rank [seobythesea.com]
|I could see this being applied to authoritative authors of an industry. |
Indeed - good stuff. I would say that type of use is sort of the next baby step extension. Something that improves search as it exists now. As the data sets get bigger and the algorithm better, I can imagine all sorts of uses.
Ultimately, mapping relationships will go way beyond documents on the web and will be the underlying technology that will make Glasses actually useful - understanding that Hi Tech and Chipotle are both restaurants, they both serve burritos, and based on GPS or even image recognition of the street signs you just passed, one is closer to where you are now.
Even that is still a simple use case though. Long term, I imagine being able to find relationships that I could not find manually because the data sets are too big, the connections too many hops, the data not structured enough. At that point, things get really interesting.
|Long term, I imagine being able to find relationships that I could not find manually because the data sets are too big, the connections too many hops, the data not structured enough. |
This also sounds like the kind of thing IBM is doing by developing Watson.
|The combination of three capabilities make Watson unique: |
Natural language processing - to help understand the complexities of unstructured data which makes up as much as 90% of the data in the world today
Hypothesis generation and evaluation - by applying advanced analytics to weight and evaluate a panel of responses based on only relevant evidence
Evidence-based learning - to improve based on outcomes to get smarter with each iteration and interaction
Let's face it - the web is minimally structured data - and structured mark-up (such as schema.org offers) is not likely to find widespread adoption any time soon. Google needs to do more than just wait around for web authors who may never get on board.
|Search is about relationships. If search is not about relationships, what is it about? |
if i ask you, 'did you see president obama today?' i don't expect you to tell me about his dogs, kids, wife, the day his barber was first engaged and the nice people that once lived next door to him. I just want to know if you saw him, not everything that is related to him but is not the question i asked.
if you asked a sufficiently sophisticated search engine if so-and-so saw the president today it would perhaps research all those seemingly unrelated relationships before answering "that would be unlikely"...
why is everyone getting so excited about this, i dont understand it.
here is an example about why this is not exciting at all...
i tried "bacon number grace kelly"
and google gave me --
Grace Kelly's Bacon number is 2
Grace Kelly and Donald Sinden appeared in Mogambo
Donald Sinden and Kevin Bacon appeared in Balto
i then tried the exact same thing on the "oracle of bacon" (the website that has been up and running for more than a decade), and it gave me this --
Grace Kelly has a Bacon number of 2
Grace Kelly was in Mogambo
with Donald Sinden, [who] was in Balto with Kevin Bacon
what is new here? it is exactly the same. all that google has done is to find a way to repeat exactly the same information that was already available on a 10-year old website.
okay, so sometimes the film names are different. for example, i tried it with marilyn monroe, and whilst they both gave an answer of two, the film names were different. but that is simply because there are countless ways to link the two actors. (the oracle of bacon has worked out that there are 97450 different ways to link marilyn monroe and kevin bacon in two moves.) so google has just given one of the other alternatives.
what strikes me as weird... is that google came up with the exact same answer for grace kelly. and yet there are more than 71 thousand different ways to link those two people in two moves. how did google come up with the same one? that practically proves that they are using the same method to get there.
if i took a guess, i would say that google's cast-list database is bigger, that's all. the oracle of bacon seems to be getting their info by cross-referencing the actors in the IMDB database, whilst google is using that, plus a load more cast-lists that it got from crawling the web.
that would explain why they came up with the same answer for grace kelly.
google has got a bigger database of cast-lists. that is all that is happening here.
[edited by: londrum at 1:01 pm (utc) on Sep 16, 2012]
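The "number of different ways to link two people in two moves" mentioned above can be counted by brute force: take every film the first actor was in, every co-star in that film, and every film that co-star shares with the second actor. The sketch below shows that on toy data only. It is not how the Oracle of Bacon or Google computes its figures; the function name two_move_links is invented for the example, and "Actor X" and "Film Y" are placeholders rather than real credits.

```python
from collections import defaultdict

# Toy cast lists. "Actor X" and "Film Y" are made-up placeholders added so
# that there is more than one possible two-move link.
CASTS = {
    "Mogambo": ["Grace Kelly", "Donald Sinden", "Actor X"],
    "Balto": ["Donald Sinden", "Kevin Bacon"],
    "Film Y": ["Actor X", "Kevin Bacon"],
}


def two_move_links(casts, first, second):
    """List every (film, go-between, film) chain joining two actors."""
    # Invert the table: actor -> set of films they appeared in.
    films_of = defaultdict(set)
    for film, actors in casts.items():
        for actor in actors:
            films_of[actor].add(film)

    links = []
    for film_one in sorted(films_of[first]):
        for middle in sorted(casts[film_one]):
            if middle in (first, second):
                continue  # the go-between must be a third person
            # Any film the go-between shares with the second actor
            # completes a two-move chain.
            for film_two in sorted(films_of[middle] & films_of[second]):
                links.append((film_one, middle, film_two))
    return links


links = two_move_links(CASTS, "Grace Kelly", "Kevin Bacon")
print(len(links), "two-move links")
for film_one, middle, film_two in links:
    print(f"{film_one} -> {middle} -> {film_two}")
```

With many equally short chains available, which one a site prints depends on the order it happens to visit films and co-stars, which is an implementation detail of each site.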
if the answer is attained by using a well-structured, coherent and complete data set like imdb - not so exciting (13 years later).
when it starts looking like the world described in TB-L's "weaving the web" - that's exciting...
|Hopefully someone with a Bacon calculator website won't pipe up and complain that Google is stealing their traffic. |
I know it wouldn't bother Google, they can copy whatever they like without repercussion.
Just ask facebook.
Pretty pathetic that Google has only come this far when you realize they started 14 years ago.
If they want to really impress, how about trying to demonstrate user intent by divining how many pieces of bacon I'll have for breakfast this morning.
|if i ask you, 'did you see president obama today?' |
1. What phranque said.
2. Even to the question at hand - if I don't know that "today" is conceptually the same as "September 16" (i.e. if I don't understand that relationship), then the question is hard to answer. Remove the date stamp from your post and I lose that relationship and since last I checked in was yesterday, I wouldn't know whether you were asking whether I saw the president on Sept 15 or Sept 16. Because I'm really good at parsing relationships, I look at your post and realize that you're actually asking whether I saw the president on Sept 15.
In short, without understanding the relationship, I can't do the search. That's the simplest case of course, but it illustrates the importance of relationships just fine.
|same information that was already available on a 10-year old website |
1. what phranque said
2. Sure, but that's focusing on the result. I can search through an unordered list of terms right now. If a quantum computer came along, it could search through an unordered list too. So what's the big deal? There is no new data, so who cares?
But if I search now, the best I can do is linear time: O(N). With a quantum computer, I can use Grover's Algorithm [en.wikipedia.org] and perform that search in O(N^(1/2)).
The result is the same. The algorithm to arrive at it is not. When the first quantum computer powerful enough to run Grover's Algo comes online, it will no doubt look unimpressive compared to the best digital computer. But the method does make a difference. Both linear search and Grover algo search yield data we already have. And yet, in the long run, a system that can run the Grover algo will revolutionize search.
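To put rough numbers on the O(N) versus O(N^(1/2)) comparison above, here is a small back-of-the-envelope sketch. It assumes exactly one matching item in the list and uses the standard estimate of about (pi/4) * sqrt(N) Grover iterations for that case. It only counts steps and does not simulate a quantum computer; the function names are made up for the example.

```python
import math


def linear_lookups(n):
    """Worst-case number of items a plain scan of n unordered items checks."""
    return n


def grover_iterations(n):
    """Approximate Grover iterations for one match among n items.

    Uses the textbook estimate floor(pi/4 * sqrt(n)); assumes a single
    matching item.
    """
    return math.floor(math.pi / 4 * math.sqrt(n))


for n in (100, 10_000, 1_000_000):
    print(f"N={n}: linear scan up to {linear_lookups(n)} lookups, "
          f"Grover about {grover_iterations(n)} iterations")
```

Both approaches return the same item; only the amount of work differs, and the gap widens as the list grows.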
If all Google is doing is writing custom code that extracts the Bacon Number from IMDB, then yes, you're right. Yawn, ho hum.
If what they are doing is designing a smart system that can use a generalized algo to pull data relationships from data sets, as they claim, and the Bacon Number Calculator is just a fun test case, then this is huge.
Sure, at the test-case phase it's just yielding data we have already, but a test case has to be based on data we already have, otherwise you can't verify the results easily and make sure the system is working. But the method used to find that data matters a lot.