I read about a small company named Meaningful Machines that may provide, if not a key to the future of search, a keyhole through which we can glimpse it.
The direction they are taking is creating computers "that understand single and multi-word concepts – in any language and across all domains – and identifies relationships between concepts."
From the Meaningful Machines [meaningfulmachines.com] website:
"Meaningful Machines is currently laying the groundwork for the development of technologies that can be applied to the fields of artificial intelligence and natural language interfaces. The technology presented... was an automated system, called the “knowledge engine,” that understands single and multi-word concepts – in any language and across all domains – and identifies relationships between concepts."
From The Washington Post [washingtonpost.com]:
...claims to have unlocked the mystery of "context" in human language with a series of algorithms that enable computers to decipher the meaning of sentences..."This man literally has figured out the way the brain learns things..." It aims to power other products brought to you -- and corporate America -- by such companies as IBM or maybe even Google.
Anybody who's tried to improve their internal search will tell you this stuff is really hard. And you end up with a million rules based on languages, context, etc.
My theory is that someone will figure out how to teach a computer like you teach a child. And it will take 22 years of education to get it to respond to a search correctly. And another 10 to add subtlety and shades of meaning.
However, lots of piecemeal work by linguists certainly could map big chunks of phrases to words and so on. But that would be more like brute force, not true intelligence. And in fact, I believe that's what the current state of linguistic technology is like. If these companies actually have a "learning" algorithm for language, now that would be something.
"And you end up with a million rules based on languages, context, etc."
You're absolutely correct. But if you take a moment to follow the links you will read that this technology is not rules based.
That's why I posted this.
Whether these people exploit this, Google beats them to the finish line, or Google licenses or buys them à la Applied Semantics, this is a peek into the future.
They already have businesses exploiting part of the technology for translation; in fact, this grew out of language translation work.
Typically, fuzzy logic shortens the amount of code necessary to complete a task by a huge margin - and one could argue that fuzzy logic is NOT rules based, because "rules" implies yes / no & binary thinking or programming.
There is a chap who did a neural network model of human cognition back in the '70s & during the '90s started working with the US military on a number of projects.
Imagination-Engines.com - fascinating stuff. If they ever hop into the search engine game with their creativity engine, I am 100% sure they could come up with something very snazzy indeed - which would do as this chap in your article, martinibuster, implies - actually understand stuff.
To achieve understanding, you have to be able to recognize concepts - color, for example, is a concept - not a rule. When you show a kid "blue" or "red" you are teaching them about fuzzy logic, and not true or false, yes or no, because we say lots of colors are "blue" yet - they aren't always the same.
So you end up with a range of colors that all equal blue, or a group of people who are all "tall" yet - they might vary a few inches in height, comparing one to the next.
Language is similar - the bloke in the article could have been less secretive, and tossed out fuzzy logic, though some would then label him a crackpot. :) At least, it would give his thinking more weight, in my view.
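To make the "tall" example concrete, here is a rough sketch of a fuzzy membership function in Python. The function name and the 165 cm / 185 cm cutoffs are my own assumptions for illustration, not anything taken from the article or from a particular fuzzy logic library.

```python
def tall_membership(height_cm, low=165.0, high=185.0):
    """Degree (0.0 to 1.0) to which a height counts as 'tall'.

    The low/high cutoffs are illustrative assumptions only.
    """
    if height_cm <= low:
        return 0.0  # clearly not tall
    if height_cm >= high:
        return 1.0  # clearly tall
    # In between, membership rises linearly from 0 to 1.
    return (height_cm - low) / (high - low)


# People of different heights are all "tall" to different degrees.
for h in (160, 170, 175, 180, 190):
    print(h, round(tall_membership(h), 2))
```

Instead of a yes / no answer, each height gets a degree of "tallness", which is the point about concepts covering a range rather than a single value.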
Another is a statistical system that makes word-to-word comparisons in previously translated text and then consults the matches later to calculate probable meanings when it encounters each word again in untranslated text. Abir's approach involves a variation of the second method.
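For anyone wondering what "word-to-word comparisons in previously translated text" might look like, here is a toy sketch in Python. It is not Abir's method (the article doesn't give details); it just counts which words show up together in sentence pairs and scores them with a Dice coefficient. The tiny English/French corpus and the function name are made up for illustration.

```python
from collections import Counter
from itertools import product

# Hypothetical, tiny "previously translated" corpus (English, French).
parallel_corpus = [
    ("the house", "la maison"),
    ("the car", "la voiture"),
    ("a house", "une maison"),
    ("a car", "une voiture"),
]

pair_counts = Counter()  # how often a source and a target word co-occur
src_counts = Counter()   # how many sentence pairs contain the source word
tgt_counts = Counter()   # how many sentence pairs contain the target word

for src, tgt in parallel_corpus:
    src_words, tgt_words = set(src.split()), set(tgt.split())
    src_counts.update(src_words)
    tgt_counts.update(tgt_words)
    pair_counts.update(product(src_words, tgt_words))


def best_translation(word):
    """Return the target word with the highest Dice score, or None if unseen."""
    scores = {
        t: 2 * count / (src_counts[s] + tgt_counts[t])
        for (s, t), count in pair_counts.items()
        if s == word
    }
    return max(scores, key=scores.get) if scores else None


print(best_translation("house"))
print(best_translation("the"))
```

No grammar rules are written anywhere; the matches fall out of the counts. Real systems use far more data and better models, but the idea of consulting earlier matches to guess a probable meaning is the same.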
I don't think that this approach will pass the AI test in the near future but I'd like to see more research along these lines. Right now it looks like Eli Abir has applied for a few patents, posited a theory and hopes to be able to make his ideas work. As for grasping entire concepts, I don't see it happening any time soon, especially across cultural lines. Humans haven't proven themselves adept at that yet...
However, I have to say that I was mighty intrigued with the concepts these guys are throwing around, and can't help but feel that AI is one of the improvements in store for the future of search.
Peter Norvig [norvig.com], the Director of Search Quality at Google, is a "Fellow and Councilor of the American Association for Artificial Intelligence" and he co-authored a textbook on AI, entitled, Artificial Intelligence: A Modern Approach.
Looking to the future, I wonder if AI would be implemented in some form to complement or partially replace the current algo function for ranking sites. It wouldn't look for simple sets and repetitions in a web site but for actual relationships between words, and would accurately extract meaning from the context.
The AdSense engine currently accomplishes this pretty well. I was reading the Applied Semantics white papers today, which detailed in general their aims and methodology for identifying context, and afterward pondered some AdSense ads I was exposed to this morning.
The context was within an email service that shows banner ads to pay the bills. The AdSense ads that accompanied this? They were for Banner Ad Rotation Software!
Wow. The Applied Semantics software correctly identified the environment in which it was being displayed. Which is spooky.
OTOH, I think that AI technology may have a limited role in trying to figure out what surfers are trying to find from the front end where the fingers hit the keyboard.
I'd be interested in chatting with anybody into that field, as I've developed toolkits for research for over 8 years now, and I believe it has potential to become a very mature technology in the near future, if the right kind of people focus on it.
SN
Just today, I've read about the computer scientist Franz Josef Och from the University of Southern California, who developed machine translation software, which is not "rule based". As he says in the press release [eurekalert.org]:
"Our approach uses statistical models to find the most likely translation for a given input, (...) It is quite different from the older, symbolic approaches to machine translation used in most existing commercial systems, which try to encode the grammar and the lexicon of a foreign language in a computer program that analyzes the grammatical structure of the foreign text, and then produces English based on hard rules. (...) Instead of telling the computer how to translate, we let it figure it out by itself. First, we feed the system with a parallel corpus, that is, a collection of texts in the foreign language and their translations into English".
What's more, with this system, that man is able to automagically create translation software between any given two languages, just by feeding the engine with enough data (gigabytes of translated text).
I wonder if this idea can be implemented in the area discussed in this topic as well. By feeding the system with enough texts (or perhaps even with the associated picture or sound data), it could be possible to use similar statistical methods to measure the "meaningfulness" or "relevance" of a given text sample (for example, a crawled webpage).
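As a very rough illustration of scoring relevance statistically, here is a small Python sketch using plain TF-IDF, a long-established weighting scheme. To be clear, this is a technique I'm swapping in for the sake of example; it is not what Och's system or any search engine is known to do, and the page names and texts are invented.

```python
import math
from collections import Counter

# Hypothetical crawled pages (names and texts are made up).
docs = {
    "page_a": "blue widgets for sale cheap blue widgets",
    "page_b": "history of the color blue in painting",
    "page_c": "cheap flights and hotel deals",
}

tokenized = {name: text.split() for name, text in docs.items()}

# Document frequency: in how many pages does each word appear?
doc_freq = Counter()
for words in tokenized.values():
    doc_freq.update(set(words))


def tf_idf_score(query, name):
    """Sum of TF-IDF weights of the query terms found in one page."""
    words = tokenized[name]
    tf = Counter(words)
    score = 0.0
    for term in query.split():
        if term in tf:
            idf = math.log(len(docs) / doc_freq[term])  # rarer words weigh more
            score += (tf[term] / len(words)) * idf
    return score


def rank(query):
    """Page names ordered from most to least relevant for the query."""
    return sorted(docs, key=lambda name: tf_idf_score(query, name), reverse=True)


print(rank("blue widgets"))
```

The weights come entirely from counting words across the collection, so nobody has to write a rule saying what "relevant" means. Whether this kind of counting ever adds up to understanding is, of course, the open question in this thread.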
(at least, I think he's at the same school as the other person you're mentioning).
Funny thing is, the approach that you just cited is a LOT like the "creativity machine paradigm" from Dr. Thaler of the imagination-engines.com website. Their approach to AI is very similar: it involves a neural network agitation 'friction' based on the system input, which is similar to using two texts to create a "rule" of reference, and then extrapolation based on that singular rule.
The chaps at Google have done similar things in data mining. A lot of the approaches, despite differences in the underlying math, have similar outcomes. But the fuzzy approaches tend to have the fewest rules, due to the nature of the underlying math.
Neat stuff - seems that perhaps AI may impact search engines in the future in some fun ways.