More details and links in an article [webmasterworld.com] in the European forum.
I couldn't agree more. Working with Danish websites in a country where we are only 5 million people, and far fewer online, one really has to bring up unique content. Many smaller companies can't afford to sell their products to other users than Danes. The trouble of collecting payment and shipping to other countries is too big. The market is simply too small here - yet.
I took Lex4 for a brief spin and it seems to be a completely new concept in searching(?) I noticed that you are using Altavista technology, but couldn't quite understand what it does. It seems that Lex4 explains how a topic relates to other topics in the language of your choice, but does it take you to the information itself? Could you please explain this a bit more?
Brett: I have been pondering your interesting comment for a day and yes - in the absence of a 'lingua franca' there is indeed a very steep obstacle to surmount. Will there ever be a common language? I think not. When governments perceive a real threat to their local language - as the French and the Spanish, among others, already have - they will take steps to protect it. The reason is simple. If the English language were to become universal, citizens of other countries would lose contact with their nations' cultural history. And that would be the beginning of the end for the national state.
Imagine if you will that English would be forgotten and Mandarin adopted in its place. (A possible scenario in 100 years.) The Anglo-Saxon world would then no longer be able to read Shakespeare in the original language. To mention but one of the dire consequences. I can read neither the inscriptions on our runic stones nor the Edda tales, which were completely understandable to my ancestors less than 40 generations ago. There were no national states then, so no one protected them. (The Icelanders can read the Edda, because their language has remained relatively free from foreign influence during the last millennium.)
The solution is likely to be sophisticated and intelligent developments of Babelfish. Or perhaps Lex4 is the first step towards a comprehensive solution.
The discussion has advanced quite a bit since yesterday and I'll try to provide some more information -- when submitting my last comment I was first tempted to "toot our own horn" a little bit louder but then backed out in order not to appear to make an unrequested promotional posting... (Even though Lex4 is more of a showcase for our technology than a site we've got a direct commercial interest in.)
We are currently thinking about launching a new corporate site for SERUBA [seruba.com] in order to integrate the information we provided rather ad-hoc on "Published Subject Indicators" at [psi.seruba.com]. Until then, however, you can get some more detailed information from the "Principles of the SERUBA ontology", a PDF file you can download at the latter URL.
Lex4 is based on what we call the "Lexicosaurus" (a combination of lexicon and thesaurus), i.e. on a semantic network in which the topics and languages are linked. This database is edited by our multilingual teams; its current focus is to capture and reflect the topics of everyday communication. Each topic is represented in each of the four languages, according to its actual meaning -- that's how the query can be transformed in a meaningful way from one language to the other. (And that's how more languages can be added easily.)
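To make the idea of a topic-centred network a bit more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not SERUBA's actual schema: the class names (Topic, Lexicosaurus), the method names, the language codes and the sample entries are all hypothetical. The point it tries to show is that expressions hang off a language-neutral topic, so "translating" a query term means looking up the topic and reading off its expressions in the other language.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Topic:
    """A language-neutral topic (hypothetical structure, for illustration)."""
    topic_id: str
    # language code -> expressions for this topic, preferred expression first
    labels: Dict[str, List[str]]
    # ids of related topics (the "thesaurus" side of the network)
    related: Set[str] = field(default_factory=set)


class Lexicosaurus:
    """Toy semantic network: topics plus an index from expressions to topics."""

    def __init__(self) -> None:
        self.topics: Dict[str, Topic] = {}
        self.index: Dict[Tuple[str, str], str] = {}

    def add(self, topic: Topic) -> None:
        self.topics[topic.topic_id] = topic
        for lang, expressions in topic.labels.items():
            for expression in expressions:
                # case-insensitive lookup key: (language, lowercased expression)
                self.index[(lang, expression.lower())] = topic.topic_id

    def lookup(self, lang: str, expression: str) -> Optional[Topic]:
        topic_id = self.index.get((lang, expression.lower()))
        return self.topics[topic_id] if topic_id is not None else None

    def translate(self, expression: str, source: str, target: str) -> List[str]:
        """Map an expression to its topic, then return the target-language labels."""
        topic = self.lookup(source, expression)
        if topic is None:
            return []
        return list(topic.labels.get(target, []))


# Invented sample data, only two languages shown.
lex = Lexicosaurus()
lex.add(Topic("car", {"en": ["car", "automobile"], "de": ["Auto", "Wagen"]},
              related={"vehicle"}))
lex.add(Topic("vehicle", {"en": ["vehicle"], "de": ["Fahrzeug"]},
              related={"car"}))

german = lex.translate("Automobile", "en", "de")
```

In this sketch, adding a further language only means adding one more key to each topic's labels, which is one plausible reading of why more languages can be added easily.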
Lex4 does several things:
1) Many people have a rather vague idea of what they are searching for when they embark on their search. In this case Lex4 does not just release a general query on an index but, by showing people the wider context that might be of interest to them, allows them to narrow down their query in an intuitive manner (i.e. by clicking through the related topics until they cannot find a topic that better reflects their interest).
2) Then Lex4 generates the search string for its users, i.e. the user does not need to know the (complicated) syntax of SEs or Boolean operators; at this stage Lex4 also adds possible alternative expressions and synonyms for the topic in question. (And currently we are working on a better expansion, as sometimes the expressions are not flexible enough when used as phrases. --> "it's underlying logic of related words and stemming")
3) Finally, Lex4 encourages "cross-language information retrieval" -- the feature for which I first pointed it out in this discussion, and which is accomplished through its meaning-based approach. Given that many people know some (school) English but are not confident using it actively and that, at the same time, most documents on the web are still written in English, Lex4 provides a competitive edge for those who are at least capable of reading and understanding foreign languages. (And a non-native speaker would not come up with some of the pertinent synonyms that our Lexicosaurus contains.)
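The three steps above can be sketched roughly as follows. Again this is a hedged illustration in Python and not how Lex4 is actually implemented: the TOPICS table, the function names (related_topics, build_query) and the sample expressions are invented, and the generic OR syntax with quoted phrases is an assumption, not necessarily the exact string Lex4 sends to the index.

```python
from typing import Dict, List

# Invented sample network: topic id -> labels per language and related topics.
TOPICS: Dict[str, dict] = {
    "vehicle": {
        "labels": {"en": ["vehicle"], "de": ["Fahrzeug"]},
        "related": ["car", "bicycle"],
    },
    "car": {
        "labels": {"en": ["car", "automobile", "motor car"],
                   "de": ["Auto", "Wagen"]},
        "related": ["vehicle"],
    },
    "bicycle": {
        "labels": {"en": ["bicycle", "bike"], "de": ["Fahrrad"]},
        "related": ["vehicle"],
    },
}


def related_topics(topic_id: str) -> List[str]:
    """Step 1: show the wider context so the user can click to a narrower topic."""
    return list(TOPICS[topic_id]["related"])


def build_query(topic_id: str, lang: str) -> str:
    """Steps 2 and 3: generate the search string from all synonyms in `lang`.

    Multi-word expressions are quoted as phrases; alternatives are joined
    with OR so the user never has to type Boolean operators.
    """
    expressions = TOPICS[topic_id]["labels"].get(lang, [])
    terms = [f'"{e}"' if " " in e else e for e in expressions]
    return " OR ".join(terms)


# A user starts vaguely at "vehicle", narrows down to "car",
# then asks for the query in English and in German.
choices = related_topics("vehicle")
query_en = build_query("car", "en")
query_de = build_query("car", "de")
```

Because the query is built from the topic and not from the words the user typed, the same click path yields a query in any language the topic has labels for.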
Therefore, Lex4 is not a SE as such, but rather a search & information retrieval tool (or service or aid) that helps its users to structure and formulate a query and to "translate" it in a meaningful way into other languages.
In order to give an idea of the potential of its technology, we had to apply the Lexicosaurus to a web index, and for that we chose AltaVista. Therefore, "does it take you to the information itself?" Not really, its actual service ends with sending the query to a web index: it helps you to formulate more precisely what you want; the results are what you get when you apply the query to the AV index.
That's enough for the moment, I suppose. However, I am quite sure our Managing Directors will be pleased to provide you with more details. In particular, they could go into more detail about the technology we develop/use.
"Combined with a really good translator..."
Well, here a headline of the Webmaster SE News [searchengineworld.com] springs to my mind: Web Portal Translates English to Malay [news.excite.com]...
As to humans, yes indeed, it doesn't work without them. We've got several multilingual teams with native speakers of all four languages from (almost) all around the globe (not just Europe and the States but also Africa and Latin America), in order to realise not only a cross-language but also a cross-culture approach to the topics we enter in our database. Thus we strive to make sure that our topics are not centred on one language or on one culture only (and, this way, our database is expandable to include other languages).