New survey reveals European language problems.

It's worse than you thought.

         

rencke

4:15 pm on Feb 21, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



When surfing the Internet, only 15.6% of Europeans will use their first foreign language and a mere 9.3% their second foreign language. All others will use their mother tongue, i.e. 84% or more. This is one of the startling findings in a survey presented by the European Commission this week.

More details and links in an article [webmasterworld.com] in the European forum.

Brett_Tabke

8:15 am on Mar 25, 2001 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



That is sad, Jan. I know I've always felt the real power of the internet was in breaking down cultural barriers we've artificially built over the years - an end run around the governments, if you will. It appears the language barrier is a very steep obstacle to surmount.

heini

2:40 pm on Mar 25, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, why should people not surf sites in their own language? For most surfers the Internet is entertainment. The majority of Europeans, those from the countries with large populations, get a wide spectrum of entertainment on the web in their own languages. There must be some 100 million plus people speaking German, Spanish is even more widespread, and French is spoken by a whole lot of people too. Most people like to make it easy on themselves.
I personally like the idea of a lingua franca, which at the moment is English, very much. But I mostly go surfing for information; I actively search the web. And I do read English a lot. But for most people surfing the web is a pastime, in competition with watching TV. And the bigger entertainment companies are trying to push the whole web in the direction of channels, of merging Internet and TV anyway.
Of course I see the problem for all the dot-coms and other English-language sites, but imagine (rencke, you know what I'm talking about) the restrictions for the non-English sites. Especially when you are doing text-oriented sites, not focused on selling, you are restricted pretty much to your own language. And you really envy the native English sites, which in comparison have so many more users.

Rumbas

3:00 pm on Mar 25, 2001 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



>And you really envy the native English sites, which in comparison have so many more users.

I couldn't agree more. Working with Danish websites in a country where we are only 5 million people, and far fewer online, one really has to come up with unique content. Many smaller companies can't afford to sell their products to anyone other than Danes. The trouble of collecting payment and shipping to other countries is too great. The market is simply too small here - for now.

astein

1:29 pm on Mar 26, 2001 (gmt 0)

10+ Year Member



A search tool that breaks down the language barrier is to be found at [lex4.com]. It offers true "cross language information retrieval" and might be of interest in the context of this discussion.

rencke

3:47 pm on Mar 26, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Herzlich willkommen zu WebmasterWorld, astein. I hope you will like our "University of Search Engine Optimization".

I took Lex4 for a brief spin and it seems to be a completely new concept in searching(?) I noticed that you are using Altavista technology, but couldn't quite understand what it does. It seems that Lex4 explains how a topic relates to other topics in the language of your choice, but does it take you to the information itself? Could you please explain this a bit more?

heini

9:34 pm on Mar 26, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hello astein,
Fascinating project. It seems to touch on some of the topics that have been discussed here lately. I'm talking about the concept of theming with its underlying logic of related words and stemming. I found the possibility of crossing the language barrier especially interesting in light of what we discussed here [webmasterworld.com]. Would love to learn more!

rencke

10:11 pm on Mar 26, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>It appears the language barrier is a very steep obstacle to surmount.

Brett: I have been pondering your interesting comment for a day and yes - in the absence of a 'lingua franca' there is indeed a very steep obstacle to surmount. Will there ever be a common language? I think not. When governments perceive a real threat to their local language - as the French and the Spanish, among others, already have - they will take steps to protect it. The reason is simple. If the English language were to become universal, citizens of other countries would lose contact with their nations' cultural history. And that would be the beginning of the end for the nation state.

Imagine, if you will, that English were forgotten and Mandarin adopted in its place. (A possible scenario in 100 years.) The Anglo-Saxon world would then no longer be able to read Shakespeare in the original language, to mention but one of the dire consequences. I can read neither the inscriptions on our runic stones nor the Edda tales, which were completely understandable to my ancestors less than 40 generations ago. There were no nation states then, so no one protected them. (The Icelanders can read the Edda, because their language has remained relatively free from foreign influence during the last millennium.)

The solution is likely to be sophisticated and intelligent developments of Babelfish. Or perhaps Lex4 is the first step towards a comprehensive solution.

heini

11:58 pm on Mar 26, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



A "lingua franca" is inevitably the language of the most powerful society at a given time.
The very term derives from the times when Latin was the language spoken by the Romans, who were the most powerful player then.
Setting up a language understood by all people has been a dream for many centuries. Esperanto was an experiment in creating a universal language free of any implications of power. It failed completely. One must not forget that communication is deeply intertwined with power.
In some states of the USA there are more people speaking Spanish as a first language than English. Still, I'm pretty sure the vast majority of websites in the US are in English.
Another example: take dmoz.org. There is the general directory. Then there is World, meaning everything apart from the USA.
Now don't get me wrong. As I said before, I like the idea of a lingua franca very much. But the distribution of languages is connected to the distribution of power. Not necessarily to the concept of nations as we know them now, but to societies and how they are structured.
And wouldn't we be missing something if all the different languages disappeared?
Still, machines are here to help us. The language of mathematics, and thus the language of computers, is universal. This metalanguage should soon be able to help us understand each other. Intelligent translation systems: that is what we humans need.

astein

8:36 am on Mar 27, 2001 (gmt 0)

10+ Year Member



Hej! (sorry, but my Swedish is rather poor ;-)

The discussion has advanced quite a bit since yesterday, so I'll try to provide some more information -- when submitting my last comment I was first tempted to "toot our own horn" a little louder, but then backed off so as not to appear to be making an unsolicited promotional posting... (Even though Lex4 is more of a showcase for our technology than a site we have a direct commercial interest in.)

We are currently thinking about launching a new corporate site for SERUBA [seruba.com] in order to integrate the information on "Published Subject Indicators" that we provided rather ad hoc at [psi.seruba.com]. Until then, however, you can get more detailed information from the "Principles of the SERUBA ontology", a PDF file you can download at the latter URL.

Lex4 is based on what we call the "Lexicosaurus" (a combination of lexicon and thesaurus), i.e. on a semantic network in which the topics and languages are linked. This database is maintained by our multilingual teams; its current focus is to capture and reflect the topics of everyday communication. Each topic is represented in each of the four languages, according to its actual meaning -- that's how the query can be transformed in a meaningful way from one language to the other. (And that's how more languages can be added easily.)

Lex4 does several things:

1) Many people have only a rather vague idea of what they are searching for when they embark on their search. In this case Lex4 does not just fire a general query at an index; by showing people the wider context that might be of interest to them, it allows them to narrow down their query intuitively (i.e. by clicking through the related topics until they cannot find a topic that better reflects their interest).

2) Then Lex4 generates the search string for its users, i.e. the user does not need to know the (complicated) syntax of SEs or Boolean operators; at this stage Lex4 also adds possible alternative expressions and synonyms for the topic in question. (We are currently working on a better expansion, as the expressions are sometimes not flexible enough when used as phrases. --> "its underlying logic of related words and stemming")

3) Finally, Lex4 encourages "cross-language information retrieval" -- the feature for which I first pointed it out in this discussion, and which is accomplished through its meaning-based approach. Given that many people know some (school) English but are not confident using it actively, and that most documents on the web are still written in English, Lex4 provides a competitive edge for those who are at least capable of reading and understanding foreign languages. (And our Lexicosaurus contains some pertinent synonyms that a non-native speaker would not come up with.)

Therefore, Lex4 is not a SE as such, but rather a search & information retrieval tool (or service or aid) that helps its users to structure and formulate a query and to "translate" it in a meaningful way into other languages.

To give an idea of the potential of the technology, we had to apply the Lexicosaurus to a web index, and for that we chose AltaVista. So, "does it take you to the information itself?" Not really: its actual service ends with sending the query to a web index. It helps you formulate more precisely what you want; the results are what you get when you apply the query to the AV index.
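
For readers who want to see the three steps above in concrete form, here is a minimal Python sketch. It is not Lex4's actual implementation: the `Topic` class, the `NETWORK` data, the function names and the generic `"a" OR "b"` query syntax are all assumptions made up for illustration, and the real Lexicosaurus and the AltaVista query format may look quite different.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """One language-independent node in a toy semantic network (hypothetical)."""
    topic_id: str
    # language code -> expressions, preferred term first, then synonyms
    labels: dict[str, list[str]]
    # ids of related topics the user could click through to
    related: list[str] = field(default_factory=list)


# Toy data invented for this example; not taken from the Lexicosaurus.
NETWORK = {
    "car": Topic(
        "car",
        {
            "en": ["car", "automobile"],
            "de": ["Auto", "Wagen"],
            "fr": ["voiture", "automobile"],
            "es": ["coche", "automóvil"],
        },
        related=["car_rental"],
    ),
    "car_rental": Topic(
        "car_rental",
        {
            "en": ["car rental", "car hire"],
            "de": ["Autovermietung", "Mietwagen"],
            "fr": ["location de voiture"],
            "es": ["alquiler de coches"],
        },
        related=["car"],
    ),
}


def find_topic(term, lang):
    """Map a user's term in one language to a topic, or None if unknown."""
    wanted = term.casefold()
    for topic in NETWORK.values():
        if any(wanted == expr.casefold() for expr in topic.labels.get(lang, [])):
            return topic
    return None


def related_topics(topic, lang):
    """Step 1: show the wider context as preferred terms in the user's language."""
    return [NETWORK[topic_id].labels[lang][0] for topic_id in topic.related]


def build_query(topic, lang):
    """Step 2: build a search string from the preferred term and its synonyms."""
    return " OR ".join(f'"{expr}"' for expr in topic.labels[lang])


def cross_language_queries(term, source_lang, target_langs):
    """Step 3: carry the query into other languages via the shared topic."""
    topic = find_topic(term, source_lang)
    if topic is None:
        return {}
    return {lang: build_query(topic, lang) for lang in target_langs}
```

In this sketch a German user typing "Auto" would get one query string per target language, each built from that language's own expressions for the same topic rather than from a word-for-word translation. The resulting strings would then be handed to whatever web index is used.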

That's enough for the moment, I suppose. However, I am quite sure our Managing Directors will be pleased to provide you with more details. In particular, they could go into more detail about the technology we develop and use.

heini

11:05 pm on Mar 27, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks for the explanation, astein. Still sounds pretty interesting.
From what I have gathered, Lex4 is aimed at the users of SEs. It is meant to help them formulate a more accurate query and, more importantly, to formulate an accurate query in a foreign language, namely English, Spanish, French or German.
Well, excuse me folks if I'm going over the top again, but what would really electrify me would be this technology applied directly to a search engine. Imagine: you do a search in your own language, and the SE delivers results from sites in all the major languages, exactly on topic! Combined with a really good translator, such a search engine would make the World Wide Web worthy of its name! A truly wonderful vision, especially for the non-English sites! Okay, I'll stop.
Another thing I've learned from your post, astein: obviously it does not work without humans. They are still needed to make sense of translation processes.

astein

8:10 am on Mar 28, 2001 (gmt 0)

10+ Year Member



"... you do a search in your own language, and the SE delivers results from sites in all the major languages, exactly on topic!"
Some misunderstanding here: that is - exactly on topic - what Lex4 does! Because, as I pointed out, it is applied directly to a SE (i.e. AltaVista). I only made the distinction between Lex4 and AltaVista to respond to rencke's comment ("I noticed that you are using Altavista technology...") and to make clear that we do not have our own index (and, thus, that we are open to any interested third party ;-)

"Combined with a really good translator..."
Well, here a headline from the Webmaster SE News [searchengineworld.com] springs to mind: Web Portal Translates English to Malay [news.excite.com]...

As to humans, yes indeed, it doesn't work without them. We've got several multilingual teams with native speakers of all four languages from (almost) all around the globe (not just Europe and the States but also Africa and Latin America), in order to achieve not only a cross-language but also a cross-culture approach to the topics we enter in our database. Thus we strive to make sure that our topics are not centred on one language or on one culture only (and, this way, our database can be expanded to include other languages).

rencke

8:34 am on Mar 28, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Very interesting indeed. So while search engines are struggling to provide greater and greater relevancy through ever more complex algorithms, you are in fact addressing the problem from the other end, by attempting to correct and focus the fuzzy logic of humans as they formulate the search query. And doing so across language boundaries. Add a huge index, such as Google's, and more advanced machine translation than is available today, and Brett's dream above will in fact come true. Thank you, astein, for telling us about this.

rencke

5:46 pm on Mar 30, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This thread [webmasterworld.com] about Google beta testing translation software is directly relevant to the discussion here. Just for your info.