Forum Moderators: buckworks & webwork


æøå domains

When will it be safe to start using non-English characters?

         

requiem

9:43 pm on Jun 3, 2003 (gmt 0)

10+ Year Member



A client of mine has some .com and .net domains planned for development that use non-English characters. The question is when they will work globally, and when the large search engines will start listing these domains.

Is there anyone here who could make an educated guess?

Thank you.

max_rk

10:01 pm on Jun 3, 2003 (gmt 0)

10+ Year Member



Talk about these domains began about two years ago. I even registered one, but I was unable to host it or do anything with it. All software and systems are written in a way that does not accept "other characters". Writing software such as a web server or search engine to handle them takes real effort and time. Americans are not interested in this (maybe from a potential business point of view), and Europeans are doing OK without it. It is hard to see who will drive the development of these domains. In my opinion, your client would be better off using standard characters for at least another 5-10 years.

Max.rk

takagi

3:45 am on Jun 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Same problem here in Asia. A few years ago they started registering domain names with special characters (see the Domain names in Chinese [webmasterworld.com] and Japanese-language domain names [webmasterworld.com] threads), but it was never a success.

StanBo

3:14 pm on Jun 5, 2003 (gmt 0)

10+ Year Member



Look at it this way - would you, say, install the Arabic keyboard layout if you couldn't read Arabic? Would you ever bother to learn combinations of symbols that are meaningless to you, just to type them in later on?
I wouldn't.
Granted, local (e.g. Korean) search engines are quite likely to start listing domains in their native alphabet, but why would Google want to do that for God knows how many charsets, which are absolutely alien to the overwhelming majority of Google users?

requiem

7:00 pm on Jun 5, 2003 (gmt 0)

10+ Year Member



Google is an international search engine, and it is possible to search for pages written in a specific language. That is the reason why at some point Google might start listing Arabic, Korean, and other non-English character domains.
After all, the majority of web users do not have English as their primary language.

So the question is when there will be a global system for domains with non-English characters, not whether there will ever be one. At some point all international search engines will probably list domains in any given character set, but when? Six months? 2 years? 50 years?

StanBo

7:47 am on Jun 6, 2003 (gmt 0)

10+ Year Member



If we're talking such numbers, my estimate is 50+. IMO the Internet as we know it right now will not live that long. Neither will Google (once again, as we know it now).
And the reason is very simple - we're not talking languages here, we're talking alphabets. The vast majority of even those who do not have English as a primary language use almost the same charsets. In the case of Cyrillic, Arabic, Korean, Chinese, Japanese and many other alphabets, the number of actual internet users is currently limited, and they make up only a small percentage of the total.
What increase in profit can they bring? And what expenses will such an innovation require?

takagi

9:45 am on Jun 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I agree that for an international company, it wouldn't be smart to have a domain name that is hard to enter on a US keyboard.

But even major West-European languages like German, French, and Spanish have several special characters in common words.

Suppose you are the webmaster of a 50-year-old local amateur soccer team in Germany. You cannot register the team name because it contains an umlauted vowel (ä, ö, or ü) or an eszett (ß). Or you cannot register the name of a village school because it contains an acute accent ("´"), a circumflex ("^") or a tilde ("~"), which are quite common in French and Spanish. The pages will be in the local language anyway. So why should you need to replace the eszett with "ss" for the domain name? That doesn't make sense in 2003.

The share of non-English users of Google grew from 33% to 50% in the last 2 years (see the graph Languages Used to Access Google March 2001 - April 2003 at Google Zeitgeist [google.com]). There are now more mobile phones in China than in the USA. In March, South Korea and the USA had the same number of xDSL subscribers (both 6.5 million), and Japan had 7.5 million at the end of April. So the influence of English on the internet will only get smaller.

BTW, Google has already indexed lots of pages with special characters in the URL. So Google won't have problems with a special character in the domain name once it starts to become normal. Google can find pages that match the Japanese, Thai, Arabic or Russian words entered in the search string. So internally Google most likely indexes these words in 16-bit Unicode instead of 7-bit ASCII. And so will the other major search engines.
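
To make the 7-bit ASCII versus 16-bit Unicode point concrete, here is a minimal Python sketch. It says nothing about how Google actually stores its index; the helper name fits_in_ascii and the sample words are assumptions chosen purely for illustration.

```python
# Sketch only: shows which words survive a 7-bit ASCII encoding
# and how many bytes they need as 16-bit Unicode (UTF-16).
# The function name and sample words are hypothetical.

def fits_in_ascii(text: str) -> bool:
    """Return True if every character of text is plain 7-bit ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False

samples = ["beer", "øl", "東京"]
for word in samples:
    utf16 = word.encode("utf-16-le")  # 2 bytes per character for these words
    print(word, fits_in_ascii(word), len(utf16))
```

Only the first word fits in ASCII; the other two need a Unicode encoding to be stored without loss.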

StanBo

2:04 pm on Jun 6, 2003 (gmt 0)

10+ Year Member



Maybe...
But look at it from a different perspective. You've made a simple search (because that's what most non-webmaster surfers do) and got a top 10 in which you cannot fathom a single letter. Would you ever visit such an SE again? For localized searches it makes sense to index and display non-English URLs, but speaking web-wide...
I'm still pretty doubtful.

requiem

2:22 pm on Jun 6, 2003 (gmt 0)

10+ Year Member



StanBo wrote:
"Maybe...
But look at it from a different perspective. You've made a simple search (because that's what most non-webmaster surfers do) and got a top 10 in which you cannot fathom a single letter. Would you ever visit such an SE again? For localized searches it makes sense to index and display non-English URLs, but speaking web-wide...
I'm still pretty doubtful."

There are several possible workarounds for this problem.
In many of the major search engines you can already choose to list only pages in a particular language. It is also possible for search engines to serve different results based on IP address or browser sniffing. However, when someone enters a query like "bølgen blå", it is very likely that the person is Norwegian or Danish. It would be the same for Chinese, Hebrew or Arabic queries. I have no doubt that international search engines will someday be truly international, but the question is when. When will it be safe to use non-English domains?
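
The guess about a query like "bølgen blå" can be sketched in a few lines of Python. This is only an illustration of the idea, not how any search engine actually works: the function names and the æ/ø/å heuristic are assumptions made up for this example, and the script label is simply the first word of each character's Unicode name.

```python
import unicodedata

# Assumption for illustration: these letters hint at Danish or Norwegian.
DANO_NORWEGIAN_LETTERS = set("æøåÆØÅ")

def scripts_in(query: str) -> set:
    """Return script labels (first word of the Unicode name) for each letter."""
    labels = set()
    for ch in query:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                labels.add(name.split(" ")[0])
    return labels

def looks_dano_norwegian(query: str) -> bool:
    """Hypothetical heuristic: any æ, ø or å suggests Danish or Norwegian."""
    return any(ch in DANO_NORWEGIAN_LETTERS for ch in query)

for q in ["bølgen blå", "blue wave", "שלום"]:
    print(q, scripts_in(q), looks_dano_norwegian(q))
```

A Hebrew or Arabic query stands out by script alone, while a Danish or Norwegian one is still Latin script and needs a finer hint such as the letters themselves.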

takagi

2:32 pm on Jun 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> got a top 10 from which you cannot fathom a single letter
Today I read another thread about missing snippets for right-to-left languages (like Hebrew, Arabic and Urdu), and I checked it on the Israeli site of Google. Until today I had never seen a snippet in Hebrew, but there were lots of them (although some were missing). Why hadn't I seen them before? Because I never search for words in Hebrew! If you look for an English word, you will usually only see English pages (well, sometimes an English word is used as a loanword in other languages). There is no need for SEs to do tricks like geo-targeting to prevent you from seeing pages in different 'alphabets'.

Hagstrom

9:29 am on Jul 4, 2003 (gmt 0)

10+ Year Member



This is old news, but I haven't seen it posted yet on WW:

For the past few weeks, if you enter one of the æøå domains, like Carlsberg's www.øl.com, in Internet Explorer, you get a screen saying "welcome, are you trying to visit www.øl.com?". You can then choose to download Verisign's plug-in or visit the page without downloading anything.

The remarkable thing is that this even works with my old IE4, which hasn't been upgraded for years.

Lisa

11:12 pm on Jul 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Verisign has jacked the root server to do this. They look for non-ASCII characters.

claus

11:44 pm on Jul 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



StanBo
... say there are 280 million Americans. Not 100% are on the net.

Now, there's a billion Chinese. If just 30% of the Chinese population had access, there would be more Chinese on the net than there are Americans in total. This will not take 50+ years.

/claus

<edit>typo</edit>

Hagstrom

10:12 am on Jul 5, 2003 (gmt 0)

10+ Year Member



> Verisign has jacked the root server to do this. They look for non-ASCII characters.

Could you please elaborate on that: Why does it only work with IE and not with Netscape? Why does it only work for WWW.domain.com and not for domain.com without WWW?

Lisa

4:46 pm on Jul 5, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hagstrom,
The root name servers of the Internet examine the domain; if they see you request exæmple.com, they send you to the .com name servers. The .com servers then see that you are requesting a domain with one or more special characters. They will attempt to locate your domain but will fail, because all special characters are encoded back to ASCII in the zone (no special characters exist in the zone). When the plug-in requests a name, it looks something like this: "gh--eeeexxxxaeaemmmmpppplllleeee.com". (Please see the RFC for specs on the encoding.) (BTW, no one can register domains with "--" in the third and fourth positions, so don't try to encode on your own.) DNS should always fail if the name doesn't exist. But Verisign has done one more thing: they check whether your request looks like it comes from a web browser. If they see "www." + non-ASCII + ".com", like a request for www.exæmple.com, you are most likely a web browser, and they feel it is safe to resolve your nonexistent name and give you a special IP address with a web page that tells you about the plug-in.

For Netscape I can only guess that the browser follows the DNS specs and knows DNS names cannot contain special characters, so it doesn't even try. Meanwhile IE just blindly tries.
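
For a concrete look at the ASCII encoding described above, here is a minimal Python sketch. It uses Python's built-in "punycode" and "idna" codecs, which implement the IDNA standard as published in 2003 (RFC 3490, with Punycode defined in RFC 3492); in that standard the encoded labels start with "xn--", so the "gh--" string above should be read as a mock-up. The helper names to_ascii and to_unicode are assumptions for this example, and exæmple.com is the made-up name from the post.

```python
# Sketch only: converts between the Unicode form of a domain name and
# the ASCII form that is actually stored in the zone.
# to_ascii / to_unicode are hypothetical helper names.

def to_ascii(domain: str) -> str:
    """Encode each label of a domain name to its ASCII (IDNA) form."""
    return domain.encode("idna").decode("ascii")

def to_unicode(ascii_domain: str) -> str:
    """Decode an ASCII (IDNA) domain name back to its Unicode form."""
    return ascii_domain.encode("ascii").decode("idna")

# Raw Punycode for a single label, without the "xn--" prefix.
print("exæmple".encode("punycode").decode("ascii"))  # exmple-qua

# Full domain name as it would appear in the zone.
encoded = to_ascii("exæmple.com")
print(encoded)  # xn--exmple-qua.com

# Round trip back to the form the user typed.
print(to_unicode(encoded))  # exæmple.com
```

The ASCII letters stay readable at the front of the label and the position of the special character is packed into the suffix, which is why hand-made encodings are not a good idea.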

Hope that answers your question.

takagi

4:46 am on Jul 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It said the number of Internet users in the world totaled 670.8 million at the end of 2002. The Asia-Pacific region accounted for 32.1% of the total, surpassing Europe as the world's largest user by region for the first time. Europe had a 31.9% share.
(Yahoo Asia [asia.news.yahoo.com])

claus

11:30 am on Jul 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



thanks for the link takagi, those are interesting figures :)

>> It said ...

The article refers to the research method as being an online questionnaire. The firm doing the research was Impress. Is it possible that you could provide some extra info, to better judge the validity of the figures:

Is that the media company Impress Group? How were the respondents recruited? Was it a random pop-up like the ones you see everywhere? How many respondents answered the questionnaire? What was the geographical scope of the recruitment and how representative was it across countries?

I realize it's a lot of questions, but the article doesn't answer a single one of them, and i'm not able to conduct a search in Japanese myself (lack of language skills)

News articles rarely mention such "technical details". It is, however, such questions that decide if the outcome can be trusted at all or not. I hope you can help me to get an answer to at least a few of them, as it is not often that i see figures on that scale.

Thanks
/claus

takagi

12:20 pm on Jul 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>> The firm doing the research was Impress.
No, Impress [impress.co.jp] is only the publisher. The book is the annual white paper of IAjapan [iajapan.org] (Internet Association Japan) released last week.

My feeling is that they compiled the international figures by using reliable data from other countries combined with their own information about Japan.

I don't have this 400-page book so I'm afraid I cannot answer the other questions. Some more information can be found in the thread: Broadband customers in Japan 10 or 20 Million? [webmasterworld.com] Besides, it would be off topic in this thread.

claus

1:31 pm on Jul 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



takagi, thanks a lot, anyway :) The thread you pointed me towards is very interesting, there are definitely developments in Japan and the rest of Asia worth watching.

bill

7:22 am on Jul 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've been watching the status of these names for years, and it all boils down to the basic problem that the user must have some sort of plug-in installed in their browser for them to work. RealNames came close to being the de facto supplier of this plug-in, as they were included in every copy of IE. When Microsoft dropped them it spelt the death knell for RealNames and their plug-in, and shortly thereafter a lot of registrars in Japan and other areas stopped promoting the sale of international domain names. I know a lot of people who threw their money away on these domain names. Unfortunately the hype about these names came before it was technically possible for them to work globally. I'm not convinced this day will come very soon.