
Domain Names Forum

Web chief warns of domain name chaos
Introducing non-English letters to addresses may 'break the whole Internet'
fabricator

5+ Year Member



 
Msg#: 3164471 posted 2:51 am on Nov 22, 2006 (gmt 0)

Introducing non-English characters to website addresses could 'break the whole Internet', an expert has warned.

[smh.com.au...]

"At present there are 37 possible characters that can be used in domain names, but if non-English letters are allowed, this number would rise to 50,000 or more, said Twomey."

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

Moderator's Note: Folks, let's keep the dialogue far removed from "us versus them" or "English versus any other language or culture". The object of this thread is to raise awareness of the issues that attach to the domain name system status quo and to dialogue about the benefits or problems associated with changing the status quo.

Please do not interject any version of "us versus them" into this thread. WebmasterWorld is NOT an "us versus them" place. WebmasterWorld is a "how do we get things to work for everyone in the world wide webmaster world" place. (Someday we'll even have language translation software that will make it a bit easier to post in 120+ languages. ;0) )

Thank you. Webwork, Domain Forum Moderator

[edited by: Webwork at 4:43 am (utc) on Nov. 27, 2006]

 

Quadrille

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3164471 posted 4:39 pm on Nov 22, 2006 (gmt 0)

What he means is ICANN may not keep its stranglehold on registrations if China, Africa, etc, can introduce their own characters.

I really cannot see users losing; Chinese users will simply see 'their sites' and not western sites, and vice versa. The only potential losers - other than monopolists - are bilingual folk who may want to see every conceivable site - they may need new keyboards.

Several webs working in parallel, on the same system, with virtually no overlap is an interesting concept.

davezan

10+ Year Member



 
Msg#: 3164471 posted 5:02 am on Nov 23, 2006 (gmt 0)

A case of chicken little? :D

nativenewyorker

10+ Year Member



 
Msg#: 3164471 posted 9:55 am on Nov 26, 2006 (gmt 0)

Quadrille said:

The only potential losers - other than monopolists - are bilingual folk who may want to see every conceivable site - they may need new keyboards.

You forgot about the millions of people who are going to fall prey to phishing scams and other security threats.

Quadrille

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3164471 posted 11:06 am on Nov 26, 2006 (gmt 0)

No, I didn't - I just don't see the connection.

Why would someone fall for a phishing scam when they couldn't even read the email?

I get 20 Vietnamese spam emails a week; never tried to follow up one of them. Straight in the bin.

Why would I? More to your point, why would anyone?

nativenewyorker

10+ Year Member



 
Msg#: 3164471 posted 11:20 am on Nov 26, 2006 (gmt 0)

You are missing the point of this thread (and forum), which is international characters being used in domain names. A phishing email can use an IDN which appears to be English, but actually consists of non-English characters. The rest of the email would be in English, so unsuspecting email recipients would be at risk of accepting it as legit.

abbeyvet

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3164471 posted 11:27 am on Nov 26, 2006 (gmt 0)

If people use the fact that an email and/or a domain name is in English as a criterion for validity, they are already in trouble.

Quadrille

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3164471 posted 3:47 pm on Nov 26, 2006 (gmt 0)

People do treat an email and/or a domain name that is not in English as a possible sign of non-validity - and common sense suggests that's a fairly safe way to avoid trouble ;)

English-speaking folk would find it hard to be fooled by a phish they could not read.

How many people would wade through a foreign language spam just to joyfully click on an 'apparently' English language domain name? :)

And I'm not sure how a domain name using non-'western' characters would appear like an English-language name. I thought the point of this thread was the use of other characters, eg African and Chinese. In domain names.

Do any of them look English? Can you give an example of how this would work? I'm just not seeing it. :(

[edited by: Quadrille at 3:51 pm (utc) on Nov. 26, 2006]

Philosopher

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3164471 posted 5:01 pm on Nov 26, 2006 (gmt 0)

Quadrille,

You're missing the point. What if an English email was sent, along with a link to what looked like an English domain name (a high-profile one, for example)?

In reality, the domain name is using non-English characters and is not actually pointing at the real site but at a phishing site.

Quadrille

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3164471 posted 5:30 pm on Nov 26, 2006 (gmt 0)

I hear what you are saying; what I don't get is the 'what looked like an English domain name' part.

Surely we are talking about non-English characters; how would 'what looked like an English domain name' look that way using (eg) Chinese characters?

Show me that, and I'm 100% with ICANN, despite their profiteering and control freakery ;)

Sorry if I'm not clear; I've rephrased twice - and it looks a simple question to me!

pixeltierra

5+ Year Member



 
Msg#: 3164471 posted 6:18 pm on Nov 26, 2006 (gmt 0)

To quote from the article in question (to clear up the issue).

He said that this could create problems where, for example, a character in Urdu looks identical to one in Arabic. This would confuse the system and make it difficult to direct users to the right website every time.

Poor implementation of foreign domain names may also pose security risks, whereby fraud artists could create websites with names that appear identical to current English language sites, but in fact replace some of the English characters with similar-looking foreign characters.

I agree that allowing 50,000 chars in domain names will complicate matters immensely. On the upside, it may force average Internet users to become more savvy about phishing and other issues.

Perhaps one way to deal with the ambiguities would be to force domain registrants to choose a character set for their domain so they cannot mix and match character sets with the intent to deceive. Or at least it would be easy to have a client-side security alert that says "this domain is mixing character sets, probably with the intent to deceive you", or somesuch.
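
The client-side alert suggested above could be sketched roughly as follows. This is only an illustration: Python's standard library exposes no Unicode script property, so the sketch approximates a character's script by the first word of its Unicode name, and the function names are made up for this example.

```python
import unicodedata

def scripts_in_label(label):
    """Approximate the set of scripts used in a domain label.

    Assumption: the first word of a character's Unicode name
    (LATIN, CYRILLIC, GREEK, CJK, ...) is a good enough stand-in
    for its script in a short demonstration.
    """
    scripts = set()
    for ch in label:
        if ch.isdigit() or ch in "-.":
            continue  # digits, hyphens and dots are shared by every script
        scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def is_mixed_script(label):
    """Return True when a label draws letters from more than one script."""
    return len(scripts_in_label(label)) > 1

# "p\u0430ypal" uses a Cyrillic "a" (U+0430) in place of the Latin one.
print(scripts_in_label("paypal"))
print(scripts_in_label("p\u0430ypal"))
print(is_mixed_script("p\u0430ypal"))
```

A check like this would not catch a name written entirely in one lookalike script, so it is at best one signal among several.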

Of course there is always the problem of subdomains, which are not regulated by high level DNS.

encyclo

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3164471 posted 6:19 pm on Nov 26, 2006 (gmt 0)

what I don't get is what looked like an English domain name

Àn éxàmplé ìs thìs sèntènçè. (OK, an exaggeration, but you get the idea.) It affects not just accented western characters - which are the only ones I can use in a post here due to this board's charset - but a wide range of characters which resemble the ASCII characters currently in use.

The issue is called an "IDN homograph attack" or "Homograph spoofing attack" - a homograph being a character which closely resembles another. Some reading matter if you're interested in the details:

  • Punycode: A Bootstring encoding of Unicode for Internationalized Domain Names in Applications (IDNA) [ietf.org]
  • IDN homograph attack [en.wikipedia.org]

    I think that no-one should be sticking their head in the sand with this issue - IDNs are an important step in the internationalization of the web. Why should someone who doesn't use the western alphabet have to use ASCII characters for their main identity? What's more, if progress is not made with IDNs, then there will be several markets (eg. China amongst others) which would go forward with their own system, thus fracturing the universal web at a stroke.

    Homograph spoofing is a problem, but careful forethought will hopefully lead to a reasonably safe and standardized solution.
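
To make the homograph problem concrete, here is a small sketch using Python's built-in `idna` codec (which implements the older IDNA 2003 rules). The `to_ascii` helper name is an assumption made for this example. Two names that render almost identically turn out to be different strings with different ASCII forms.

```python
# Cyrillic "а" (U+0430) standing in for Latin "a" (U+0061).
spoofed = "p\u0430ypal.com"
genuine = "paypal.com"

def to_ascii(domain):
    """Convert a Unicode domain to the ASCII (Punycode) form sent to DNS."""
    return domain.encode("idna").decode("ascii")

print(spoofed == genuine)   # compares code points, not appearance
print(to_ascii(genuine))    # plain ASCII names pass through unchanged
print(to_ascii(spoofed))    # non-ASCII labels gain an "xn--" prefix
```

The point is that DNS itself never confuses the two names; only the human reading the rendered form does.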

    gpmgroup

    WebmasterWorld Senior Member 10+ Year Member



     
    Msg#: 3164471 posted 6:21 pm on Nov 26, 2006 (gmt 0)

    Do any of them look English? Can you give an example of how this would work? I'm just not seeing it.

    [shmoo.com...] The latest browsers trap the issue by default.

    Quadrille

    WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



     
    Msg#: 3164471 posted 8:01 pm on Nov 26, 2006 (gmt 0)

    Yes, I think there are two issues here: the 'accented' western alphabets, and the Chinese (etc) which are totally different.

    I thought the issue here was the 'totally different' character sets.

    I think that no-one should be sticking their head in the sand with this issue - IDNs are an important step in the internationalization of the web. Why should someone who doesn't use the western alphabet have to use ASCII characters for their main identity? What's more, if progress is not made with IDNs, then there will be several markets (eg. China amongst others) which would go forward with their own system, thus fracturing the universal web at a stroke.

    This would be the parallel web usage I referred to in my first post (above). I'm not sure why this would be 'fracturing'; surely it would all still be there - just requiring the relevant browser/pc/keyboard settings?

    I don't have them - but I don't need Mandarin!

    lexipixel

    WebmasterWorld Senior Member 10+ Year Member



     
    Msg#: 3164471 posted 12:39 am on Nov 27, 2006 (gmt 0)

    Why should someone who doesn't use the western alphabet have to use ASCII characters for their main identity?

    For the same reason air traffic controllers around the world communicate in English...

    But, back to the main "story" --- I don't see any reason that at a lower level or higher level the domain name couldn't be translated with an additional step between any other character set, the current A-Z,0-9 plus hyphen valid domain name character set and numeric IP addressing.

    3 step mapping rather than 2 step.
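
A rough sketch of that three-step idea (native-script name, then ASCII name, then IP address) follows. The lookup table is a hypothetical stand-in for real DNS data, the `.example` name is invented, and `192.0.2.10` comes from a range reserved for documentation.

```python
def to_ascii(unicode_name):
    """Step 1 -> 2: map a native-script name onto the A-Z, 0-9, hyphen set."""
    return unicode_name.encode("idna").decode("ascii")

# Hypothetical zone data standing in for DNS: ASCII name -> IP address.
FAKE_DNS = {to_ascii("m\u00fcller.example"): "192.0.2.10"}

def resolve(unicode_name):
    """Run both mappings and return (ascii_name, ip_or_None)."""
    ascii_name = to_ascii(unicode_name)
    return ascii_name, FAKE_DNS.get(ascii_name)   # step 2 -> 3

print(resolve("m\u00fcller.example"))
print(resolve("unknown.example"))
```

The existing name servers only ever see the ASCII form, which is why the extra step can live entirely in the client.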

    lexipixel

    WebmasterWorld Senior Member 10+ Year Member



     
    Msg#: 3164471 posted 12:54 am on Nov 27, 2006 (gmt 0)

    "...He said that this could create problems where, for example, a character in Urdu looks identical to one in Arabic."

    "Looks like" and "are" can be two different things.

    At the code level, every character has a numeric value, e.g.-

    Character #0065 = A
    Character #0654 =
    Character #0655 =

    For that reason, the argument of "confusion" does not hold water since the problem already exists in the current system.

    chr(48) = "0" .... chr(57) = "9"
    chr(65) = "A" .... chr(90) = "Z"
    chr(97) = "a" .... chr(122)= "z"
    chr(45) = "-"

    Take these two:

    A0L.COM vs. AOL.COM

    In the example on the left I used a "zero" for the second character; on the right it's an upper case letter "o".

    G00GLE.COM vs. GOOGLE.COM
    YAHOO.COM vs. YAH00.COM

    or:

    lycos.com vs. 1ycos.com

    (used lower case "L" on left, number "1" on right)...
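
The ASCII-only lookalikes above can be demonstrated with a short sketch. The folding table is a deliberately tiny, made-up one covering just the pairs mentioned in this post, not a complete confusables list.

```python
# Hypothetical folding table: digit lookalikes mapped to letters.
LOOKALIKES = str.maketrans({"0": "o", "1": "l"})

def skeleton(domain):
    """Fold a domain so that visually similar ASCII names compare equal."""
    return domain.lower().translate(LOOKALIKES)

pairs = [("AOL.COM", "A0L.COM"), ("GOOGLE.COM", "G00GLE.COM"), ("lycos.com", "1ycos.com")]
for real, fake in pairs:
    same_name = real.lower() == fake.lower()          # what DNS compares
    same_look = skeleton(real) == skeleton(fake)      # what the eye compares
    print(real, fake, same_name, same_look)

print(ord("0"), ord("O"), ord("1"), ord("l"))
```

Each pair is distinct at the code level while folding to the same skeleton, which is the same gap that homograph spoofing exploits with a far larger character repertoire.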

    Quadrille

    WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



     
    Msg#: 3164471 posted 1:35 am on Nov 27, 2006 (gmt 0)

    For the same reason air traffic controllers around the world communicate in English...

    While I agree with your other points, this one is a bit off.

    Air traffic controllers around the world communicate in English for safety reasons; having a pilot and a controller splitting language while the plane circles a busy runway could be, er, hazardous.

    But of the 2,000,000,000 Chinese speakers, it's safe to estimate that approximately 99.9999999% have no need or desire to communicate with a non-Chinese speaker.

    And for the 1,026 who do, it would be cheaper to buy them a second pc/browser/keyboard than re-equip (and re-educate) the whole nation.

    Don't you think? ;)

    It is fashionable to expect the world to fit itself around us and the language of Shakespeare, but I'm not sure the Chinese will play ball this time. :)

    [edited by: Quadrille at 1:36 am (utc) on Nov. 27, 2006]

    lexipixel

    WebmasterWorld Senior Member 10+ Year Member



     
    Msg#: 3164471 posted 2:06 am on Nov 27, 2006 (gmt 0)

    I think a system designed to outlast one or more nuclear strikes in critical hubs can survive internationalization and agree it won't be solely in English.

    It will all work out --- there are no two character encoding systems that can't be translated through the universal language... math.

    Edwin

    10+ Year Member



     
    Msg#: 3164471 posted 2:18 am on Nov 27, 2006 (gmt 0)

    He's about a million years too late as many hundreds of thousands of IDN domains have already been registered in Japanese, Chinese, Korean, French, Spanish, Arabic, Russian and many, many other languages - and many of those are already in active use by legitimate companies.

    The fact that you may not have come across any may simply be because you search the English web - if you search in Japanese on the Japanese version of Yahoo!, IDN .jp and .com domains appear fairly regularly amongst the results. Large companies (including the largest advertising company in Japan) have started to acquire the IDN of their names to use in parallel with their earlier ASCII domains.

    It's just an after-the-fact attempt by ICANN to grab a bit of the action.

    [edited by: Edwin at 2:21 am (utc) on Nov. 27, 2006]

    encyclo

    WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



     
    Msg#: 3164471 posted 2:44 am on Nov 27, 2006 (gmt 0)

    It might be of value to look at an earlier flaw identified in browsers back in 2005, as well as some of the reactions from the browser-makers to that reported threat.

    IDN spoofing in Firefox led to white-listing specific ccTLDs as being IDN enabled [webmasterworld.com] (see also this CNET article [news.com.com]). The Firefox method of handling IDN issues is outlined here:

    [mozilla.org...]

    The example given in tests regarding homograph spoofing was a Paypal site using a Cyrillic "a" (which I can't display here for technical reasons). With IDN enabled, there was scant difference between the visible URL for paypal.com versus the spoofed alternative. With IDN disabled in Firefox, the spoof domain is displayed as xn--pypal-4ve.com in the browser's address bar.
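
The whitelist approach described above might look something like the sketch below. It is not Firefox's actual code: the TLD set is invented for illustration, and real browsers apply their own lists and additional rules.

```python
# Hypothetical whitelist of TLDs trusted to police homographs at registration.
IDN_WHITELISTED_TLDS = {"de", "jp", "kr"}

def display_form(domain):
    """Return what an address bar might show for a given domain."""
    ascii_form = domain.encode("idna").decode("ascii")
    tld = ascii_form.rsplit(".", 1)[-1]
    if tld in IDN_WHITELISTED_TLDS:
        # Trusted registry: show the readable Unicode form.
        return ascii_form.encode("ascii").decode("idna")
    # Untrusted registry: fall back to the raw "xn--" form.
    return ascii_form

print(display_form("m\u00fcller.de"))
print(display_form("p\u0430ypal.com"))   # Cyrillic "a" under a non-whitelisted TLD
print(display_form("example.com"))
```

Showing the raw ASCII form makes the spoof obvious to the user, at the cost of hiding legitimate native-script names under the same TLD.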

    This is not just an issue of local alphabets and keyboards: the universal web as a concept is that every user can connect to others across the globe without relying on plugins, differing standards, alternative DNS roots, or proprietary or closed domain name systems. Even if you only speak one language, there should never be a barrier to, say, emailing someone in a different country whose email address contains non-ASCII characters.

    Homograph spoofing is one identifiable issue with IDNs which has had significant coverage, and there has been a lot of effort put into mitigating the risks posed by homographs. But the web should no longer be seen as a predominantly English-language resource, and IDNs represent a huge step forward in making the web live up to that "universal" tag. ICANN does have a role to play, and as developers I believe we should always push for open standards which address the needs of the world's internet users rather than fixating on a technical issue which can be overcome with careful implementation.

    GrendelKhan TSU

    10+ Year Member



     
    Msg#: 3164471 posted 3:13 am on Nov 27, 2006 (gmt 0)

    This thread is funny to me and IMO just highlights how much the digital divide IS a language one. It really does look like major chicken little syndrome to me. Your issues and paranoia SOUND rational... but "break" the web? Doubtful. I don't think it's sticking one's head in the sand... it's just not as big an issue in practice.

    [[ note: I've been watching and reporting on the huge gap between korean and "english web" for years. its STILL shocking how much ignorance there is as to what is going on either side. (again, big case in point.... google has about 0.7% of the market here in Korea, very easily arguably one of the most advanced as well as important markets on the web). ]]

    Guess what? There is a REAL WORLD EXAMPLE.

    Did you know korea (again, huge #1 broadband internet penetration and usage and crystal ball for much of most recent web trends) has had and been using a KOREAN CHARACTER domain system for YEARS? it works parallel to the "normal" web. (ie: you can buy a regular domain name and a "korean character" one.)

    Any problems on your end of the internet world? Ever receive a Korean domain name spam? If you did... did you know? (And yes, I'd go so far as to say the average Korean is more internet savvy than the ROTW... and spams insanely, to a globally significant degree. So yes, you would have if it was an issue.)

    nuff said.

    Still... you want paranoid? This smells more like English speakers not wanting to "lose control of the web" than any worries about fraud. A web running "parallel" to the current one is a much more accurate analogy than the "break the web" theories. Gosh forbid that the English-speaking world need to learn another language to access a big part of the "other web".

    so sayeth GrendelKhan{TSU}

    Webwork

    WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member



     
    Msg#: 3164471 posted 3:25 am on Nov 27, 2006 (gmt 0)

    Moderator's Note: Folks, let's keep the dialogue far removed from "us versus them" or "English versus any other language or culture". The object of this thread is to raise awareness of the issues that attach to the domain name system status quo and to dialogue about the benefits or problems associated with changing the status quo.

    Please do not interject any version of "us versus them" into this thread. WebmasterWorld is NOT an "us versus them" place. WebmasterWorld is a "how do we get things to work for everyone in the world wide webmaster world" place. (Someday we'll even have language translation software that will make it a bit easier to post in 120+ languages. ;0) )

    Thank you. Webwork, Domain Forum Moderator

    [edited by: Webwork at 4:43 am (utc) on Nov. 27, 2006]

    lexipixel

    WebmasterWorld Senior Member 10+ Year Member



     
    Msg#: 3164471 posted 6:04 am on Nov 27, 2006 (gmt 0)

    Same topic --- different path:

    For the past few weeks I've been working on a site for a customer who uses the umlaut (&uuml;, or &#252;, or simply "ü") in their product name.

    Let's say they have a product called:

    Blue Wüdgits

    Using only [A-Z,0-9] they registered a name like:

    bluewudgets.tld

    Now, within their web pages I encode the umlaut using &uuml;, thinking it will not trip the "See English Language Only Results" filter (yes, we are back to SEO)...

    Whadda ya think? Do the SE's consider a page containing an encoded non-English language character to be English or not?
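
As a side note on the encoding itself (not on how any search engine classifies the page, which this does not answer), the named and numeric character references decode to the same code point as the literal character. A quick check with Python's standard `html` module, using a made-up product name along the lines of the placeholder above:

```python
import html

named = "Blue W&uuml;dgits"     # named character reference
numeric = "Blue W&#252;dgits"   # numeric character reference
literal = "Blue W\u00fcdgits"   # the character itself (U+00FC)

# All three spellings decode to the same string.
print(html.unescape(named) == literal)
print(html.unescape(numeric) == literal)
print(hex(ord(html.unescape("&uuml;"))))
```

So any parser that decodes entities before analysing the text would see the same character whichever spelling is used in the markup.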

    jomaxx

    WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



     
    Msg#: 3164471 posted 6:18 am on Nov 27, 2006 (gmt 0)

    BTW, you're probably aware of this, but "ü" is traditionally transliterated into our character set as "ue" (e.g. Müller <> Mueller). Just one more crazy SEO/trademark complication to take into account.
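
That transliteration convention is easy to sketch. The table below covers only the common German cases and is an illustrative assumption, not an official mapping.

```python
# Conventional German ASCII transliterations (illustrative, not exhaustive).
GERMAN_ASCII = str.maketrans({
    "\u00e4": "ae", "\u00f6": "oe", "\u00fc": "ue", "\u00df": "ss",
    "\u00c4": "Ae", "\u00d6": "Oe", "\u00dc": "Ue",
})

def transliterate(name):
    """Rewrite umlauts and sharp s using plain ASCII letter pairs."""
    return name.translate(GERMAN_ASCII)

print(transliterate("M\u00fcller"))
print(transliterate("Gr\u00f6\u00dfe"))
```

A registrant would then have at least three candidate ASCII spellings to think about: the transliterated one, the one with the accent simply dropped, and the Punycode form of the real name.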

    webdoctor

    WebmasterWorld Senior Member 10+ Year Member



     
    Msg#: 3164471 posted 7:03 am on Nov 27, 2006 (gmt 0)

    A phishing email can use an IDN which appears to be English, but actually consists of non-English characters. The rest of the email would be in English, so unsuspecting email recipients would be at risk of accepting it as legit.

    How about a radical solution?

    In the future, all email client software will block clicking of links, perhaps won't even highlight the links in emails - i.e. you'll have to retype a URI you get in an email into your browser. (Note the big banks, eBay, PayPal and friends have been recommending we do this for ages).

    You could perhaps imagine allowing copy-and-paste but with a warning ("You are copying characters from an alternative character set into IE's address bar - are you sure? Yes/No").

    This would pretty much solve the problem at the expense of making life slightly harder for all the legitimate users. But hey, that's how airport security works too.
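
The copy-and-paste warning could be approximated along these lines. This is a sketch only: the function name and message wording are invented, and a real client would need to hook into its own paste handling.

```python
from urllib.parse import urlsplit

def paste_warning(url):
    """Return a warning string if the URL's host has non-ASCII characters, else None."""
    host = urlsplit(url).hostname or ""
    if host.isascii():
        return None
    # List the offending code points so the user can see what was hidden.
    odd = sorted({ch for ch in host if not ch.isascii()})
    codes = ", ".join("U+%04X" % ord(ch) for ch in odd)
    return "Warning: host contains characters outside ASCII: " + codes

print(paste_warning("http://paypal.com/login"))
print(paste_warning("http://p\u0430ypal.com/login"))
```

A blanket warning like this would also fire for every legitimate IDN, which is the usability cost the post acknowledges.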

    lexipixel

    WebmasterWorld Senior Member 10+ Year Member



     
    Msg#: 3164471 posted 7:27 am on Nov 27, 2006 (gmt 0)

    I started reading up on Asian fonts, and saw Windows Vista supports East Asian characters natively.

    So I backtracked to find out how it's supported.

    Got real close to home and found the company providing the fonts is a "neighbor"... and they had a press release on their site about the deal to provide the fonts.

    WOBURN, Mass., USA, Sep. 18, 2006 Monotype Imaging Inc., a global leader in font and imaging technologies, has acquired China Type Design Limited, a typeface design and production company based in Hong Kong. As a wholly owned subsidiary of Monotype Imaging, China Type will help lead expansion into Asian consumer electronics and printer markets which require scalable, multilingual text solutions.
    [katakanafonts.com...]

    vite_rts

    5+ Year Member



     
    Msg#: 3164471 posted 1:28 pm on Nov 27, 2006 (gmt 0)

    @GrendelKhan

    Are you saying that you can type in hangul into the address bar?

    secondly,

    Are Korean domain extensions available to foreign buyers?

    thirdly

    My pc is configured for Chinese and Korean script, I am kinda studying them, so would I be able to visit a hangul name site?

    cheers

    the_nerd

    WebmasterWorld Senior Member 10+ Year Member



     
    Msg#: 3164471 posted 4:04 pm on Nov 27, 2006 (gmt 0)

    Introducing non-English letters

    You might accuse me of splitting hairs - but this is the first time that I read about "English letters". I always thought those were Latin ones?

    GrendelKhan TSU

    10+ Year Member



     
    Msg#: 3164471 posted 4:20 pm on Nov 27, 2006 (gmt 0)

    @GrendelKhan
    Are you saying that you can type in hangul into the address bar?

    secondly,

    Are Korean domain extensions available to foreign buyers?

    thirdly

    My pc is configured for Chinese and Korean script, I am kinda studying them, so would I be able to visit a hangul name site?

    cheers

    1. yes.
    2. yes.
    3. yes.

    woot!

    Quadrille

    WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



     
    Msg#: 3164471 posted 4:24 pm on Nov 27, 2006 (gmt 0)

    Introducing non-English letters

    You might accuse me of splitting hairs - but this is the first time that I read about "English letters". I always thought those were Latin ones?

    I'd certainly accuse you of that ;)

    You are right, of course, but in this thread I think it's making a distinction between 'English language specific Latin characters' and 'Latin characters which also include those with accents etc.' as opposed to 'other' characters, such as Chinese and African.

    I hope that's cleared that up :)
