Your question pushed me to try playing around with the inurl: operator followed by words with non-English characters. It looks like Google is handling the situation quite well, which surprised me! I am sure that we're moving to a future where the full variety of characters used across the globe is handled, but I didn't know we'd come this far already.
However, you may want to see links from directories, social media, other websites and so on - and they may not be so ready to deal with non-English characters directly in the URL.
It would be excellent to hear from someone who has been using this kind of URL about their experiences.
[edited by: tedster at 11:22 am (utc) on June 12, 2008]
It is good to use English characters in URLs. But if you must use non-English characters in a URL, you should percent-encode them as UTF-8, for example "%C3%BC" for "ü". There are 2 benefits to doing this encoding.
1. Google can index these URLs directly, with no need for Google to encode them again.
2. When someone copies your page or links to your page, the URL cannot get changed along the way.
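For illustration, here is a minimal sketch of that encoding step using Python's standard library. The path "/blåbær/grøt.html" and the helper name encode_path are made up for the example; they do not come from anyone's real site.

```python
from urllib.parse import quote, unquote


def encode_path(path: str) -> str:
    """Percent-encode a URL path as UTF-8 bytes, leaving '/' untouched."""
    return quote(path, safe="/", encoding="utf-8")


# Hypothetical path containing Scandinavian letters.
encoded = encode_path("/blåbær/grøt.html")
print(encoded)           # /bl%C3%A5b%C3%A6r/gr%C3%B8t.html
print(unquote(encoded))  # /blåbær/grøt.html
```

Each non-ASCII letter becomes two escaped bytes here, which is why the encoded form looks so much longer than the readable one.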
The downside of percent-encoded UTF-8 is that it looks (really) bad in the address bar.
Nothing says you must use any particular kind of URL. However, there are upsides:
Non-English characters in the URL might be good for SEO.
Visitors might prefer to see or enter part of the address in their own language.
Nobody with further experience on that matter? :(
I am from Scandinavia, and here we have a few non-English letters (æ ø å etc.). I have had good results transliterating these letters as follows:
æ = ae
ø = oe
å = aa
For instance, if you search for "exæmple" (not a real word) and the SERP contains a URL like www.example.com/exaemple.html, the word exaemple will be highlighted in the SERP URL.
So at least Google knows that "ae" could be the same as "æ".
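A rough sketch of that transliteration in Python, using only the three mappings listed above. The function name transliterate and the decision to lowercase first are my own assumptions, not part of the poster's setup.

```python
# Mapping taken from the list above; lowercase letters only.
SCANDINAVIAN_TO_ASCII = str.maketrans({"æ": "ae", "ø": "oe", "å": "aa"})


def transliterate(slug: str) -> str:
    """Lowercase the slug, then swap æ/ø/å for their ASCII spellings."""
    return slug.lower().translate(SCANDINAVIAN_TO_ASCII)


print(transliterate("exæmple"))  # exaemple
```

Lowercasing first means Æ, Ø and Å are covered without separate table entries. Any other letters (ä, ö, ü and so on) pass through unchanged and would need their own entries.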
Use UTF-8 on your own site so that it all works OK.
You may find that people have trouble linking to you when they paste the URL into their system, and their system uses a different character set.
Make sure that your 404 handling is perfect for any such duff incoming links.
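To make the character set problem concrete: "ø" is the single byte %F8 in Latin-1 but the two bytes %C3%B8 in UTF-8, so a link created on a Latin-1 system points at a different URL. One possible way to rescue such links before they hit the 404 page is sketched below. The Latin-1 fallback and the name canonical_path are assumptions for the example; incoming links from other character sets would need a different fallback.

```python
from urllib.parse import quote, unquote_to_bytes


def canonical_path(raw_path: str) -> str:
    """Return the UTF-8 percent-encoded form of an incoming request path."""
    raw = unquote_to_bytes(raw_path)
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 was pasted as Latin-1.
        text = raw.decode("latin-1")
    return quote(text, safe="/")


print(canonical_path("/gr%F8t.html"))     # /gr%C3%B8t.html
print(canonical_path("/gr%C3%B8t.html"))  # /gr%C3%B8t.html
```

If the canonical path differs from the requested one, the server could answer with a 301 to the canonical path and only fall through to the 404 page when no such page exists. The guess is not foolproof, since a few Latin-1 byte sequences also happen to be valid UTF-8.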
The ODP migrated 5 million entries on half a million pages to UTF-8 a few years ago. There are some references to that on the web that might be worth a further read.
i am playing around with utf-8 urls. make sure they are correctly encoded and also make sure google notices that everything is utf-8. i messed that part up, so i had to 301 later.
the reason that i am trying this is that it shows up nice in the google serps. it can end up giving you a headache, though.
Oh yes, UTF-8 URLs are new URLs, so you will need a redirect from the old to the new.
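As a sketch of what such a redirect could look like, assuming a simple lookup table from old paths to new ones. The paths, the OLD_TO_NEW table and the redirect_for function are all hypothetical; a real site would more likely do this in its server configuration.

```python
from urllib.parse import quote

# Hypothetical mapping from old transliterated paths to new UTF-8 paths.
OLD_TO_NEW = {
    "/groet.html": "/grøt.html",
    "/blaabaer.html": "/blåbær.html",
}


def redirect_for(path: str):
    """Return (301, Location value) for a moved path, or None if it has not moved."""
    new_path = OLD_TO_NEW.get(path)
    if new_path is None:
        return None
    # The Location header must be ASCII, so percent-encode the target as UTF-8.
    return 301, quote(new_path, safe="/")


print(redirect_for("/groet.html"))  # (301, '/gr%C3%B8t.html')
```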
g1smd, you said:
|You may find that people have trouble linking to you when they paste the URL into their system, and their system uses a different character set. |
The thing is that people with websites in my own language wouldn't have problems linking to me (am I correct?), and these are the main linkers I expect for my new content. The biggest problem I have with UTF-8 encoded links is that they look really ugly in the browser's address bar. Any thoughts on this?