
Webmaster General Forum

This 38 message thread spans 2 pages.
Creating a Multi-Lingual Website
Subdomains? Folders? GET?

 1:04 am on Mar 27, 2008 (gmt 0)

There are many ways of doing a multilingual webpage, but I was wondering what works best. Please note I am not talking about using an auto-translator. I have language packs using the define() system in PHP. However, I would like to know how I should go about actually implementing the languages.
I see many websites with es.WebSite or whatever, and that to me looks the slickest. I figure I could do that and then have a .htaccess file or something, but my question is how exactly should I go about implementing multiple languages easily using subdomains (or, if I have to, folders or GET, like site.com/index.php?es)?



 2:27 am on Mar 27, 2008 (gmt 0)

Welcome to WebmasterWorld teamcoltra.

For SEO purposes and to reflect local market preferences I use local ccTLD domains for each language site. Make sure each site is using the proper character encoding so that it will display properly.

> implementing multiple languages
I have no idea about your back end or what you're using so that would be a tough one.


 3:58 pm on Mar 27, 2008 (gmt 0)

I was thinking of somehow using .htaccess so that anyone who uses es.mysite.com is just kept on mysite.com, but with maybe a ?es added to the end of the script. Something like this, so it looks clean but is more functional. Would this work?
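
One possible reading of this idea, leaving the rewrite rule aside, is to have the script itself look at the requested host name. The following is a minimal Python sketch of that lookup, not a tested setup (in PHP the same check would read $_SERVER['HTTP_HOST']). The SUPPORTED set, DEFAULT_LANG and the function name are invented for illustration.

```python
SUPPORTED = {"en", "es", "de"}   # hypothetical list of available language packs
DEFAULT_LANG = "en"              # hypothetical fallback language


def lang_from_host(host: str) -> str:
    """Return the language implied by the leftmost host label.

    'es.mysite.com' gives 'es'; any unknown label (www, the bare domain)
    falls back to DEFAULT_LANG.
    """
    hostname = host.split(":", 1)[0].lower()   # drop an optional :port
    first_label = hostname.split(".", 1)[0]
    return first_label if first_label in SUPPORTED else DEFAULT_LANG
```

With this approach the subdomains can all point at the same document root, and no ?es parameter needs to appear in the URL.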


 4:19 pm on Mar 27, 2008 (gmt 0)

You could implement content negotiation. Browsers send the preferred languages of a user to every website they open and the site can decide which content to return based on these variables. In that case www.example.com/index would return different content for a Spanish visitor than for an English one.

There are some search engine duplicate content and index problems [webmasterworld.com] that might occur if you don't implement it correctly, but I have it working with good SE rankings for all language variants on a few domains.


 8:11 am on Mar 28, 2008 (gmt 0)

Content negotiation gone wrong can be a real minefield. You've really got to know what you're doing or you can get some very unhappy users and less than glorious positioning in the SERPs. I tend to stay clear of sites that attempt to set preferences like that. They're usually wrong.

For example, just because I'm surfing from a Japanese IP doesn't mean I want content in Japanese.


 9:26 am on Mar 28, 2008 (gmt 0)

you might be ok making some initial guesses based on browser preference or ip location or referrer but you should make it easy to change language and keep the language selection for return visits if possible.
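
A rough sketch of that order of precedence, assuming the visitor's explicit selection is stored somewhere (a cookie, for instance) and the guess comes from browser preference or similar. The names SUPPORTED, DEFAULT_LANG and choose_language are hypothetical.

```python
from typing import Optional

SUPPORTED = ("en", "es", "fr")   # hypothetical list of site languages
DEFAULT_LANG = "en"              # hypothetical fallback


def choose_language(saved_choice: Optional[str], guess: Optional[str]) -> str:
    """Pick a language: the visitor's saved choice wins, then the guess, then the default."""
    for candidate in (saved_choice, guess):
        if candidate in SUPPORTED:
            return candidate
    return DEFAULT_LANG
```

The point of the ordering is that a guess is only ever used until the visitor has said otherwise.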


 9:34 pm on Mar 28, 2008 (gmt 0)

Hi Bill,
content negotiation should not be based upon an IP
the way it needs to be set is by searching for the browser cookie pref lang
or analyzing browser average lang pages

Have a look Here [w3.org]

indeed it calls for experience and knowledge
however it could work and it does if well implemented


 7:28 am on Mar 29, 2008 (gmt 0)

>>indeed it calls for experience and knowledge
however it could work and it does if well implemented

I live in one country, my computer is set to the language of another, my browser is set to my mother tongue and I can use three languages and get by in more.

phranque's point about user-changeable settings is the crucial one here (note the user rather than the site-owner).


 7:52 am on Mar 29, 2008 (gmt 0)

and similar to stever's point, the w3c negotiation doc linked by henry0 warns to give the users, not the browsers, what they want.

definitely a good read to get a sense of the issues.

oh and welcome to WebmasterWorld [webmasterworld.com], teamcoltra!


 10:12 am on Mar 29, 2008 (gmt 0)

bill, do you heavily interlink your different domains?


 10:30 am on Mar 30, 2008 (gmt 0)

Heavily? That depends on the sites. If they're direct translations then I make it easy for the user to find the other language versions.

Mathieu Bonnet

 11:01 am on Mar 30, 2008 (gmt 0)

As far as I'm concerned, I use URLs like "/en/Home.xhtml" or "/fr/Accueil.xhtml". I redirect "/" to either of them using Apache's content negotiation system (and I do not use negotiation for any other URL). Then at the very top of every page, I have a right-aligned paragraph saying "(Switch this page language: français)" (the text is in the current language, but the language name is localized). The language name links to the page in the selected language.
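
Because the page names differ per language (Home vs Accueil), the switch link needs some table that pairs each page with its translations. Here is a minimal Python sketch of such a table; the structure, the Contact row and the function name are assumptions made for illustration, not a description of the poster's actual setup.

```python
# Hypothetical table: one row per page, one entry per language.
PAGES = [
    {"en": "/en/Home.xhtml", "fr": "/fr/Accueil.xhtml"},
    {"en": "/en/Contact.xhtml", "fr": "/fr/Contact.xhtml"},
]


def switch_links(current_url: str) -> dict:
    """Return {language: url} for the other language versions of current_url."""
    for row in PAGES:
        if current_url in row.values():
            return {lang: url for lang, url in row.items() if url != current_url}
    return {}   # unknown page: no switch links to offer
```

For "/en/Home.xhtml" this yields the French URL to put behind the "français" link.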

I don't like subdomains (they dilute the domain name importance), and ccTLDs are for countries, not languages (of course, if I had country-dependent websites, I would use the proper ccTLDs -or maybe use three-letter country codes, in the URLs, if I don't have enough money).

The Apache content negotiation uses the language configured by the user. The default language is the language of the browser interface, which should generally be quite OK.

If it is not, the user will have to use the link at the top of the home page, unless they bookmark the home page in their preferred language.

Internally, my directory structure follows the URLs ("./htdocs/{en,fr}/*"). In these directories, I have my static content. For dynamic content, I use mod_rewrite to check a cache directory, first, and if the page is not cached, I'll generate it and cache it (by redirecting, internally, to a PHP script). For personalized content, I just skip the cache, and return the generated page directly.
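
The cache-first flow described above is done with mod_rewrite in the post. Purely to illustrate the logic, here is the same decision written as a small Python function; cache_dir, the generate callback and the personalized flag are hypothetical names, and url_path is assumed to be already validated (no "..").

```python
from pathlib import Path
from typing import Callable


def serve_page(cache_dir: Path, url_path: str, generate: Callable[[str], str],
               personalized: bool = False) -> str:
    """Return the page for url_path, using the cache unless the page is personalized."""
    if personalized:
        # Personalized content skips the cache entirely.
        return generate(url_path)
    cache_file = cache_dir / url_path.lstrip("/")
    if cache_file.is_file():
        return cache_file.read_text(encoding="utf-8")
    html = generate(url_path)
    # Store the freshly generated page so the next request is a cache hit.
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(html, encoding="utf-8")
    return html
```

Since the language is part of the path ("/en/...", "/fr/..."), each language version gets its own cache entry without extra work.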

[edited by: Mathieu_Bonnet at 11:27 am (utc) on Mar. 30, 2008]


 11:09 am on Mar 30, 2008 (gmt 0)

hi, we released a second language of our site last year and have done the following:

1.) we installed NGINX (an open source high-speed proxy) on a Linux box in the country of the new language (.de)

2.) we have "tagged" every piece of language in our 300 PHP pages (that took over a week) with _translate("tag","This is the text") and run a dictionary where the languages are stored

3.) we have installed memcached, a memory cache for PHP, where you can put the language to avoid database access

4.) the reverse proxy calls "de.someenglishdomain.com", which points to the SAME pages as www. Our PHP header sees the de. host and switches to German

There are some other things to do, but that is the basic setup, based on the idea that local sites IN the country have a better chance of ranking than German pages hosted in the States. It looks like it is working fine!
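
The tagging in step 2 can be pictured roughly as follows. This is only a sketch in Python of a tag lookup with the original text as fallback; the DICTIONARY contents and the function signature are assumptions, and in the setup described above the dictionary would be loaded from the database or memcached instead of being hard-coded.

```python
# Hypothetical in-memory dictionary: {language: {tag: translated text}}.
DICTIONARY = {
    "de": {"greeting": "Willkommen"},
}


def translate(tag: str, default_text: str, lang: str) -> str:
    """Return the stored translation for tag, or the default text written in the page."""
    return DICTIONARY.get(lang, {}).get(tag, default_text)
```

Falling back to the default text means an untranslated tag shows the original wording instead of an empty string.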



 11:18 am on Mar 30, 2008 (gmt 0)

> For SEO purposes and to reflect local market preferences

Bill, this is related to your first response.

What SEO advantages does a separate TLD have over using a separate directory? With www.example.com/fr vs www.example.fr, won't you be better off cost- and SEO-wise going with the former?

In addition to that, I am not sure that language necessarily implies a location.

Having a Spain TLD for Spanish might imply that the site is meant to target visitors in Spain but less so the rest of the Spanish-speaking population, while www.example.com/es could imply a Spanish version of the same site content regardless of the visitor's location.


 1:35 pm on Mar 30, 2008 (gmt 0)

> What SEO advantages does a separate tld have over using a separate directory?

(I'm straying away from our focus on language here, but language and localization are related...)

You will do much better in the country specific SERPs if you:

1) use a ccTLD
2) host in target country
3) have target country whois
4) set webmaster central country settings to target country

The above can be costly and complicated to manage, but in my experience the benefits outweigh the costs.

#2-4 are dependent on #1, although the (cheaper) alternative to #1 is to use subdomains (each subdomain could be set up to resolve to a different IP so it can be hosted in the target country). However, I believe that this is much less effective than using ccTLDs.


 2:11 pm on Mar 30, 2008 (gmt 0)

> You could implement content negotiation.

Don't do this. Not with languages.

> For example, just because I'm surfing from a Japanese IP doesn't mean I want content in Japanese.

Quite. And just because I'm surfing from a UK IP doesn't mean I don't want the content in Japanese.

You can't pre-empt what language the user would like their content in so it's better not to try. Instead just give the user as much control as possible.

Is there any sort of industry consensus over which of these would be best practice for a Spanish translation of a page about Augusto Pinochet (former Chilean dictator) on a predominantly English language website published in the US:

1) www.mysite.es/myfolder/mypage.html
2) www.mysite.com/es/myfolder/mypage.html
3) es.mysite.com/myfolder/mypage.html
4) www.mysite.es/myfolder/mypage.php?lang=es&loc=us

5) Something else, possibly involving .cl (the ccTLD for Chile) ...?

I'm thinking either 2), hosted in the USA, or 3) hosted in Chile (if all the articles on the site are Chile related), or else 3) hosted in Spain.


 2:54 pm on Mar 30, 2008 (gmt 0)

You will find a discussion about this here: [webmasterworld.com]. It is old, but still very pertinent.

If you take a look at the site in my profile and look at how the languages have been handled, including links on all pages to the corresponding page in all other languages, you will find an approach that has been paying off handsomely for the past six years. These sites have excellent rankings in Google - typically top 5 - for all of the most important keywords.


 4:23 pm on Mar 30, 2008 (gmt 0)

rencke, thanks for that - that was very helpful.

I read through your excellent article and had a look at your site.

Would you then consider that in the context of the example I gave above, the best of all solutions would be:

5) www.misitio.cl/micarpeta/mipagina.html

with the site hosted in Chile?


 4:37 pm on Mar 30, 2008 (gmt 0)

One may simply explain to a user how to set up the browser's language preference, which is done once and for all in two clicks.
Then content negotiation will work fine.
If it does not work in such a scenario, then I need to understand why...

<edit> Not to mention that if you purchase a machine in any country, your language preference is set by default to that of the country of purchase.
Nevertheless you may set it any way that pleases you,
even with a first choice, a second choice, etc. </edit>


 4:53 pm on Mar 30, 2008 (gmt 0)

Because, henry0, your answer is exactly the same as explaining to people that they need Flash to see your website, or that they need to have their screen set to 1024.

It means that you think that your website and you are more important than your viewers and customers.


 5:21 pm on Mar 30, 2008 (gmt 0)

OK, forget about the first part;

Did you read my edit? What's wrong with the default language? Content negotiation will pick it up ... Done! No need to click on anything.


 5:28 pm on Mar 30, 2008 (gmt 0)

Why not just let people click on what language or what subdomain they want to see? I might want to read what there is in German. Or I might have a French friend over who I have recommended your site to. Or I might have...


 5:33 pm on Mar 30, 2008 (gmt 0)

Is it not advisable to run two pages:

example.com/en/good
example.com/fr/bon

Both with the same content but in their respective languages?

Would these be seen as duplicate content?

I'd have thought that no matter where the user is in the world, they would search in their preferred language and thus find the appropriate version of your site's pages.


 5:41 pm on Mar 30, 2008 (gmt 0)


i totally agree:

if you register a .de via godaddy (e.g.; not sure about the other registrars), you get a German trustee as the whois info. The same goes for most other European countries.

if you then get a VPS with a reverse proxy, you are fully targeted in your country of choice:

1.) tld is there
2.) whois is there
3.) IP is there
4.) language is there

if you are Linux savvy, this costs below $200 per year, and local domains just rank better. I have several .coms in Europe as proof that they are NOT ranking in US SERPs - despite incoming links and weak keywords!



 8:58 am on Mar 31, 2008 (gmt 0)

> Would these be seen as duplicate content?

I am not really sure. Would the same content in different languages generally be considered duplicate content? I think it shouldn't be, especially if you consider the generally accepted approach of creating content for visitors and not search engines.


 10:37 am on Mar 31, 2008 (gmt 0)

> Would these be seen as duplicate content?

quite simply, no.
as a matter of fact, if done properly, it is probably not even a direct translation!


 11:34 am on Mar 31, 2008 (gmt 0)

> Would you then consider that in the context of the example I gave above, the best of all solutions would be: www.misitio.cl/micarpeta/mipagina.html with the site hosted in Chile?

Yes, assuming that both misitio and /micarpeta stand for important keywords. I have not found that hosting is really an issue. All of our sites are hosted in the UK, and that has not represented a problem for us, even though 20 language areas are involved.

> example.com/en/good, example.com/fr/bon Both with the same content but in their respective languages? Would these be seen as duplicate content?

NO! Don't worry! All of the sites in my profile are identical except for language and that has not hurt us one bit in six years. But you may want to reconsider the name of the folders. /en and /fr are not good solutions SEO-wise. /my-service-in-english and /mon-service-en-francais will help rankings in Google if "my-service" and "mon-service" are the #1 most important keywords in each language.


 11:55 am on Mar 31, 2008 (gmt 0)

One site that demonstrates multiple languages very well - is wikipedia.org - they use a subdomain for each language (two letter) - eg. en.wikipedia.org pt.wikipedia.org - I think we can all agree that this approach hasn't harmed their SERPs.


 12:10 pm on Mar 31, 2008 (gmt 0)

> I think we can all agree that this approach hasn't harmed their SERPs

If you can persuade Google to turn up your site to the very top, then you have nothing to worry about. A press release informing the world that you intend to start a search engine that will make Google obsolete might do the trick. ;-)


 7:21 pm on Apr 1, 2008 (gmt 0)

I really don't see how you can tie a ccTLD to a language. Many countries have multiple official languages, e.g. Canada and Switzerland, and in some areas other unofficial languages are also very prevalent. It's better not to use ccTLDs unless you do business in that country. So if you have a local business selling widgets in Zoozooland but ship them worldwide, you may want to stick to example.zz and translate the website into multiple languages, but if you sell widgets worldwide with offices in many countries, you may have separate domains for each country, and translate some of these to the local range of languages. See major computer vendors for the latter.

While on the subject: internationalization (i18n) is the concept of making your software/website ready to be translated into different locales; a locale denotes the combination of language and conventions for a given culture; and localization (l10n) is the act of translating your software/website to a specific locale. So you have the locales en-US, en-CA, fr-CA, fr-FR, etc. Localization may also involve changing date/time formats, colors and icons. For your particular product, you may want to reduce a "locale" to just a language.
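
Reducing a locale to a language can be done as a simple fallback chain: try the full locale, then the bare language, then a site default. A minimal sketch in Python, where AVAILABLE, DEFAULT_LOCALE and the function name are hypothetical:

```python
AVAILABLE = {"en", "fr", "fr-CA"}   # hypothetical set of translations on the site
DEFAULT_LOCALE = "en"               # hypothetical site default


def resolve_locale(requested: str) -> str:
    """Fall back from a full locale ('fr-CA') to its language ('fr') to the default."""
    parts = requested.replace("_", "-").split("-")
    language = parts[0].lower()
    if len(parts) > 1:
        full = language + "-" + parts[1].upper()
        if full in AVAILABLE:
            return full
    if language in AVAILABLE:
        return language
    return DEFAULT_LOCALE
```

Here fr-CA is served as is, fr-FR falls back to the generic French version, and anything unknown gets the default.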

The OP asked about multilingual websites. Leaving SEO aside for now, either using subdomains (en.example.zz), subfolders (www.example.zz/en/), or a parameter on the query string (www.example.zz/?l=en) works fine. Subdomains: keep in mind that some people will always add a www. to the front of the domain name; you may also need to do additional handling for things like carrying cookies across. Subfolders: will keep your URLs clean, i.e. without any ?xx=... in them. Query string parameters: if most of your URLs already have a bunch of query parameters, you might as well add the language parameter to them; otherwise use subfolders to keep the URLs clean. For the first two methods, you can parse the $_SERVER['SCRIPT_URI'] parameter in PHP (set by Apache's mod_rewrite; elsewhere, HTTP_HOST and REQUEST_URI carry the same information) to figure out which locale is required. But then again... you may also want to translate the domain name itself: example.com, exemple.com, ejemplo.com.
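
For the subfolder method, figuring out the locale comes down to looking at the first path segment. A small Python sketch of that step follows (the equivalent in PHP would parse the request URI); the LOCALES set and the function name are made up for illustration.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

LOCALES = {"en", "fr", "es"}   # hypothetical locales the site offers


def split_locale(url: str) -> Tuple[Optional[str], str]:
    """Return (locale, remaining path) for a subfolder-style URL.

    The locale is None when the first path segment is not a known locale.
    """
    path = urlsplit(url).path
    first, _, rest = path.lstrip("/").partition("/")
    if first.lower() in LOCALES:
        return first.lower(), "/" + rest
    return None, path or "/"
```

A None result is the case where the site still has to decide which locale to show.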

I don't pay much attention to SEO, but I'd ignore any search engine that would penalize a website because it has the same webpage in multiple languages. Such an engine would penalize all Canadian government websites for one thing, many major businesses in Canada, Switzerland, etc., and many websites catering to international sports... As has been pointed out, you may want to go the extra mile and translate the folder and page names as well, and use mod_rewrite or something to handle the differences.

For deciding what locale to use initially, you only need to do this if someone visits a URL where the locale is not indicated, so you'd do it for www.example.com but not www.example.com/somelocale/buy-widgets.html. For this, definitely don't use the visitor's IP, but it's a safe bet to use HTTP_ACCEPT_LANGUAGE to make an educated guess for the user's preferred language (keep in mind some browsers won't send this at all). So if someone goes to www.example.com, either show a splash page and let them choose from the locales available, or use HTTP_ACCEPT_LANGUAGE and direct them to www.example.com/preferredlocale/index.html, or simply go to www.example.com/defaultlocale/index.html. No matter the method you use, always give the user the option of switching languages on every page.
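
The educated guess from HTTP_ACCEPT_LANGUAGE could look roughly like this. It is a simplified Python sketch that handles the usual "tag;q=value" form of the header and falls back to a default when the header is missing; SUPPORTED, DEFAULT_LOCALE and both function names are assumptions for illustration.

```python
from typing import List, Optional

SUPPORTED = ["en", "fr", "es"]   # hypothetical locales the site offers
DEFAULT_LOCALE = "en"            # hypothetical default


def parse_accept_language(header: str) -> List[str]:
    """Return the language tags of an Accept-Language header, highest q-value first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag or tag == "*":
            continue
        quality = 1.0                      # q defaults to 1 when absent
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0              # treat a malformed q as "not wanted"
        if quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def pick_locale(header: Optional[str]) -> str:
    """Return the first supported language the browser asks for, else the default."""
    for tag in parse_accept_language(header or ""):
        primary = tag.split("-", 1)[0]     # 'fr-ca' counts as 'fr'
        if primary in SUPPORTED:
            return primary
    return DEFAULT_LOCALE
```

The result would only be used to pick the target of the redirect from www.example.com; URLs that already carry a locale are left alone.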

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved