|Alternate language for exact same URL with ASP.NET localization?|
This is a new one for me. I have a client with an English-language site that's been around a long time, well-indexed and ranked by Google, etc. It's a Commerce Server / ASP.NET site.
So I just learned that they're planning on creating a Spanish-language version of the site, but instead of just duplicating the pages in another directory or domain, they're going to use ASP.NET's globalization & localization functionality to set the languages on the fly based on the user's language settings in their browser.
For example, if, upon requesting the homepage example.com/default.aspx, the visitor's browser's Language Preference setting is set to "Spanish (Mexico) [es-mx]", the server would deliver the Spanish-language version of that page instead of the usual English version. Either way, the URL would remain the same: example.com/default.aspx. Re the content itself, for any page on the site, the Spanish-language version would supposedly have the same content, just translated from English into Spanish.
My initial thought is that, while this setup may not have an impact on their current English-language rankings, it would at the very least prevent indexing and ranking for the Spanish-language versions of the pages. But then I start thinking about whether this would be interpreted as cloaking, serving up one version of content to the spiders but another to the visitors, etc., which would set the site up for penalization.
Has anybody ever heard of this type of language swapping via ASP.NET (or any other platform)? What impact would you expect this to have on the site's current rankings, let alone on future Spanish-language rankings?
You're right that current English rankings may not suffer. But the translated pages need to have their own URL. Remembering that Googlebot will use an IP assigned to the US, you need a URL that will serve Googlebot the translated content. Using the exact same URL can completely bury that content for search purposes.
Thanks tedster! When you say that Googlebot will use an IP assigned to the US, that doesn't mean that the Spanish-language pages need to be on a separate IP block, does it? I would *think* it would be OK for the site to be in a separate directory, e.g., example.com/es/..., and Google would then be able to index it properly (assuming the new directory has a good sitemap, etc.)
No, it doesn't mean anything like that, thank goodness! It just means that server configurations need to be careful if they serve content based on the location of the user-agent's IP address. There certainly is a risk of unintentional cloaking and all the problems that can bring.
OK, thanks very much for confirming (and for the speedy reply!).
...And may I follow up, for tedster or anyone else who has any experience with these types of issues?
When I showed the client this post, they then asked whether it might make any difference to Google that they would be serving the content for a page based on the "accept-language" request header.
Here's what they mean: When I change the Languages preference in Firefox to make "Spanish/Mexico [es-mx]" the top preferred language, and I then browse to Google.com, Google sends me back the Google Espanol homepage, but it has the exact same URL as the English-language homepage. In other words, Google itself is smart enough to be able to deliver content for a URL based only on the presence of the Accept-Language header in the request.
Based on the fact that Google itself understands the Accept-Language header, the client wonders whether that might mean that Google is "smart" enough to index same-URL-different-language content separately.
There are two separate issues here - you (or your client) are mixing up Googlebot (the spider) and GWS (the web server used to serve Google's site).
Google's web server uses the Accept-Language header sent by your browser to deliver content in your preferred language. However, this does not mean that Googlebot sends an Accept-Language header when visiting your site - it doesn't, so it will be served the content in whatever your server determines is the default language. In this scenario, the Spanish-language content would remain invisible to Googlebot and thus would not be present in the index.
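To make the point above concrete, here is a rough, platform-neutral sketch in Python (not ASP.NET, and not anybody's actual code) of how a server might pick a language from Accept-Language and fall back to a default when the header is missing. The names `SUPPORTED`, `DEFAULT_LANGUAGE`, `parse_accept_language` and `choose_language` are made up for illustration.

```python
# Hypothetical sketch: pick a page language from the Accept-Language header.
# SUPPORTED and DEFAULT_LANGUAGE are assumptions for this example only.
SUPPORTED = ("en", "es")
DEFAULT_LANGUAGE = "en"


def parse_accept_language(header):
    """Return the language tags in the header, most preferred first."""
    if not header:
        return []
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # a tag without an explicit q-value counts as q=1
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat an unreadable q-value as "not wanted"
        if quality > 0:
            # Sort by quality (descending), then by original position.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def choose_language(header):
    """Return the first supported language, or the site default."""
    for tag in parse_accept_language(header):
        primary = tag.split("-")[0]  # "es-mx" -> "es"
        if primary in SUPPORTED:
            return primary
    return DEFAULT_LANGUAGE


print(choose_language("es-mx,es;q=0.8,en;q=0.5"))  # es
print(choose_language(None))                       # en (no header at all)
```

The second call is the case described above: a request with no Accept-Language header only ever gets the default-language page, so the other language is never seen at that URL.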
The fact that Google uses the technique your client is suggesting on their own site is not a recommendation in itself - Google is one website which doesn't have to worry about how it is indexed in its own index. :)
Your first question is: are you targeting the language (Spanish) or the location (Mexico)?
If I understand correctly, you are targeting Mexico. In this case, what you can do is use the .com.mx version of your domain and point it to the same server. Then in your programming logic, you can use the Accept-Language header as suggested to deliver the content in the preferred language of the user.
The difference is how you handle requests when no Accept-Language header is present (such as with Googlebot). You test the Host: header sent by the user-agent. If the request is for your .com then you serve English by default, but if the request is for .com.mx then you serve Spanish by default. In this way, Googlebot will see two sites, the .com in English and .com.mx in Spanish. The local domain will help in being included in the "páginas de México" local index.
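A minimal sketch of that Host-based fallback, written in Python purely for illustration (the real site would do this in ASP.NET). The hostnames in `HOST_DEFAULTS` and the function names are assumptions, the headers are assumed to arrive as a plain dict with these exact key names, and for brevity only the first Accept-Language entry is examined.

```python
# Hypothetical mapping from requested hostname to default language.
HOST_DEFAULTS = {
    "example.com": "en",
    "www.example.com": "en",
    "example.com.mx": "es",
    "www.example.com.mx": "es",
}
SUPPORTED = ("en", "es")


def default_language_for_host(host):
    """Map the Host header (optionally with a port) to a default language."""
    hostname = (host or "").split(":")[0].strip().lower()
    return HOST_DEFAULTS.get(hostname, "en")


def language_for_request(headers):
    """Use Accept-Language when present, otherwise the Host default."""
    accept = headers.get("Accept-Language", "").strip()
    if accept:
        # Simplification: look only at the first listed language.
        first = accept.split(",")[0].split(";")[0].strip().lower()
        primary = first.split("-")[0]  # "es-mx" -> "es"
        if primary in SUPPORTED:
            return primary
    return default_language_for_host(headers.get("Host"))


# A crawler sending no Accept-Language header sees one language per domain:
print(language_for_request({"Host": "example.com"}))     # en
print(language_for_request({"Host": "example.com.mx"}))  # es
# A browser preference still applies on either domain:
print(language_for_request({"Host": "example.com.mx",
                            "Accept-Language": "en-us"}))  # en
```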
|The fact that Google uses the technique your client is suggesting on their own site is not a recommendation in itself - Google is one website which doesn't have to worry about how it is indexed in its own index. |
Right, and I more-or-less made that point to them myself, but I really appreciate the backup. [:-)]
|Your first question is: are you targeting the language (Spanish) or the location (Mexico)? |
Actually, it's language, not location, so, in truth, it'll probably be "es-es" instead of "es-mx", which I just used in this thread for the sake of a quick example.
I think, thanks to tedster and encyclo, they'll now end up --at a minimum-- showing the Spanish-language pages in a separate directory instead of sharing the exact same URL as the original English-language pages. Many thanks for the quick and thoughtful responses.
Subdomains are one option if you are targeting language rather than location. Using es.example.com will allow you to keep the same URL structure, again relying on the Host header to determine the default language.
From a usability point of view, you should also consider offering the option to switch language via links on the site, using cookies to set a default language which overrides the Accept-Language choice.
also make sure you implement a sensible and consistent behavior when there are conflicts between the implied language of the requested (sub)domain and the Accept-Language choice.
there's also the unsupported cookies problem.
the typical answer for that is to maintain a session or language choice url parameter.
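One way to keep the behavior consistent is to fix a single precedence order and apply it on every request. The sketch below (Python, illustration only) assumes this order: URL parameter, then cookie, then Accept-Language, then the default implied by the (sub)domain. That ordering is an assumption of the example, not something settled in this thread, and `resolve_language` and its argument names are hypothetical.

```python
SUPPORTED = ("en", "es")  # assumed set of site languages


def resolve_language(param_lang=None, cookie_lang=None,
                     accept_language=None, host_default="en"):
    """Return (language, source) using one fixed precedence order."""
    accept_first = None
    if accept_language:
        # Simplification: only the first listed language is considered.
        first = accept_language.split(",")[0].split(";")[0]
        accept_first = first.strip().lower().split("-")[0]

    # Assumed order: explicit choices by the visitor beat browser settings,
    # and browser settings beat the default implied by the hostname.
    candidates = [
        ("param", param_lang),
        ("cookie", cookie_lang),
        ("accept-language", accept_first),
        ("host", host_default),
    ]
    for source, value in candidates:
        if value in SUPPORTED:
            return value, source
    return "en", "fallback"


# Visitor on the Spanish subdomain whose browser prefers English:
print(resolve_language(accept_language="en-us", host_default="es"))
# ('en', 'accept-language')

# Same visitor after clicking the "Español" link, which set a cookie:
print(resolve_language(cookie_lang="es", accept_language="en-us",
                       host_default="es"))
# ('es', 'cookie')
```

Returning the source alongside the language makes it easier to log or debug which rule decided a given request.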
encyclo, thanks for recommending subdomains and links between the two sites.
phranque, thanks for getting me to think about instances when the visitor's browser doesn't present an accept-language header. Re cookies-vs.-query-string-parameters, my initial inclination is to avoid a parameter, as I'd like to minimize the possibility of folks linking to non-canonical versions of a given page (e.g., instead of 100% of in-links to the homepage going to "...default.aspx", 35% instead go to "...default.aspx?language=1", diluting "default.aspx"'s link popularity).
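If a language parameter did end up in use, the dilution concern could be reduced by collapsing parameterized URLs back to one canonical form, for example as the target of a redirect. A small Python sketch of that normalization, for illustration only; the parameter name `language` comes from the example above, and `canonical_url` is a hypothetical helper.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url, language_param="language"):
    """Return the URL with the language parameter removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() != language_param
    ]
    # All other query parameters are kept in their original order.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


print(canonical_url("http://example.com/default.aspx?language=1"))
# http://example.com/default.aspx
print(canonical_url("http://example.com/default.aspx?id=5&language=1"))
# http://example.com/default.aspx?id=5
```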