
Google News Archive Forum

This 37 message thread spans 2 pages (page 1 of 2).
Multi-lingual pages - language identification
and how Google treats them

 5:08 pm on Oct 29, 2002 (gmt 0)

My indexpage is a multi-lingual (five languages) starting point.

That is, the majority of the bodytext and the title is in English,
but there is also some bodytext in the other languages and amongst others in pulldown menus.

However, if I do a search in a local Google version, with the option: "Pages in that language" only, the indexpage will not turn up. That is Google has labelled it as English and English only.

The question is whether this is reasonable, since e.g. many external Dutch sites or directories link to the index page by default, in many cases with Dutch anchor text, and there is Dutch content on the page.

It looks like the only way out is always going for multiple single-lingual sites, with their respective tld's and thereby also catching the "searches from this country only" option.

Does anyone know how Google classifies the language of a page?

Is it on percentage bodytext? title? majority of language of inbound linked sites?
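For anyone who wants to run the kind of test discussed here, the following is a rough sketch of what a "percentage of body text" signal could look like. It is purely illustrative: nobody in this thread knows how Google classifies language, and the tiny stopword lists and the `language_shares` helper are assumptions made up for this example.

```python
import re
from collections import Counter

# Tiny hypothetical stopword lists, for illustration only.
# A real language classifier would use far more evidence than this.
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in", "that", "for", "with", "this"},
    "nl": {"de", "het", "een", "en", "van", "is", "dat", "voor", "met", "niet"},
}


def language_shares(text):
    """Return each language's share of the stopword hits found in text.

    Returns an empty dict when no stopword from any list occurs.
    """
    # Split into runs of letters (Unicode aware), ignoring digits and underscores.
    words = re.findall(r"[^\W\d_]+", text.lower())
    hits = Counter()
    for word in words:
        for lang, stops in STOPWORDS.items():
            if word in stops:
                hits[lang] += 1
    total = sum(hits.values())
    if total == 0:
        return {}
    return {lang: hits[lang] / total for lang in STOPWORDS}


if __name__ == "__main__":
    page = "The title of this page is in English. Dit is de pagina van het bedrijf."
    print(language_shares(page))
```

Feeding it the body text of a mixed page gives a crude idea of which language dominates, which is a useful baseline before comparing against what the search engine actually reports.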



 5:19 pm on Oct 29, 2002 (gmt 0)

I do not know the exact parameters Google uses to decide on the language of a page or a site. It is obvious, however, that Google settles on only one language. Pages with mixed languages are therefore not an ideal solution.
The TLD is not involved. Google separates results by language first, TLD/hosting(?) second.
What is interesting in this context is your observation that links do not count in this regard. Should be an interesting option...

Does setting a meta lang have any real influence?
What influence has the title?


 5:26 pm on Oct 29, 2002 (gmt 0)


I actually had the meta language of the index page still set to Dutch from way back in the beginning.

I think I'll try a test page with a "no-language title" and 50% English and 50% Dutch body text and see what Google decides.

The point is that certain pages can be usefully multi-lingual, (certainly start-out geographical/language decision, index-like, pages) with good intentions without being confusing for the visitor.

Pity Google is not multilingual ;)


 7:05 pm on Oct 29, 2002 (gmt 0)

i was in a similar situation. i decided in favour of cloaking (i hope i use the word correctly?):

the webserver delivers the correct language page based on the user's language settings (browser). if the user is german he gets the german page. if not he gets the english page.

so the googlebot always gets the english version of www.myserver.com. from there there is a link to www.myserver.com/ger. all other pages have the form www.myserver.com/eng/xxx and www.myserver.com/ger/xxx.

the indexpage has PR6, the /ger version has PR5.

is that the best setup i can achieve or is there a better solution? do i achieve enough "weight" for my german pages?
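The browser-language switch described in this post can be sketched roughly as follows. This is not the poster's actual server code: the function names are made up for illustration, and the rule "pick whichever of German or English the browser ranks higher, else English" is one reasonable reading of the description. A request with no Accept-Language header falls through to the English default, which matches what the post reports for the googlebot.

```python
def parse_accept_language(header):
    """Parse an Accept-Language header into tags, most preferred first.

    Tags with q=0 (explicitly refused) are dropped.
    """
    prefs = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # default weight when no q parameter is given
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed weight as "not wanted"
        # Sort by descending quality, then by original position.
        prefs.append((-quality, position, tag.strip().lower()))
    prefs.sort()
    return [tag for neg_q, _, tag in prefs if neg_q < 0]


def choose_path(header, default="/eng"):
    """Return /ger or /eng depending on the browser's language preferences."""
    for tag in parse_accept_language(header or ""):
        primary = tag.split("-")[0]  # "de-AT" -> "de"
        if primary == "de":
            return "/ger"
        if primary == "en":
            return "/eng"
    return default


if __name__ == "__main__":
    print(choose_path("de-DE,de;q=0.9,en;q=0.8"))
    print(choose_path(None))
```

Whatever the negotiation rule, keeping a plain link between the /eng and /ger versions (as the post does) means neither users nor spiders are locked into the automatic choice.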


 7:32 pm on Oct 29, 2002 (gmt 0)

Hi everyone, you started an interesting discussion about multilingual websites. I have one in English and later added a German site. First I had both languages in one physical web site. I separated it later in two different physical web sites, sharing the same database. Each version has its own IP address and the TLD's are .com and .de. File names and directory structure of the two sites are identical. I am doing pretty well with Google with PR6 for the homepage and PR6 for about 20 more pages and with PR6 for the German homepage and PR5 for the pages on second-tier level. The German site fetches about 50% of the traffic of the .com site. Two separate versions give me a lot of possibilities for SEO I would not have otherwise. For instance I can target the English page for the keyword combination "firstname lastname" and the German twin site for "lastname firstname".

Muesli, I would not force anyone to use the German version. I discovered that my clients may find my German site first, but when it comes to do serious things on my web site like buying, most Germans prefer the English version.


 4:51 am on Oct 30, 2002 (gmt 0)

How are webmasters in Canada doing this?

Do you just go for one language for your index page?

Same goes for other multilingual countries:

Belgium, Luxembourg, Switzerland etc.


 10:28 am on Oct 30, 2002 (gmt 0)

> Muesli, I would not force anyone to use the German version. I discovered that my clients may find my German site first, but when it comes to do serious things on my web site like buying, most Germans prefer the English version.

i don't. on the german version there is a link to myserver.com/eng as well, so users have the choice. anyway my targetgroup are teenagers, they prefer to shop in a language they really control.

 10:32 am on Oct 30, 2002 (gmt 0)

Obviously presenting content in different languages on different sites, however this is done technically, is by far the preferable solution.

Nevertheless, the original question stands: what criteria exactly does Google, and any other engine, use to set the language of a page and of a site?

I will eventually run some experiments and would be very interested in other people's experiences.


 10:46 am on Oct 30, 2002 (gmt 0)

== Do you just go for one language for your index page?

Belgium ==

Here in Belgium we have 3 official languages: Dutch, French and German. Plus English in addition... So 4 languages in total.

When I design a site I always start in the Dutch page (main language) with 3 flag type links in the top most right part of my site.

Most visitors know that a flag means a translation so I don't lose them because they started with the Dutch page...
A boring splash screen is a usability no-no...



 9:45 pm on Oct 30, 2002 (gmt 0)

> Does anyone know how Google classifies the language of a page?
> Is it on percentage bodytext? title? majority of language of inbound linked sites?

A short test with different Google "advanced search" language selections led me to the conclusion that Google primarily seems to follow my
<meta http-equiv="content-language" content="xx">
tag, and consequently doesn't show a multilingual page with both Xx and Yy text, when language "yy" is selected in the "advanced search" but the page's meta tag said "xx".
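To repeat this kind of test across several pages, it helps to check what each page actually declares. Below is a minimal sketch using Python's standard `html.parser`; the class and function names are made up for this example, and it only reads the tag. It says nothing about whether a search engine honours it.

```python
from html.parser import HTMLParser


class ContentLanguageParser(HTMLParser):
    """Collect language codes from <meta http-equiv="content-language"> tags."""

    def __init__(self):
        super().__init__()
        self.languages = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases attribute names; values may be None.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("http-equiv", "").lower() == "content-language":
            content = attr_map.get("content", "")
            # The content value may hold several comma-separated codes.
            self.languages.extend(
                code.strip().lower() for code in content.split(",") if code.strip()
            )


def declared_languages(html):
    """Return the list of language codes declared in the page's meta tags."""
    parser = ContentLanguageParser()
    parser.feed(html)
    parser.close()
    return parser.languages


if __name__ == "__main__":
    sample = '<html><head><meta http-equiv="content-language" content="nl"></head></html>'
    print(declared_languages(sample))
```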



 9:19 am on Oct 31, 2002 (gmt 0)

Vitaplease, Heini

Our page is in 7 languages and when I use the "pages in your language" option our page shows up in the correct language every time.

The index page is English and this does not show when I carry out a foreign language search, but Google does pull up lower level pages which are in the chosen language.

We don't set any language meta tag. In fact the only language we tell the browser about on our home page is Javascript :)


 3:32 am on Nov 1, 2002 (gmt 0)


First post under this new name cause I got kicked out for not using the other one :(

I have a website in 5 languages, but only three of them (English, Dutch and French) have been spidered this time. I updated too late for this month update.

When I do a search in English using let's say google.fr my site shows as second for my main keyword, when i do the same under google.com/en it's nowhere to be found.

How come?

I am Belgian too (French speaking) but I didn't opt for any one language on the index page. I put all five languages on the index of the root (obviously I can't have that many languages in the title tag) and each language homepage is under /en /de /fr /es /nl. Anybody doing like this?

I managed to get three Dmoz listings in a month pointing to /en /nl and /es. Been trying to get into the French version to no avail; the editor must think 3 is enough (the previous company I did SEO for managed to get 7 Dmoz listings).

I didn't use the language meta for this update, just put it now to see if this would improve the situation.

I thought that the reason I am not showing up in English was maybe that my PR was not good enough, but then why am I 2nd in google.fr when typing English keywords?

Any hint is highly welcome


 11:53 am on Nov 2, 2002 (gmt 0)

Welcome back to WebmasterWorld, multilang. Your problem sounds slightly different. I suspect that for some phrases, the English language search gives priority to pages that seem to be hosted in the matching country. I guess that when you go to google.fr your interface language is non-English so this filter isn't applied.

This is an interesting thread, I'll be very interested in heini's and vitaplease's observations as their pages get listed.


 12:37 pm on Nov 2, 2002 (gmt 0)

We support 19 languages and have placed each one into its own domain, using local domains wherever allowed (.at, .be, .dk, .jp, .li, .lt, .lv, .com.pt, .co.ru) and dot-coms everywhere else.

Each site has its own entry in ODP World (this is not just allowed, but actually encouraged by ODP). Inside the sites, each page is linked to the corresponding page in the other 18 languages, using a row of flags across the top. So if someone stumbles upon a page in his/her second or third language, the visitor can easily switch to the home language with just one click. This is an important and customer friendly approach, since many non-English speaking people will simply assume that there is nothing available in their own language and search in English. I am told this is true even of Germans, who speak Europe's biggest language.

All sites rank extremely well in Googles all over the place. In those cases where the search is for a brand name or a phrase that is the same in several languages, we usually have all, or almost all of the sites rank among the top 30. The traffic from Google is so intense, that we could actually do well without any other engines.
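The row of flags linking each page to its counterpart in the other languages could be generated along these lines. This is only a sketch: the domains are placeholders, and it assumes the same path exists on every language site (as in the identical directory structure mentioned earlier in the thread), which the post above does not confirm for this setup.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical language -> host mapping; replace with the real domains.
SITES = {
    "en": "www.example.com",
    "de": "www.example.de",
    "nl": "www.example.nl",
}


def alternate_links(page_url, sites=SITES):
    """Map each other language to the URL of the corresponding page.

    Assumes every language site uses the same path for the same page.
    """
    parts = urlsplit(page_url)
    links = {}
    for lang, host in sites.items():
        if host == parts.netloc:
            continue  # skip the site the visitor is already on
        links[lang] = urlunsplit((parts.scheme, host, parts.path, parts.query, ""))
    return links


if __name__ == "__main__":
    for lang, url in alternate_links("http://www.example.de/products/widget.html").items():
        print(lang, url)
```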


 4:38 pm on Nov 2, 2002 (gmt 0)

> Obviously presenting content in different languages on different sites, however this is done technically, is by far the preferable solution.

what is the second best?

there may be very good reasons not to split a site into different domains per language. in my case:
- licence contract with the content supplier for my shop, which doubles the licence fee per domain
- the german web traffic auditing authority charges a very high fee per domain
- i want to create a strong "brand" using the .com address

is the "cloaking" solution i described earlier the second best solution?


 5:32 pm on Nov 2, 2002 (gmt 0)

> I thought that the reason I am not showing up in English was maybe that my PR was not good enough, but then why am I 2nd in google.fr when typing English keywords?

Hi Multilang. At a guess, I'd say yours is a travel site? Google seems to rank travel sites differently on local Googles. They seem to give less emphasis to incoming links and more emphasis to the words on your page. I guess this is because they think local sites are likely to have less PageRank than big international sites, and they therefore play down the PageRank effects for travel sites (on local Googles) to give local sites a boost.

Rencke - Aren't you worried about the cross-linking penalty? 19 sites all linked to each other sounds high risk to me, even if you aren't doing it to get a pagerank boost. Google might wipe you from the face of the database if they have a bad hair day.

Heini - local domains are not always possible to buy unless you have an operation in that country.


 5:43 pm on Nov 2, 2002 (gmt 0)

I'll be having to do the same with spanish/english and german/english soon so this thread is in the right place and time for me, very interesting.

I was thinking of using the index page to welcome and put links to all languages there (perhaps with flags to indicate)
The 'second page in' would then be set out to accommodate language and relevant SEO. Any thoughts?


 5:56 pm on Nov 2, 2002 (gmt 0)

Hey Vitaplease, looks like we will have to solve the question what params Google uses to define the language of a page in another thread ;)

So: strategies for multilingual sites:

Look at Rencke's message to see the optimum at work. It's way cool and works for users and engines alike. If you're concerned about the interlinking issue you might put links up as redirect urls.

Second best: Do the same thing but use .com TLDs. Easily obtainable for everybody, and up to now there are only a few places/search options where those get filtered out. For most localized searches the language is the decisive factor.

Third best: subdomains. Might not even be all that bad if used correctly and optimized to the fullest. On the plus side you end up with a huge site and many links. Getting links from local directories is still possible. Make sure to get a keyword in local language for the subdomain.

Fourth best: different languages on different pages in one domain.
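To make the four options above concrete, here is what the URL for the same page could look like under each one. All host names are placeholders invented for this sketch, and the bare language code in the subdomain case merely stands in for the local-language keyword recommended above.

```python
# Hypothetical country-code TLD per language, for illustration only.
CCTLD = {"en": "co.uk", "de": "de", "nl": "nl"}


def page_url(strategy, lang, path):
    """Build the URL of one page for one language under a given strategy."""
    if strategy == "local-tld":   # best: one local domain per language
        return f"http://www.example.{CCTLD[lang]}{path}"
    if strategy == "dot-com":     # second best: one .com domain per language
        return f"http://www.example-{lang}.com{path}"
    if strategy == "subdomain":   # third best: one subdomain per language
        return f"http://{lang}.example.com{path}"
    if strategy == "directory":   # fourth best: one directory per language
        return f"http://www.example.com/{lang}{path}"
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    for strategy in ("local-tld", "dot-com", "subdomain", "directory"):
        print(strategy, page_url(strategy, "de", "/products.html"))
```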


 6:13 pm on Nov 2, 2002 (gmt 0)

Let me add a new twist...

I get numerous requests for a link exchange from Non-English sites - (Japanese, Spanish, Russian, German, etc.)

I never know if I should accept their invitations since I do not know if GOOGLE counts PR from non-English sites.

Is it beneficial from the PR point of view to exchange links with the foreign sites (some of them have pr 6 and higher)?

Some people tell me PR is PR , it does not matter whether you get it from an English site or a Russian site.

What do you think?


 6:18 pm on Nov 2, 2002 (gmt 0)

>What do you think?

Take the links.


 6:24 pm on Nov 2, 2002 (gmt 0)

>GOOGLE counts PR from non-English sites
Erm - how do you think non-English sites acquire their PR then? ;)

No, really, a link is a link. The only problem you could have with that would be the themeing aspect. But then nobody knows for sure how important the theme of the linking page is for the weight of the link in ranking. The pure PR factor of course is independent of language issues.

An interesting thing here is this: Getting links from foreign language sites, with your keywords in that foreign language can bring Google to rank you for those foreign language keywords even if they are nowhere on your site!


 6:45 pm on Nov 2, 2002 (gmt 0)

I'm not worried about themes. They all fit NICELY with my site. :)


 7:49 pm on Nov 2, 2002 (gmt 0)

> The only problem you could have with that would be the themeing aspect


would it not be nice if Google used something similar to the translation programs around that have an identical unique number id for every meaning of the word in different languages.


that would make intercultural Googlebombing possible..

Well you cannot expect every thing from Google, yet :)

> I never know if I should accept their invitations since I do not know if GOOGLE counts PR from non-English sites.

Gregory, sticky me the examples if you're in doubt ;)


 9:14 pm on Nov 2, 2002 (gmt 0)

Vitaplease, I have no idea really what the linguists are working on at Google, or at Fast (BTW: they still have that Albert connection, wonder what's coming out of that one day).
But I certainly expect them to take steps in that direction. The automatic spelling corrections are surely just the tip of what's being developed behind the scenes.
Multilingual sites should make for perfect objects for studies.


 9:36 pm on Nov 2, 2002 (gmt 0)

Albert connection?


 10:17 pm on Nov 2, 2002 (gmt 0)

Albert [albert.com]

"a multi-lingual, natural language capable, intelligent search engine" [arnoldit.com]

Fast PR [web.archive.org]

I believe Albert technology has to a degree gone into Fast Datasearch, their highly successful corporate search solutions.

Anyway - the question is what possible rewards would search engines have from exploring the worldwide multilingual web and integrating all content into one logic structure? It's a great goal for scientists - but I don't see the money in it.


 10:24 pm on Nov 2, 2002 (gmt 0)

thanks Heini,

nice round-up.

I printed the white-paper for some good bedside reading.


 10:25 pm on Nov 2, 2002 (gmt 0)

Hi SlyOldDog

My site is not about travel it's about translation.
Apparently since Google transferred the update from www3 to www it got stable and the keyword shows at the same position be it under .fr or .com

I also got a PR 5 directly from the 1st update.

In my experience high PR is a must for English sites where competition is fierce. I am promoting a bunch of sites in Spanish, for instance, and easily reach the top 3 for any keyword I want. It's fairly easy to optimize for certain languages.

IMHO having all the languages under the same domain helps for the general PR of the whole site since you can get incoming links from a lot more sites.

As to the use of flags to indicate there are other languages available, I don't quite like this because a flag shows geographical limitation. For English you can put a British and/or an American flag, but what about other countries? What do you do with Spanish, which is spoken in many countries? Showing a flag of Spain is not OK since in my opinion it doesn't take into account other communities. Same with the flag of France: French is also spoken in Switzerland, Belgium, Quebec and many African countries.

I personally always opt for English, Français, Español, Deutsch as links ... besides it's a keyword in itself :)


 2:50 pm on Nov 3, 2002 (gmt 0)

> Aren't you worried about the cross-linking penalty?

Not in the least. The approach is very user friendly and the multiple entries in Google SERPs for search phrases common to more than one language actually produces a better search result from Google's point of view, than the opposite. So they'd be silly to penalize this kind of setup as long as only a small minority of their users bother to set language preferences, filtering away languages they don't understand. And I don't think they are silly. If I am wrong about that, I am fully prepared to argue my case with Google support.

One day, in this decade surely, they will figure out a way to guess what languages are likely to be preferred by the user and in which order, and rank accordingly.

Another reason for the one language per site solution: There are numerous language dependent local directories, some of which have a strong standing in their communities. Case in point: AllesKlar.de, never to be forgotten if you market to the 100 million Europeans who have German as their mother tongue.


 7:56 am on Nov 4, 2002 (gmt 0)

I'm still - alas - confused. Are, for instance, pages on a single domain which are identical except for being in different languages, seen by Google as being duplicate content?

If they're not, then what's the downside of having my different language versions on a single domain with www.mysite.com/eng/ and www.mysite.com/fr/ etc
