Google and Language search

         

yizmo

9:45 am on Oct 29, 2003 (gmt 0)

10+ Year Member



If I surf to Google.nl, there is a possibility to choose between "search the web" and "search in Dutch pages".

How does Google know whether a page is Dutch or has Dutch-language content on it? I can't seem to find any pattern in it.

Google doesn't only search in .nl pages; it also shows .com, .net, .org and .be pages with Dutch content.

These websites do not have <META NAME="Language" CONTENT="NL"> in their source. This metatag is worthless, like the revisit-after tag.

Another question: if I surf to google.co.uk, there is a possibility to search in pages from the UK. But how does Google know whether a page is from the UK, given that there are also .com and .net results? For example, if you search for "keyword" you could get a result such as domain.com. If you look it up at NetworkSolutions.com, the WHOIS gives a company from the UK.

Does Google also have a WHOIS field in its database?

[edited by: vitaplease at 10:06 am (utc) on Oct. 29, 2003]
[edit reason] removed the specifics [/edit]

kaled

11:22 am on Oct 29, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The IP address can be used to determine the country. Indeed, a website offering geolocation software traced me to within about 1 mile.

Google may keep country information based on the server IP; however, this is simply a guess.

Kaled.

heini

11:28 am on Oct 29, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



yizmo, welcome to WebmasterWorld!

There are two different matters: filtering for pages in a specific language, and filtering for pages from a country.
The first is done with automatic language detection. TLDs have nothing to do with it.
The second is based on the TLD and the hosting IP.
For both filters, Google does not take meta info into account.
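
To make "automatic language detection" a bit more concrete, here is a toy sketch in Python that guesses a language by counting common function words. Nobody in this thread knows how Google actually does it, so treat this purely as an illustration. The function name guess_language and the tiny word lists are made up for the example, and a real detector would need far more data.

```python
import re

# Tiny hand-picked lists of very common words per language.
# These lists are illustrative assumptions, not a real language model.
STOPWORDS = {
    "nl": {"de", "het", "een", "en", "van", "is", "niet", "dat", "op", "voor"},
    "en": {"the", "a", "and", "of", "is", "not", "that", "on", "for", "to"},
    "de": {"der", "die", "das", "und", "ist", "nicht", "ein", "von", "für", "mit"},
}


def guess_language(text):
    """Return the language code whose common words occur most often, or None."""
    # Split into lowercase runs of letters (digits and punctuation are dropped).
    words = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        lang: sum(1 for word in words if word in common)
        for lang, common in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    # No hits at all means we have no evidence for any language.
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(guess_language("Dit is een pagina over de geschiedenis van het land"))
    print(guess_language("The history of the country is not that long"))
```

Closely related languages share many of these short words, which is one plausible reason a detector along these lines could confuse them.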

Ivana

11:31 am on Oct 29, 2003 (gmt 0)

10+ Year Member



Well, when I search for pages only in Danish, I also get results in Norwegian. Although the languages are very similar, there doesn't seem to be much point in it.

I would also be very curious to know how the language is determined. I don't think that it is decided by IP though, because sites aren't necessarily hosted in the country where the particular language belongs. Also, English is the mother tongue in more than one country.

Ivana

11:32 am on Oct 29, 2003 (gmt 0)

10+ Year Member



heini, I didn't see your post coming.

What is TLD?

yizmo

12:06 pm on Oct 29, 2003 (gmt 0)

10+ Year Member



TLD stands for "Top Level Domain"

For a list of TLDs, see
[iana.org...]

heini

12:08 pm on Oct 29, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ivana, it's automatic language detection. The issues with Danish/Norwegian are not new; obviously Google's linguistic department has some problems there.
So for all languages apart from the major ones I recommend adding the appropriate meta tag, though I can't say whether it 100% prevents Google from messing it up.
TLD = Top Level Domain, i.e. .com, .net, .org, .dk, .es etc.

kaled

2:58 pm on Oct 29, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I just ran a few quick tests on my site - a .net domain hosted in the uk.

A few searches on google.co.uk indicate that there is a large bias towards my website. There are only two possible clues to location that could explain this: a postal address on the contacts page and the IP address of the server.

It seems likely (to me) that Google does use server location as a significant part of its language/location filtering.
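
For what it's worth, the "TLD first, then hosting IP" idea discussed above can be sketched in a few lines of Python. This is only a guess at the general shape, not a description of what Google does. The function guess_country, the TLD table, and the IP ranges are all hypothetical; the ranges are reserved documentation addresses (RFC 5737), not real hosting blocks.

```python
import ipaddress
from urllib.parse import urlsplit

# Country-code TLDs mapped to countries (small illustrative subset).
CCTLD_TO_COUNTRY = {"nl": "NL", "uk": "GB", "be": "BE", "dk": "DK", "es": "ES"}

# Hypothetical IP-to-country table. These are documentation-only ranges,
# standing in for a real geolocation database.
IP_RANGES = [
    (ipaddress.ip_network("192.0.2.0/24"), "GB"),
    (ipaddress.ip_network("198.51.100.0/24"), "NL"),
]


def guess_country(url, server_ip):
    """Guess a site's country from its TLD, falling back to the server IP."""
    host = urlsplit(url).hostname or ""
    tld = host.rsplit(".", 1)[-1].lower()
    if tld in CCTLD_TO_COUNTRY:
        return CCTLD_TO_COUNTRY[tld]
    # Generic TLD (.com, .net, ...): look at where the server is hosted.
    addr = ipaddress.ip_address(server_ip)
    for network, country in IP_RANGES:
        if addr in network:
            return country
    return None


if __name__ == "__main__":
    print(guess_country("http://www.example.co.uk/", "198.51.100.7"))
    print(guess_country("http://example.net/", "192.0.2.10"))
```

Under these assumptions a .net domain on a UK-hosted server would be treated as a UK site, which would be consistent with the test described above.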

Kaled.

Hagstrom

2:32 pm on Oct 30, 2003 (gmt 0)

10+ Year Member



"Google doesn't only search in .nl pages, it also shows .com, .net, .org and .be pages with Dutch content."

You'll also find pages with Middle Low German content. Automatic language detection is far from perfect ;)

willamowius

11:03 pm on Oct 30, 2003 (gmt 0)

10+ Year Member



The trouble is that Googlebot ignores <meta http-equiv="Content-Language" content="XY">.

Is there any other hint one can give to Googlebot to get it right? I have dozens of pages where the language detection went astray...

Hagstrom

1:56 pm on Oct 31, 2003 (gmt 0)

10+ Year Member



I doubt that. I use <html lang="en"> but this is also ignored by Google.

I don't think Google trusts us enough ;)
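
If you want to check which language hints your own pages actually declare (whether or not Google reads them), a small Python sketch like the following can pull them out. It uses the standard library HTMLParser; the class name LanguageHints and the function declared_languages are just names chosen for this example.

```python
from html.parser import HTMLParser


class LanguageHints(HTMLParser):
    """Collect language declarations from <html lang> and language meta tags."""

    def __init__(self):
        super().__init__()
        self.hints = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names; values keep their case.
        attrs = dict(attrs)
        if tag == "html" and attrs.get("lang"):
            self.hints["html_lang"] = attrs["lang"]
        elif tag == "meta":
            name = (attrs.get("http-equiv") or attrs.get("name") or "").lower()
            if name in ("content-language", "language") and attrs.get("content"):
                self.hints[name] = attrs["content"]


def declared_languages(html):
    """Return a dict of the language hints found in an HTML document."""
    parser = LanguageHints()
    parser.feed(html)
    return parser.hints


if __name__ == "__main__":
    page = (
        '<html lang="en"><head>'
        '<META NAME="Language" CONTENT="NL">'
        '<meta http-equiv="Content-Language" content="da">'
        "</head><body></body></html>"
    )
    print(declared_languages(page))
```

This only reports what the page claims about itself; it says nothing about what a search engine will conclude from the text.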

2_much

5:05 pm on Oct 31, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Are there figures posted anywhere about how many people have the translated versions of Google as their default search as opposed to using google.com?

heini

5:26 pm on Oct 31, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>figures
Not that I'm aware of. I think this would differ hugely from country to country and language to language.

There are linguistic factors coming into play as well as cultural factors.

In this discussion [webmasterworld.com] we were talking about what I see as a growing trend, at least in Germany, towards local searching, i.e. searches restricted to the local language or even to results from the local country.

Google would know...