Forum Moderators: Robert Charlton & goodroi


Multi-regional sites

         

doc_z

1:45 pm on Jul 13, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



One of Google's main problems is determining the region/country of a website. Google's first criterion is the top-level domain. For country-code top-level domains (ccTLDs), the assignment works well. In all other cases, the results range from poor to catastrophic and are sometimes completely wrong.

Multi-regional websites that are managed on a .com domain, for example, can be a nightmare.

I see domains that use the hreflang tag and html lang correctly and specify a country, but Google still assigns the country incorrectly. Conversely, I see domains that have no country targeting at all, but are assigned to a country by Google. Just one example: if you search for pages from Switzerland on wikipedia.org or wiktionary.org, some pages are displayed only because they are on a "ch." subdomain! These pages have no connection to Switzerland and "ch." does not stand for Switzerland. In other cases, Google only categorises them because "/CH/" appears in the URL.
[google.de...]

The whole thing is not only annoying or tedious, it also ties up resources and costs money.

Communication with Google on this topic is hopeless.

Google could easily remedy this:

1. publish a document that clearly and unambiguously describes the criteria according to which a page is assigned to a country. This could look like this: 1) country-code top-level domain 2) html lang 3) hreflang tag etc.

2. provide an official test tool where you can enter a URL and then see which country Google has assigned to the page and based on which criteria.

This is not complicated or a lot of work.

Alternatively, Google could reactivate the old Search Console tool, where the website operator can make a manual assignment. (https://support.google.com/webmasters/answer/12474899?hl=en)

However, Google does neither one nor the other. Google obviously doesn't care about the collateral damage either… Instead, instructions are published that do not work even if all recommendations are followed: [developers.google.com...]

"Ceterum censeo Google esse delendam"

not2easy

2:02 pm on Jul 13, 2024 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



These pages have no connection to Switzerland and "ch." does not stand for Switzerland.

.ch is the country code TLD designation for Switzerland, read it here: [iana.org...]

doc_z

3:45 pm on Jul 13, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



.ch is the country code TLD designation for Switzerland, read it here

Of course .ch is the TLD of Switzerland! But in these examples "ch." is not a TLD, it's a subdomain.

A subdomain "ch." does not mean that the subdomain has a connection to Switzerland. Specifically, the examples of ch.wikipedia.org and ch.wiktionary.org are about the Chamoru language.

Google simply makes the mistake of treating the "ch." subdomain as if it were a TLD and assigning it to Switzerland. This is completely wrong.
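The difference between a TLD and a subdomain label can be sketched in a few lines of Python. This is only an illustration of the distinction discussed above, not a description of how Google works: the function names and the tiny `CCTLDS` set are assumptions made for the example (the full list of ccTLDs is maintained by IANA), and `example.ch` is a placeholder domain.

```python
from typing import Optional, Set
from urllib.parse import urlsplit


def split_host(url: str) -> dict:
    """Split a URL's hostname into its TLD and its leftmost label."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    return {
        "host": host,
        "tld": labels[-1],            # rightmost label, e.g. "org"
        "leftmost_label": labels[0],  # e.g. "ch" in ch.wikipedia.org
    }


def cctld_country_hint(url: str, cctlds: Set[str]) -> Optional[str]:
    """Return the TLD only if it is a known ccTLD; subdomain labels are ignored."""
    tld = split_host(url)["tld"]
    return tld if tld in cctlds else None


# Hypothetical, deliberately tiny set used only for this example.
CCTLDS = {"ch", "de", "at"}

print(cctld_country_hint("https://ch.wikipedia.org/", CCTLDS))  # TLD is "org"
print(cctld_country_hint("https://example.ch/", CCTLDS))        # TLD is "ch"
```

Only the rightmost label is compared against the ccTLD list, so a "ch." subdomain on a .org domain yields no country hint at all.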

That was just an example. I have seen a lot of incorrect country assignments.

doc_z

8:27 am on Jul 17, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Google does not even manage to correctly assign countries to its own pages. You can simply search Google for “site:google.com” and restrict the results to Austria or Switzerland.

engine

9:02 am on Jul 17, 2024 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



It sounds to me as if the allocation G is making is based not on country but on language, with the general assumption in those examples being German.
Yep, it's not great.

Whitey

9:07 am on Jul 17, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I’m not sure how to trust the site: command. Several Googlers have said over the years not to rely on it, but old habits prevail for me.

A search on a language folder, site:ourwebsite.com/el/ (Greek), surfaced El Salvador results among the legitimate URL results.


[edited by: not2easy at 11:49 am (utc) on Jul 17, 2024]
[edit reason] de-smileyed [/edit]

doc_z

8:38 pm on Jul 18, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It sounds to me as if the allocation G is making is based not on country but on language, with the general assumption in those examples being German.

And if you have different versions of the same language, such as German for Germany, Austria and Switzerland, then you will have massive problems. Many users are sent to the wrong version, which increases bounce rates, for example. Or Google filters out some versions entirely, which you can then see very clearly in the GSC (e.g. ‘Alternate page with proper canonical tag’).

Currently, I can only warn everyone against creating country versions with the same language on a single domain - even if it is officially supported by Google. It can go well, but it can also end in absolute chaos without you having done anything wrong and without being able to correct it yourself.

I’m not sure how to trust the site: command.

If you don't trust the site command, you can also test it with a search for a phrase in quotes - the result won't get any better.

lucy24

12:34 am on Jul 19, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



This is the same Google, isn’t it, that has for years prided itself on ignoring the "lang" tag wherever it appears. So what can you expect.

doc_z

11:14 am on Jul 19, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Funnily enough, Google also knows that the current system does not work. (Gary Illyes: "hreflang is annoying. I don't disagree. I'm still very open to coming up with something less annoying, but it needs to work for small sites and mammoths as well, while delivering at least the same amount of information.") Nevertheless, they ignore all suggestions.

Whitey

1:04 am on Jul 20, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Nevertheless, they ignore all suggestions

@doc_z What are the best suggestions that you have seen or would want that can be proposed?

This is the same Google, isn’t it, that has for years prided itself on ignoring the "lang" tag wherever it appears. So what can you expect.

It would be good to get some feedback from Google's search liaison on how they are dealing, or not dealing, with this age-old problem, so a shout out to @rustybrick over at SEORoundtable (if you're watching and think that you can extract some better feedback from G to help).

We're going through a multi-language upgrade at the moment and, as part of that process, matching the indexing of multiple languages to products in countries, e.g. Switzerland = language DE / IT / FR / EN etc., and applying hreflang tags. In the SERPs I'm seeing a real mix-up of results, as you say, but I'm not sure how much the IP of the user plays into it. Since I'm not an SEO, I have to rely on feedback from others, observations and our own SEOs. G's feedback would be appreciated.

[edited by: engine at 8:25 am (utc) on Jul 20, 2024]
[edit reason] Fixed typo [/edit]

Whitey

4:54 am on Jul 20, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@doc_z What are the best suggestions that you have seen or would want that can be proposed?

I guess your OP covers this after some careful thought.

doc_z

6:43 am on Jul 20, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



What are the best suggestions that you have seen or would want that can be proposed?


The simplest solution is for Google to reactivate International Targeting in the Search Console.

The technology exists, as does the documentation. In addition, many people are still familiar with the system.

It is important that the settings made there have priority. This means that the criteria for determining the region have a clear order of priority:
1) ccTLD
2) International Targeting in the Search Console
3) Everything else

The settings made by the webmaster in the Search Console (country, language) must always take precedence over the values determined by Google. Otherwise it would be pointless.

A folder such as /de-at/ could simply be manually assigned to a country and a language.
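The proposed order of priority can be written down as a small sketch. This is purely an illustration of the suggestion above and not of anything Google actually implements; the function name and its three parameters are hypothetical.

```python
from typing import Optional


def resolve_country(cctld_country: Optional[str],
                    manual_country: Optional[str],
                    inferred_country: Optional[str]) -> Optional[str]:
    """Return the first available value in the proposed priority order:
    1) ccTLD, 2) manual Search Console setting, 3) everything else."""
    for candidate in (cctld_country, manual_country, inferred_country):
        if candidate:
            return candidate
    return None


# A /de-at/ folder on a gTLD: no ccTLD signal, manually assigned to AT,
# while the automatically inferred value is DE.
print(resolve_country(None, "AT", "DE"))
```

With a fixed order like this, the manual setting for a folder always wins over whatever was inferred automatically, which is the whole point of the proposal.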

Problems with duplicate content due to the same content ("Duplicate, Google chose different canonical than user") should no longer occur. Even if the hreflang link should not work for some pages, no major problems or collateral damage (as is currently the case) would occur (with manual assignment). This would also solve the problem if Google ignores the lang tag.

doc_z

1:11 pm on Jul 20, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Google will continue to support and use hreflang tags on your pages. However, the ability to target search results to specific countries using Search Console country targeting was determined to have little value for the ecosystem, and is no longer supported.

[support.google.com...]

We continue to support hreflang and our recommendations for managing multilingual and multiregional sites still stand.

[x.com...]

If you replace a functioning system with a non-functioning system, then you should at least recognise it at some point and undo the mistake.

Although I am almost certain that this will not happen.

The hreflang system also has conceptual weaknesses: it assumes that Google has crawled and indexed all versions of a page. This is certainly not a problem for small sites with few pages, but it is for large sites with hundreds of thousands of pages per language version and dozens of different versions. With the International Targeting system, however, it is immediately clear which country and which language a page belongs to.
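Why incomplete crawling matters can be shown with a minimal sketch, assuming (as Google's documentation on localized versions describes) that hreflang annotations only count when the pages point back to each other. The function name, the data structure and the example.com URLs are hypothetical and only serve the illustration.

```python
from typing import Dict, List, Set, Tuple


def missing_return_links(alternates: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """alternates maps each crawled page URL to the URLs it lists as hreflang
    alternates. Returns (source, target) pairs where the target is not known
    to link back to the source, e.g. because it has not been crawled yet."""
    missing = []
    for source, targets in alternates.items():
        for target in sorted(targets):
            if target == source:
                continue  # a self-reference needs no return link
            if source not in alternates.get(target, set()):
                missing.append((source, target))
    return missing


ALL = {
    "https://example.com/de-de/",
    "https://example.com/de-at/",
    "https://example.com/de-ch/",
}
# Only two of the three versions have been crawled so far.
crawled = {
    "https://example.com/de-de/": set(ALL),
    "https://example.com/de-at/": set(ALL),
}

for source, target in missing_return_links(crawled):
    print(f"{source} -> {target}: return link not confirmed")
```

Until the third version is crawled, every annotation pointing to it remains unconfirmed. With dozens of versions and hundreds of thousands of pages each, the number of such open pairs grows accordingly.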

doc_z

9:05 am on Jul 21, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



To show all the madness, I analysed Google's page on multi-regional and multilingual websites with regard to hreflang tags.
[developers.google.com...]

1) Google uses the same URL for ‘en’ and ‘x-default’. According to Google's documentation, it is not clear whether the same URL can be used several times or whether a separate URL must always be used even for ‘x-default’.
<link rel="alternate" hreflang="en" href="https://developers.google.com/search/docs/specialty/international/managing-multi-regional-sites" />
<link rel="alternate" hreflang="x-default" href="https://developers.google.com/search/docs/specialty/international/managing-multi-regional-sites" />


2) Google uses the region code 419 for ‘Latin America’. This is a UN M.49 numeric area code, not an ISO 3166-1 alpha-2 code, which is the region format named in Google's own hreflang documentation.
<link rel="alternate" hreflang="es-419" href="https://developers.google.com/search/docs/specialty/international/managing-multi-regional-sites?hl=es-419" />


3) Other language versions such as ‘it’ contain errors: 21 cross-links appear there instead of 20, because there are two links to ‘en’ with different URLs. The code is as follows:
<link rel="alternate" hreflang="en" href="https://developers.google.com/search/docs/specialty/international/managing-multi-regional-sites" />

<link rel="alternate machine-translated-from" hreflang="en" href="https://developers.google.com/search/docs/specialty/international/managing-multi-regional-sites?hl=en">


What's more, the whole system doesn't work as soon as there's even one hitch. Furthermore, Google does not have an official test tool, nor does the Search Console provide helpful information about errors found.
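In the absence of an official test tool, checks like the ones above can be scripted. The following is a minimal sketch using only Python's standard library; the class and function names are assumptions made for the example, and the rules are a simplification. It counts every link element with an hreflang attribute, including the one with rel="alternate machine-translated-from", which mirrors the counting above but is not necessarily how Google evaluates it.

```python
import re
from html.parser import HTMLParser


class HreflangCollector(HTMLParser):
    """Collect (hreflang, href) pairs from <link> elements."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and attributes.get("hreflang") and attributes.get("href"):
            self.links.append((attributes["hreflang"], attributes["href"]))


def audit(html: str) -> list:
    """Return human-readable findings for a page's hreflang annotations."""
    collector = HreflangCollector()
    collector.feed(html)
    findings = []
    urls_by_lang = {}
    for lang, href in collector.links:
        urls_by_lang.setdefault(lang.lower(), set()).add(href)
        if lang.lower() == "x-default":
            continue
        # Simplified rule: subtags after the language must be a two-letter
        # region (e.g. "GB") or a four-letter script (e.g. "Hans").
        for sub in lang.split("-")[1:]:
            if not re.fullmatch(r"[A-Za-z]{2}|[A-Za-z]{4}", sub):
                findings.append(
                    f"{lang}: subtag {sub!r} is neither a two-letter region "
                    "nor a four-letter script"
                )
    for lang, urls in urls_by_lang.items():
        if len(urls) > 1:
            findings.append(f"{lang}: {len(urls)} different URLs")
    return findings


BASE = ("https://developers.google.com/search/docs/specialty/international/"
        "managing-multi-regional-sites")
sample = f'''
<link rel="alternate" hreflang="en" href="{BASE}" />
<link rel="alternate machine-translated-from" hreflang="en" href="{BASE}?hl=en">
<link rel="alternate" hreflang="x-default" href="{BASE}" />
<link rel="alternate" hreflang="es-419" href="{BASE}?hl=es-419" />
'''

for finding in audit(sample):
    print(finding)
```

Run against the tags quoted above, the sketch reports the numeric region in es-419 and the two different URLs for ‘en’. It does not judge whether sharing one URL between ‘en’ and ‘x-default’ is allowed.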

It may not be a problem for Google if its own site has problems with the hreflang tag, but for other sites the correct recognition of language and country is existential and the consequences are dramatic.

engine

10:34 am on Jul 21, 2024 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I see many, many sites using the simple interstitial showing country flag. It puts the onus on the user to click the appropriate country flag.
As a user, I don't mind that approach.

Whitey

10:57 am on Jul 21, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



As a user, I don't mind that approach.

Helpful enough, but if you are in a minor country speaking EN (67 countries), ES (21 countries) etc etc things can get tricky. I suppose it's possible to display a flag (or group of multiple flags) according to the user's IP, but that's not perfect and tbh I haven't thought that through enough.

I wonder if translating content will become irrelevant in the future with browser based translation improvements across devices and search platforms. Any thoughts?

lucy24

4:49 pm on Jul 21, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



if you are in a minor country speaking EN (67 countries), ES (21 countries) etc etc things can get tricky
I think most residents of “minor countries” would recognize the flags of their languages’ originating countries, given the reason for the language. Even where the former colony is now bigger than the old country (looking at you, Brazil).

doc_z

2:52 pm on Jul 23, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Gary Illyes:
[...] now suddenly a bunch of unrelated reports coming in is kinda sus. hope I'm wrong cos if something is broken on our end, it's a nightmare to fix (site migrations are particularly hard to deal with thanks to spammers)

Let me summarise: Google is abolishing a functioning system and replacing it with a new system that has numerous problems. All advice is ignored, especially the suggestion repeatedly made by many people to reactivate the old system. Google is damaging entire websites and even destroying some of them. The nightmare, of course, is not that Google breaks websites and offers no solution or communication, but the nightmare is that Google has to fix the faulty system. No words.

Whitey

6:09 pm on Jul 23, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@doc_z can you link to the source of the above quote so I can better understand the context? I couldn’t find it with a cut-and-paste search in G.

doc_z

7:05 pm on Jul 23, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You can read all about it on LinkedIn: [linkedin.com...]

lucy24

8:24 pm on Jul 23, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



:: idly wondering how many more decades must elapse before the present site recognizes the existence of UTF-8 ::

doc_z

6:48 am on Jul 24, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



[youtube.com...]

In my opinion the basic problem is that hreflang is NOT a directive for Google, but a signal, and Google also uses other signals to determine country/language. However, they do not say what those other signals are or in which order they are applied.

This means that if the assignment to languages & countries does not work, there is no indication as to whether it is due to the implementation of the hreflang or to other signals and certainly not to which signals.

If Google saw hreflang as a directive and provided an official test tool, that would be a big step forward. (One can still dream.)

The only option that is currently safe is to use country-code top-level domains. If you have just switched from country-code top-level domains, you are lost...

doc_z

9:46 am on Jul 26, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Incidentally, it is particularly critical that Google only takes the hreflang as a signal, but at the same time also evaluates the server location as a signal. In the document “Tell Google about localized versions of your page”, details such as https://example.com/en-gb, https://example.com/en-us, https://example.com/en-au are explicitly mentioned as examples for the use of the hreflang tag. However, the choice of subdirectories means that it is always the same server location. This would not be critical if the hreflang was regarded as a directive for gTLD domains or if the server location was completely ignored when using the hreflang tags. Of course, this is exactly what Google does not do - after all, Google knows better than the website operator what is correct. The result: duplicate content and the removal of versions as you can see in the Search Console ("Duplicate, Google chose different canonical than user").

[developers.google.com...]
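For the subdirectory setup named in that document, the annotations could be generated roughly as follows. This is a minimal sketch: the function name is hypothetical, and it assumes that every version carries the same full set of tags, including a reference to itself.

```python
from typing import Dict, Optional


def hreflang_tags(versions: Dict[str, str], x_default: Optional[str] = None) -> str:
    """Build the <link> set that every listed version would carry in its <head>."""
    lines = [
        f'<link rel="alternate" hreflang="{code}" href="{url}" />'
        for code, url in versions.items()
    ]
    if x_default:
        lines.append(
            f'<link rel="alternate" hreflang="x-default" href="{x_default}" />'
        )
    return "\n".join(lines)


VERSIONS = {
    "en-gb": "https://example.com/en-gb",
    "en-us": "https://example.com/en-us",
    "en-au": "https://example.com/en-au",
}

tags = hreflang_tags(VERSIONS)
print(tags)
```

All three URLs share one host, so the markup itself is the only place where the country is stated. Any signal derived from the server location is necessarily identical for all three versions.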

And in 2024, using server location as a factor for website targeting is also a joke.