How to declare content language

Is it en-UK or en-GB?

         

lbobke

10:05 am on Nov 21, 2004 (gmt 0)

10+ Year Member



I'm trying to write the English version of my website in British English (I'm from Germany myself) and would like to declare this correctly in the HTML-tag.

Now, I have come across two versions of this tag:
<html lang="en-GB"> and
<html lang="en-UK">
obviously referring to "Great Britain" and the "United Kingdom".

Is either of the above versions incorrect? Which one will be better interpreted by browsers and search engines?
Is it advisable to also use one of the following meta tags?
<meta http-equiv="Content-Language" content="en-gb">
<meta name="Language" content="en-UK">

My sites are hosted in Germany and I would like the correct version to come up when someone is restricting a search to German, English or Spanish content.

Laurenz

encyclo

12:45 pm on Nov 21, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



British English is en-GB, and the best way is to add a lang attribute to the html tag or send the language within the HTTP header.

However, unless you are using several different versions of English (e.g. US English, Canadian English), it is much better to just use the two-letter language code en. This is because, if you use content negotiation in the future, a page declared as en-GB would not qualify for someone who has, say, en-US as their preferred default, and they may end up with another language showing first.

So <html lang="en"> is the way to go.
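
To make the mismatch concrete, here is a minimal Python sketch. It assumes the server resolves a visitor's preference by progressively dropping trailing subtags (a simplified version of the "lookup" scheme described in RFC 4647); real servers differ in the details. The function name lookup and the tag lists are invented for illustration.

```python
def lookup(preferred, available):
    """Simplified lookup: try the preferred tag, then drop trailing subtags.

    Assumption: this mirrors only the basic truncation idea of RFC 4647
    lookup and ignores its special cases.
    """
    # Map lower-cased tags back to their original spelling (tags are case-insensitive).
    tags = {tag.lower(): tag for tag in available}
    parts = preferred.lower().split("-")
    while parts:
        candidate = "-".join(parts)
        if candidate in tags:
            return tags[candidate]
        parts.pop()  # en-us -> en
    return None


# A visitor who prefers en-US, and a site offering German plus one English variant.
print(lookup("en-US", ["de", "en-GB"]))  # None: en-GB does not qualify
print(lookup("en-US", ["de", "en"]))     # en
```

With only en-GB on offer, the en-US visitor gets no English match at all; with plain en, the truncated preference matches.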

Mr Bo Jangles

12:57 pm on Nov 21, 2004 (gmt 0)

10+ Year Member



encyclo, supplementary question if I may.

My web site is primarily English, but has translations in 6 other languages, as subdirectories, e.g. the German is located at:
www.acmewidget.com/deutschland/index.htm

In the html for the translated pages, I have (again using the German as an example):
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html lang="de">
<head>
<meta http-equiv="Content-Language" content="DE">

I realise the lang="de" and the meta all seem to do the same thing, but I haven't seen anything that suggested it was a 'no-no' to include both, so I did. BUT, we seem to have a THIRD (and incorrect) language reference in the very top line re the DocType - so, should this be DE as well? And that now makes three language references!

Any advice?

bill

1:01 pm on Nov 21, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The DocType language has nothing to do with the language of the page, so it's not incorrect as you state it.

Mr Bo Jangles

1:07 pm on Nov 21, 2004 (gmt 0)

10+ Year Member



So Bill, I *think* I've seen a DocType page that had 'fr' instead of 'en' - so what is the import of the designation? Is it not really of any/much import?

bill

1:14 pm on Nov 21, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



That's the language that the DTD was written in. I'm sure you could write a custom one, but I've never dealt with any that weren't in English. I use the same DTD on Japanese and Chinese sites, so it really won't make a difference in your case either.

The DTD language has nothing to do with the page's charset encoding or language meta tag.

lbobke

8:26 pm on Nov 21, 2004 (gmt 0)

10+ Year Member



Bill,

Sorry, but I have to disagree. Before coming here, I had a look at w3.org. This is how they see it:

8.1 Specifying the language of content: the lang attribute
Attribute definitions
lang = language-code [CI]
This attribute specifies the base language of an element's attribute values and text content. The default value of this attribute is unknown.
...
In this example, the primary language of the document is French ("fr"). One paragraph is declared to be in Spanish ("es"), after which the primary language returns to French. The following paragraph includes an embedded Japanese ("ja") phrase, after which the primary language returns to French.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<HTML lang="fr">
<HEAD>
<TITLE>Un document multilingue</TITLE>
</HEAD>
<BODY>
...Interpreted as French...
<P lang="es">...Interpreted as Spanish...
<P>...Interpreted as French again...
<P>...French text interrupted by<EM lang="ja">some
Japanese</EM>French begins here again...
</BODY>
</HTML>

http://www.w3.org/TR/REC-html40/struct/dirlang.html

So, as I understand it, the language specified should be the language the main part of the document is in.

The only question is: how much precision is needed, and what is the correct way to tell a browser/spider that a page was written in British English?

Laurenz

bull

10:19 pm on Nov 21, 2004 (gmt 0)

10+ Year Member



not new.

[webmasterworld.com...]

[edited by: bull at 10:20 pm (utc) on Nov. 21, 2004]

encyclo

10:19 pm on Nov 21, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



lbobke, bill is quite right, but I think you misunderstood the point he was making. From your example:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<HTML lang="fr">

The "EN" in the doctype refers to the language of the DTD, not the language of the document. The "fr" refers to the document's language.

Mr Bo Jangles: your example looks fine: it is rather redundant to specify the language in a meta element when already specified on the html element, but it does no harm.

tedster

11:17 pm on Nov 21, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's also a smart idea to declare the content-language (and character set) in the HTTP server header.

larryhatch

3:49 am on Nov 22, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hello Tedster:

Could you give an example of how to declare language
in the HTTP server header?

I have a few pages translated into French and Spanish.

The first 2 lines of one of these read:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html lang="fr">

What exactly is the 'server header'?

I just now added the lang="fr". Good advice!

What else do I need to do, and how?

- Larry

tedster

5:20 am on Nov 22, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



These goodies are not in your HTML document. HTTP header fields work behind the scenes to help the browser and server exchange general information about the KIND of content involved. So the server header is initial data that a server sends out BEFORE your HTML, and your browser also uses HTTP to communicate its requests and requirements to the server.

With HTTP 1.1, content negotiation of many kinds is possible between the user agent and the server (the server says 'I can give you English, Spanish and French', user agent says 'I prefer German and English', etc.) but many of these features are currently under-utilized. That's why I said it's a good idea, but it's certainly not a required step at present.
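
As a rough illustration of that exchange, the Python sketch below parses an Accept-Language value and picks the first offered language the user agent accepts. It is a simplification: it only does exact, case-insensitive matches and ignores wildcards and subtag fallback. The function names and the sample header are invented for this example.

```python
def parse_accept_language(header):
    """Return the language ranges in an Accept-Language value, most preferred first."""
    weighted = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        lang, _, params = item.partition(";")
        q = 1.0  # a missing q parameter means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed weight as "not acceptable"
        weighted.append((q, lang.strip().lower()))
    # sorted() is stable, so equal weights keep their order from the header.
    ordered = sorted(weighted, key=lambda pair: -pair[0])
    return [lang for q, lang in ordered if q > 0]


def choose_language(header, offered):
    """Pick the first offered tag that exactly matches a preferred range."""
    for wanted in parse_accept_language(header):
        for tag in offered:
            if tag.lower() == wanted:
                return tag
    return None


# The server offers English, Spanish and French; the user agent prefers German, then English.
print(choose_language("de, en;q=0.8", ["en", "es", "fr"]))  # en
```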

Also, you may not be able to change these headers if you are on shared hosting and have minimal access to server configuration. Although I currently know very little about PHP, I understand you can also set header information with this technology.
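
For illustration only, here is roughly what setting the header from application code can look like. This sketch uses Python's standard http.server module rather than PHP, and it is not a production setup; the handler name, the page content and the port are all made up.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical French page; the lang attribute and the header agree with each other.
PAGE = '<html lang="fr"><head><title>Exemple</title></head><body><p>Bonjour</p></body></html>'


class FrenchPageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = PAGE.encode("utf-8")
        self.send_response(200)
        # Character set and content language are both declared in the HTTP header.
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Language", "fr")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port=8000):
    """Serve the page on localhost until interrupted (port is an arbitrary choice)."""
    HTTPServer(("127.0.0.1", port), FrenchPageHandler).serve_forever()
```

Calling run() and then checking http://127.0.0.1:8000/ with a header checker should show the Content-Language line alongside the usual headers.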

You can check your HTTP header information
on SearchEngineWorld, our sister site:
Server Header Checker [searchengineworld.com]

You can learn more about HTTP headers at the W3C
Header Field Definitions [w3.org]

larryhatch

6:09 am on Nov 22, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks Tedster!

I checked my header info on the link provided.
It all looks good to me, but nothing about language.

I presume I'm on a shared server, an inexpensive hosting account.

For now then, I presume that a line like <html lang="fr"> is sufficient.
Previously I just had <html>.

May I presume that <html> defaults to English, or should I add lang="en" to my many English pages?

Best - Larry

larryhatch

6:28 am on Nov 22, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Another question!

Will putting <html lang="fr"> into my French language pages actually help me in the SERPs for French terms?

I have a [topic] related site. If I Google for the English keyword, my pages start listing between #22 and #35.

If I Google for the French/Spanish/Italian word, I'm way out in the boondocks, 100th place or so.

Once lang="fr" and/or lang="es" kicks in, i.e. those pages get crawled and re-indexed, might I expect better placement for the other languages?

Is there any reasonably good evidence that lang="xx" helps the SERPs of pages written in those languages?

Thanks again - Larry

[edited by: tedster at 7:16 am (utc) on Nov. 22, 2004]
[edit reason] remove specifics [/edit]

tedster

7:19 am on Nov 22, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



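Yes, in practice lang="en" is the assumed default... for today at least. Strictly speaking, the HTML 4.01 text quoted earlier in this thread gives the default as "unknown", so adding lang="en" explicitly does no harm.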

I have seen no evidence that a lang attribute helps in the SERPs for any language, but your description sounds quite suggestive. I'd say try it without making any other changes and let us know how your test turns out. You're in a good situation to discover something very easily.