What <html lang=""> attribute value should I use for a mixed language page?

As far as I can tell from reading the HTML5 spec the lang attribute:

value must be a valid BCP 47 language tag, or the empty string

Source: http://www.w3.org/TR/html5/dom.html#the-lang-and-xml:lang-attributes

There's no mention in the spec of an array of language strings and every example I've found uses a single language string.

This makes sense since really a given section can only be in one language unless we're creating a new hybrid language.

Since the lang attribute is valid on all HTML elements, you can wrap language-specific content in its own element to indicate its language.

<html lang="en">
[...]
<body>
<h1>I am a heading <span lang="de-DE">Eine Überschrift</span></h1>
</body>
</html>

As I understand it, you should be able to use <html lang="mul"> to indicate multiple languages.

Choose subtags from the IANA Language Subtag Registry.

Source: https://www.w3.org/TR/2007/NOTE-i18n-html-tech-lang-20070412/#ri20030112.224623362

The registry contains an entry with Subtag: mul (description: Multiple languages).

Source: http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry

I don't think you can specify exactly which languages you're mixing in the html element, though. As Jamie wrote, you can instead set different lang attributes on different elements on the page.
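To make that combination concrete, here is a minimal sketch, assuming a page that mixes English, German and French. The languages and sample text are purely illustrative; the point is only that lang="mul" on the html element acts as the fallback while each block declares its own language:

```html
<!DOCTYPE html>
<html lang="mul">
<head>
  <meta charset="utf-8">
  <!-- The title is in English, so it overrides the "mul" fallback -->
  <title lang="en">Mixed-language page</title>
</head>
<body>
  <!-- Each block declares its own language explicitly -->
  <p lang="en">Good morning.</p>
  <p lang="de">Guten Morgen.</p>
  <p lang="fr">Bonjour.</p>
</body>
</html>
```

Any text left without its own lang attribute would inherit "mul", which tells a screen reader or spell checker very little, so it seems safest to tag every block.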

ISO 639-3 defines four special language codes (mis, mul, und and zxx), and all of them are also valid in the IANA subtag registry: https://en.wikipedia.org/wiki/ISO_639-3#Special_codes

However, I doubt this has good support from search engines such as Google.


Adding this answer in April 2020 to provide the latest guidance from the W3C (W3.org) ...

Firstly, no, you cannot use <html lang="lang1 lang2">, since it will not validate. Below is the result from the W3C's Nu Html Checker with more than one language (English and Swahili) in the lang attribute of the html tag. The same error occurs with or without commas:

Error: Bad value en swh for attribute lang on element html: The language subtag en swh is not a valid language subtag.

<html lang="en swh">↩</html>

Below is the latest guidance, based on the W3C's Declaring language in HTML, on how to declare the language of multilingual web pages:

QUICK ANSWER

Always use a language attribute on the html tag to declare the default language of the text in the page. When the page contains content in another language, add a language attribute to an element surrounding that content.

Use the lang attribute for pages served as HTML, and the xml:lang attribute for pages served as XML. For XHTML 1.x and HTML5 polyglot documents, use both together.

Use language tags from the IANA Language Subtag Registry. You can find subtags using the unofficial Language Subtag Lookup tool.

Use nested elements to take care of content and attribute values on the same element that are in different languages.
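As an illustration of the second point above (using lang and xml:lang together), here is a minimal sketch of an XHTML-style page. The French phrase and the page title are made up for the example; the only thing it is meant to show is that both attributes appear together and carry the same value on each element:

```html
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
  <meta charset="utf-8" />
  <title>lang and xml:lang together</title>
</head>
<body>
  <!-- The span switches to French; both attributes are repeated with the same value -->
  <p>The French for "thank you" is <span lang="fr" xml:lang="fr">merci</span>.</p>
</body>
</html>
```

For a page served only as text/html, the lang attribute on its own should be enough.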

What if element content and attribute values are in different languages?

In the W3C's example, the link text names the language of the target page (Spanish) in that page's own language ("Español"), while the title attribute on the surrounding element holds a hint in the language of the current page ("Spanish", in English).

The markup should look as follows, where the span element inherits the default en setting of the html element:

<span title="Spanish"><a lang="es" href="qa-html-language-declarations.es">Español</a></span>

What if there's no element to hang your attribute on?

If you want to specify the language of some content but there is no markup around it, use an element such as span or div around the content. Here is an example:

<p>You'd say that in Chinese as <span lang="zh-Hans">中国科学院文献情报中心</span>.</p>

How can you specify metadata for more than one audience language?

Get the server to send the information in the HTTP Content-Language header. If your intended audience speaks more than one language, the HTTP header allows you to use a comma-separated list of languages.

Here is an example of an HTTP header that declares the resource to be a mixture of English, Hindi and Punjabi:

Content-Language: en, hi, pa

Note that this approach is not effective if your page is accessed from a hard drive, disk or other non-server based location. There is currently no widely recognized way of using this kind of metadata inside the page.

In the past, many people used a meta element with the http-equiv attribute set to Content-Language. Due to long-standing confusion and inconsistent implementations of this element, the HTML5 specification made this non-conforming in HTML, so you should no longer use it.
