What is the difference between C.UTF-8 and en_US.UTF-8 locales?

There might be some impact, since the two locales differ in collation (sort) order, upper/lower case relationships, thousands separators, default currency symbol and more.

C.utf8 = POSIX standards-compliant default locale, in which only strict ASCII characters are valid, extended to allow the basic use of UTF-8.

en_US.utf8 = American English UTF-8 locale.

I'm not sure about the specific effect you might encounter, but I believe you can set the locale and encoding inside your application if needed.
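To make the collation difference concrete, here is a minimal Python sketch of setting the locale inside an application. The helper name sort_with_locale and the word list are made up for illustration, and it assumes en_US.UTF-8 may not be installed on your machine, in which case the helper returns None instead of failing.

```python
import locale

def sort_with_locale(words, locale_name):
    """Sort words with the collation rules of locale_name.

    Returns None if that locale is not installed on this system.
    (Hypothetical helper, for illustration only.)
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        # strxfrm turns each string into a key that compares per the locale
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)  # restore the old setting

words = ["banana", "Cherry", "apple", "Banana"]
c_sorted = sort_with_locale(words, "C")
us_sorted = sort_with_locale(words, "en_US.UTF-8")

print(c_sorted)   # code point order: ['Banana', 'Cherry', 'apple', 'banana']
print(us_sorted)  # depends on the system's en_US collation rules, or None
```

In the C locale, uppercase letters sort before all lowercase ones. With en_US.UTF-8 the result typically groups words alphabetically regardless of case, but the exact order depends on the locale data shipped by your system.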


Here are some reasons why I added LC_TIME=C.UTF-8 in /etc/default/locale, in case it helps someone:

It provides a 24-hour clock instead of AM/PM in Firefox for HTML5 input type=time (https://developer.mozilla.org/en-US/docs/Web/HTML/Element/input/time) and uses a datepicker in the format DD/MM/YYYY instead of MM/DD/YYYY for HTML5 input type=date (https://developer.mozilla.org/en-US/docs/Web/HTML/Element/input/date).

It allows using the international YYYY-MM-DD date format (ISO 8601) with a 24-hour clock when replying to emails in Thunderbird.

Previously, this was possible with LC_TIME=en_DK.UTF-8 (http://kb.mozillazine.org/Date_display_format), but it stopped working because of a bug (https://bugzilla.mozilla.org/show_bug.cgi?id=1426907#c155).

Edit: Now even the LC_TIME=C.UTF-8 workaround does not work for Thunderbird (https://bugzilla.mozilla.org/show_bug.cgi?id=1426907#c197), but at least en_IE.UTF-8 provides the European date format DD/MM/YYYY instead of MM/DD/YYYY.


In general, C is for computers, while en_US is for people in the US who speak English (and other people who want the same behaviour).

"For computers" means that the strings are more standardized (but still in English), so the output of one program can be read by another program. With en_US, strings could be improved and alphabetic order could change (maybe to follow new rules from a style guide such as the Chicago Manual of Style, etc.). So it is more user-friendly, but possibly less stable. Note: locales are not just for translation of strings, but also for collation (alphabetic order), number formatting (e.g. the thousands separator), currency (I think it is safe to predict that $ and 2 decimal digits will remain), month names, days of the week, etc.
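As a small illustration of the number formatting point, here is a Python sketch that formats the same integer under two locales. The helper name format_number is made up for this example, and it assumes en_US.UTF-8 might be missing, returning None in that case.

```python
import locale

def format_number(value, locale_name):
    """Format an integer with the digit grouping of locale_name.

    Returns None if that locale is not installed on this system.
    (Hypothetical helper, for illustration only.)
    """
    previous = locale.setlocale(locale.LC_NUMERIC)  # query the current setting
    try:
        locale.setlocale(locale.LC_NUMERIC, locale_name)
    except locale.Error:
        return None
    try:
        # grouping=True applies the locale's thousands separator, if it has one
        return locale.format_string("%d", value, grouping=True)
    finally:
        locale.setlocale(locale.LC_NUMERIC, previous)  # restore the old setting

c_text = format_number(1234567, "C")
us_text = format_number(1234567, "en_US.UTF-8")

print(c_text)   # the C locale defines no thousands separator: 1234567
print(us_text)  # usually 1,234,567 when en_US.UTF-8 is installed, else None
```

This is the sense in which C output is easier for another program to parse: the digits come out with no locale-dependent decoration.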

In your case, it is just the UTF-8 version of both locales.

In general it should not matter. I usually prefer en_US.UTF-8, but in your case (a server app) it should only change log and error messages (if you use locale.setlocale()). You should handle client locales inside your app. Programs that read the output of other programs should set the C locale before opening the pipe, so it should not really matter.
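Setting C before opening the pipe could look roughly like this in Python. It is only a sketch: the helper name run_in_c_locale is invented here, and it assumes a POSIX sort command is available on the PATH.

```python
import os
import subprocess

def run_in_c_locale(args, input_text=""):
    """Run a command with LC_ALL=C and return its standard output.

    (Hypothetical helper, for illustration only.)
    """
    env = dict(os.environ)
    env["LC_ALL"] = "C"  # LC_ALL overrides LANG and every other LC_* variable
    result = subprocess.run(
        args,
        input=input_text,
        capture_output=True,
        text=True,
        env=env,
        check=True,  # raise CalledProcessError if the command fails
    )
    return result.stdout

# Assumes a POSIX `sort` on the PATH; under LC_ALL=C it sorts by byte value.
output = run_in_c_locale(["sort"], "b\nA\na\nB\n")
print(output.split())  # ['A', 'B', 'a', 'b']
```

Only the child process sees LC_ALL=C, so the parent application keeps whatever locale it was started with.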

As you can see, it probably doesn't matter. You may also use the POSIX locale, which is also defined in Debian. You can get the list of installed locales with locale -a.

Note: micro-optimization would favour the C/C.UTF-8 locale: no loading of translation files (gettext), and simple rules for collation and number formatting, but this should be visible only on the server side.