How can I manually determine the CodePage and Locale of the current OS?

Solution 1:

chcp will get you the active code page.

systeminfo will display system locale and input locale, among other things.

"Note: This command (systeminfo) is not available in Windows 2000 but you can still query Windows 2000 computer by running this command on Windows XP or Windows 2003 computer and set remote computer to Windows 2000 computer. If the current user logon that execute this command already has privilege on remote machine (for instance, Domain Administrators), you don’t have to use /u and /p."
From here.

Solution 2:

Note that a given system has two active code pages of interest, as determined by the legacy setting named language for non-Unicode programs, formerly known as system locale (see the bottom section for background information):

  • the OEM code page for use by legacy console applications,
  • the ANSI code page for use by legacy GUI applications.

Note: There are two more code pages, but they are rarely used anymore and are therefore not discussed here: the EBCDIC code page and the (pre-OS X) Mac code page - see the WinAPI docs.

The active OEM code page is most easily obtained via chcp, as shown in Forgotten Semicolon's helpful answer - assuming the console window wasn't configured with a custom code page via the registry and that the code page wasn't explicitly changed in the session with chcp <codePageNum>.
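
As a small sketch of an alternative to parsing chcp output, the console's code pages can also be read from PowerShell through the .NET [Console] class. This assumes the code runs in a regular console window; the property names InputCodePage and OutputCodePage are only illustrative labels.

```powershell
# Code pages the console currently uses for input and output;
# normally both match the number that chcp reports.
$consoleCodePages = [pscustomobject] @{
  InputCodePage  = [Console]::InputEncoding.CodePage
  OutputCodePage = [Console]::OutputEncoding.CodePage
}
$consoleCodePages
```

Like chcp, this reflects the code page currently active in the session, not necessarily the system's default OEM code page.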

Determining the active ANSI code page is not as simple, but PowerShell can help, including with determining the name and language of the system locale:

In Windows 8+ / Windows Server 2012+: Use the Get-WinSystemLocale cmdlet:

Get-WinSystemLocale | Select-Object Name, DisplayName, 
                        @{ n='OEMCP'; e={ $_.TextInfo.OemCodePage } }, 
                        @{ n='ACP';   e={ $_.TextInfo.AnsiCodePage } }

Caveat: The information returned does not reflect a potential UTF-8 override that may be in place via a new Windows 10 feature (see this SO answer); instead, the information always reflects the code pages originally associated with the active system locale. If you do need to know whether the UTF-8 override is in effect, see the registry-based method below.

On a US-English system, the above yields:

Name  DisplayName             OEMCP  ACP
----  -----------             -----  ---
en-US English (United States)   437 1252

OEMCP is the OEM code page, ACP the ANSI code page.
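
To see the code pages actually in effect for the current process, one option is to call the WinAPI functions GetACP and GetOEMCP directly. The following is a minimal Windows-only sketch; the namespace Demo and the type name NativeCodePage are made up for illustration, and the assumption is that these functions report the effective values, which should include a UTF-8 override if one is in place.

```powershell
# Windows-only: wrap the kernel32 functions GetACP() and GetOEMCP() via P/Invoke.
# 'Demo' and 'NativeCodePage' are arbitrary, hypothetical names.
Add-Type -Namespace Demo -Name NativeCodePage -MemberDefinition @'
[DllImport("kernel32.dll")] public static extern uint GetACP();
[DllImport("kernel32.dll")] public static extern uint GetOEMCP();
'@

# Report both code pages, using the same column names as above.
$effectiveCodePages = [pscustomobject] @{
  OEMCP = [Demo.NativeCodePage]::GetOEMCP()
  ACP   = [Demo.NativeCodePage]::GetACP()
}
$effectiveCodePages
```

On a US-English system without the UTF-8 override, the values should match the Get-WinSystemLocale output shown above.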

A registry-based method that also works on older systems down to Windows XP:

# Get the code pages:
Get-ItemProperty HKLM:\SYSTEM\CurrentControlSet\Control\Nls\CodePage | 
     Select-Object OEMCP, ACP

On a US-English system, the above yields:

OEMCP ACP 
----- --- 
437   1252

If you also want to get the system locale's [friendly] name and LCID (though note that LCIDs are deprecated):

[Globalization.CultureInfo]::GetCultureInfo([int] ('0x' + (
        Get-ItemProperty 'HKLM:\SYSTEM\CurrentControlSet\Control\Nls\Language' Default
      ).Default)
)

On a US-English system, the above yields:

LCID             Name             DisplayName
----             ----             -----------
1033             en-US            English (United States)
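
The command above works because the registry stores the LCID as a hexadecimal string. The following sketch shows the conversion step in isolation; the sample value '0409' is an assumption about what the Default value contains on a US-English system.

```powershell
# Assumed sample of the registry value 'Default' on a US-English system.
$hexLcid = '0409'

# Prefixing '0x' makes PowerShell parse the string as a hexadecimal number.
$lcid = [int] ('0x' + $hexLcid)
$lcid   # 1033

# Map the numeric LCID to a culture name.
$cultureName = [Globalization.CultureInfo]::GetCultureInfo($lcid).Name
$cultureName
```

Since LCIDs are deprecated, the culture name (such as en-US) is the more future-proof identifier to store or compare.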

Background information:

System locale is the legacy name for what is now more descriptively called language for non-Unicode programs (see NLS terminology), and, as the names suggest:

  • The setting applies only to legacy programs (programs that don't support Unicode).

  • It applies system-wide, irrespective of a given user's locale settings, and administrative privileges are required to change it.

It is important to note that this is a legacy setting, because code pages no longer apply to programs that use Unicode internally and call the Unicode versions of the Windows API.

Notably, it determines the active code pages, i.e., the character encoding used by default:

  • the ANSI code page to use when non-Unicode programs call the non-Unicode (ANSI) versions of the Windows API functions, such as the ANSI version of the TextOut function, for translating strings to and from Unicode; among other things, this determines how the program's strings render in the GUI.

    • Using standard features, you cannot change this code page on demand, so you cannot selectively run non-Unicode programs with a different ANSI code page.
      • Arioch 'The points to the legacy Microsoft program AppLocale, which up to Windows 7 could be used to run an individual program with a selectable system locale on demand; however, it doesn't seem to be available for download any longer and doesn't seem to work anymore in Windows 10.

      • Locale Emulator, an open-source third-party solution, seemingly picks up where AppLocale left off (I haven't tried it), and is supported for 32-bit applications on Windows 10.

  • the OEM code page to make active by default in console windows, as reflected by chcp.

    • A console window's active code page determines how keyboard input and output from console applications are interpreted and displayed.
      • Note that this means that even output from Unicode console applications is translated to the active code page, which can result in loss of information; use of pseudo code page 65001, which represents the UTF-8 encoding of Unicode, is a solution, but that can cause legacy command-line programs to misinterpret data and even to fail - see this Stack Overflow answer for details.
    • Unlike the ANSI code page, you can change the active [OEM] code page on demand for a given console window; e.g., to switch to OEM code page 850, run chcp 850 in cmd.exe, and $OutputEncoding = [console]::InputEncoding = [console]::OutputEncoding = [text.encoding]::GetEncoding(850) in PowerShell.
  • additionally, the EBCDIC and Mac code pages, which are rarely used anymore.
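
To illustrate why the distinction between the code pages matters, here is a small sketch that decodes the same byte with an ANSI code page (1252) and two OEM code pages (437 and 850), and then shows the information loss when a character has no representation in the target code page. The byte 0xE9 and the euro sign are arbitrary sample choices.

```powershell
# A single byte above the ASCII range, as a legacy program might emit it.
$bytes = [byte[]] (0xE9)

# Decode the same byte with three different code pages.
$decoded = foreach ($cp in 1252, 437, 850) {
  $char = [Text.Encoding]::GetEncoding($cp).GetString($bytes)
  [pscustomobject] @{
    CodePage  = $cp
    Char      = $char
    CodePoint = 'U+{0:X4}' -f [int] [char] $char
  }
}
$decoded

# Information loss: the euro sign (U+20AC) does not exist in code page 437,
# so a round trip through that code page does not return the original character.
$oem  = [Text.Encoding]::GetEncoding(437)
$euro = [string] [char] 0x20AC
$roundTripped = $oem.GetString($oem.GetBytes($euro))
$roundTripped -eq $euro   # False
```

The byte decodes to é (U+00E9) under 1252, to Θ (U+0398) under 437, and to Ú (U+00DA) under 850, which is why text written by a GUI application can look garbled when a console application prints it, and vice versa.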

Despite the word locale used in the legacy term and the word language in the current term:

  • The only aspects controlled by the setting are the set of active code pages and the default bitmap fonts, not other elements of a locale (which are controlled by the user-level locale settings).

  • A given code page is typically shared by many locales and covers multiple languages; e.g., the widely used 1252 code page is used by many Western European languages, including English.

However, when you do change the setting via the Control Panel, you do pick the setting by way of a specific locale.
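
To get a feel for how many locales share one code page, the following sketch lists some of the cultures known to .NET whose ANSI code page is 1252. It is only an approximation of what the Control Panel offers, the set of cultures returned depends on the Windows and .NET versions, and the limit of 10 is arbitrary.

```powershell
# List the first 10 specific cultures that use ANSI code page 1252,
# along with their (possibly differing) OEM code pages.
$sharing1252 = [Globalization.CultureInfo]::GetCultures('SpecificCultures') |
  Where-Object { $_.TextInfo.ANSICodePage -eq 1252 } |
  Select-Object -First 10 Name, DisplayName,
                @{ n='OEMCP'; e={ $_.TextInfo.OEMCodePage } }
$sharing1252
```

Note that locales sharing an ANSI code page do not necessarily share an OEM code page, which is one reason the setting is chosen by locale and not by code page number.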

For a list of all Windows code pages, see https://docs.microsoft.com/en-us/windows/desktop/Intl/code-page-identifiers