Encoding issue with python3 and click package

If you are running Python 3.6, you will still get this error. Here is a simple solution that the authors of click recommend:

#!/bin/bash
# before your python code executes set two environment variables
export LANG=en_US.utf8
export LC_ALL=en_US.utf8
  • NOTE: replace the values with whatever locale your system is configured to use
  • NOTE: this solution is also given in the PEP 538 document (https://www.python.org/dev/peps/pep-0538/).
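
To check whether your environment is affected before (or after) exporting those variables, something like the following can help. This is only a diagnostic sketch: the helper names `is_ascii_encoding` and `describe_encodings` are made up for illustration and are not part of click.

```python
import codecs
import locale
import sys


def is_ascii_encoding(name):
    """Return True if `name` is any alias of the ASCII codec."""
    # 'ANSI_X3.4-1968' is how glibc reports plain ASCII under the C/POSIX locale.
    return codecs.lookup(name).name == "ascii"


def describe_encodings():
    """Collect the encodings Python picked up from the environment."""
    return {
        "preferred": locale.getpreferredencoding(),
        "filesystem": sys.getfilesystemencoding(),
        "stdout": getattr(sys.stdout, "encoding", None),
    }


if __name__ == "__main__":
    info = describe_encodings()
    print(info)
    if is_ascii_encoding(info["preferred"]):
        print("ASCII locale detected: export LANG/LC_ALL before running the script")
```

If "preferred" comes back as an ASCII alias, the exports above have not reached the Python process.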

This is an old thread, but this answer might help others (or me) in the future. On *nix, check whether LC_ALL is set:

env | grep LC_ALL

If it is set, unset it. That is all there is to it.

unset LC_ALL
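
If the script is started from another Python program rather than from a shell, the same idea can be applied to the child's environment. A minimal sketch, assuming a launcher of your own; `env_without_lc_all` and `your_script.py` are hypothetical names:

```python
import os


def env_without_lc_all(environ=None):
    """Return a copy of the environment with LC_ALL removed."""
    env = dict(os.environ if environ is None else environ)
    env.pop("LC_ALL", None)  # same effect as `unset LC_ALL` for the child
    return env


# Usage (hypothetical script name):
#   import subprocess, sys
#   subprocess.run([sys.executable, "your_script.py"], env=env_without_lc_all())
```

This only helps if LANG (or LC_CTYPE) already names a UTF-8 locale; otherwise the child still starts with an ASCII locale.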


If you have Python >= 3.7, you should not need to do anything. If you have Python 3.6, see the original solution below.

EDIT 2017-12-08

I've seen that PEP 538, targeted at Python 3.7, will change how Python 3 handles encodings during startup. I think the new approach will fix the original problem: https://www.python.org/dev/peps/pep-0538/

IMHO the encoding changes targeted at Python 3.7 should have been planned years ago, but better late than never, I guess.

EDIT 2015-09-01

There is an open issue (enhancement), http://bugs.python.org/issue15216, that will make it easy to change the encoding of an already created (but not yet used) stream (sys.std*). It is targeted at Python 3.7, though, so we'll have to wait for a while.
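
For readers on newer versions: as far as I can tell, that enhancement is what became `TextIOWrapper.reconfigure()` in Python 3.7. A minimal sketch of how it can be used, shown on an in-memory stream so it runs anywhere; `force_utf8` is a hypothetical helper name:

```python
import io


def force_utf8(stream):
    """Switch an existing text stream to UTF-8 where reconfigure() exists."""
    # reconfigure() is only available on Python 3.7+ TextIOWrapper objects;
    # older versions and custom stream replacements are left untouched.
    if hasattr(stream, "reconfigure"):
        stream.reconfigure(encoding="utf-8", errors="replace")
        return True
    return False


# Demonstration on an in-memory stream that starts out as ASCII.
raw = io.BytesIO()
text = io.TextIOWrapper(raw, encoding="ascii")
force_utf8(text)
text.write("caf\u00e9")
text.flush()
```

In a real script the same helper would be applied to sys.stdin, sys.stdout and sys.stderr at startup, before anything is read or written.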

Original solution that targets python version 3.6

NOTE: this solution should not be needed by anyone running Python >= 3.7; see PEP 538.

Well, my initial workaround had many flaws. It got past the click library's encoding check, but the encoding itself was not fixed, so I still got exceptions whenever the input parameters or the output contained non-ASCII characters.

I had to implement a more complex method with 3 steps: set the locale, correct the encoding of stdin/stdout/stderr, and re-encode the command line parameters. I've also added a "friendly" exit for the case where the first attempt to set the locale doesn't work as expected:

def prevent_ascii_env():
    """
    To avoid issues reading unicode chars from stdin or writing to stdout, we need to ensure that the
    python3 runtime is correctly configured. If it is not, we try to force utf-8,
    and if that isn't possible we exit with a friendlier message than the original one.
    """
    import codecs, io, locale, os, sys
    # With a broken locale: locale.getpreferredencoding() == 'ANSI_X3.4-1968' (an alias of ascii)
    if codecs.lookup(locale.getpreferredencoding()).name == 'ascii':
        # Step 1: try to force a UTF-8 locale for this process
        os.environ['LANG'] = 'en_US.utf-8'
        if codecs.lookup(locale.getpreferredencoding()).name == 'ascii':
            print("The current locale is not correctly configured in your system")
            print("Please set the LANG env variable to the proper value before calling this script")
            sys.exit(-1)
        # Step 2: once locale.getpreferredencoding() is correct, replace the current stdin/stdout/stderr streams
        _, encoding = locale.getdefaultlocale()
        sys.stderr = io.TextIOWrapper(sys.stderr.detach(), encoding=encoding, errors="replace", line_buffering=True)
        sys.stdout = io.TextIOWrapper(sys.stdout.detach(), encoding=encoding, errors="replace", line_buffering=True)
        sys.stdin = io.TextIOWrapper(sys.stdin.detach(), encoding=encoding, errors="replace", line_buffering=True)
        # Step 3: re-encode the command line parameters, which were decoded with the wrong encoding
        for i, p in enumerate(sys.argv):
            sys.argv[i] = os.fsencode(p).decode()
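
The last step of that function is the least obvious one. On POSIX, when Python starts under an ASCII locale it decodes the command line bytes with the `surrogateescape` error handler, so every non-ASCII byte ends up as a lone surrogate in `sys.argv`. `os.fsencode()` reverses that and recovers the original bytes, which can then be decoded as UTF-8. The sketch below reproduces the round trip by hand so it behaves the same under any locale; `reencode_arg` is a hypothetical name used only for illustration.

```python
def reencode_arg(arg, fs_encoding="ascii"):
    """Mimic os.fsencode(arg).decode() for a given filesystem encoding."""
    raw = arg.encode(fs_encoding, "surrogateescape")  # back to the original bytes
    return raw.decode("utf-8")                        # decode them as UTF-8


# What an ASCII-locale Python 3.6 puts in sys.argv for the UTF-8 bytes of "café":
mangled = b"caf\xc3\xa9".decode("ascii", "surrogateescape")
fixed = reencode_arg(mangled)
```

Here `mangled` contains two surrogate code points in place of the "é", and `fixed` is the proper four-character string again.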

This patch solves almost all the issues, but it has one caveat: shutil.get_terminal_size() raises a ValueError because sys.__stdout__ has been detached. The click library uses that function when printing the help text, so to fix it I had to monkey-patch click:

def wrapper_get_terminal_size():
    """
    Replace the original function termui.get_terminal_size (click lib) by a new one 
    that uses a fallback if ValueError exception has been raised
    """
    from click import termui, formatting
    
    old_get_term_size = termui.get_terminal_size
    def _wrapped_get_terminal_size():
        try:
            return old_get_term_size()
        except ValueError:
            import os
            sz = os.get_terminal_size()
            return sz.columns, sz.lines
    termui.get_terminal_size = _wrapped_get_terminal_size
    formatting.get_terminal_size = _wrapped_get_terminal_size
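
One thing to watch in the wrapper above: os.get_terminal_size() itself raises OSError when the descriptor it queries is not attached to a terminal (for example when the output is piped to a file). A slightly more defensive fallback could look like the sketch below. The name `safe_terminal_size` and the 80x24 default are my own assumptions, not something click provides.

```python
import os


def safe_terminal_size(default=(80, 24)):
    """Return (columns, lines), trying stdin/stdout/stderr before giving up."""
    for fd in (0, 1, 2):
        try:
            size = os.get_terminal_size(fd)
        except OSError:
            # This descriptor is closed or not a terminal; try the next one.
            continue
        return size.columns, size.lines
    return default
```

It could replace the two lines inside the `except ValueError:` branch, so that help output still renders when nothing is attached to a terminal.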

With these changes, all my scripts now work fine when the environment has a wrong locale configured but the system supports en_US.utf-8 (the Fedora default locale).

If you find any issue with this approach or have a better solution, please add a new answer.