Unicode utf-8/utf-16 encoding in Python

It's a Unicode character that your terminal's encoding apparently can't represent. print tries to encode the unicode object in the encoding of your terminal, and if that can't be done you get an exception.

On a terminal that can display utf-8 you get:

>>> print u'\u3053'
こ

Your terminal doesn't seem to be able to display UTF-8; otherwise the print a.encode("utf-8") line, at least, would have produced the correct character.
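
To check which encoding print will try to use, and whether a given character fits in it, something like the following rough sketch may help. The helper name can_encode is made up for illustration, and sys.stdout.encoding may be None when output is piped rather than sent to a terminal.

```python
from __future__ import print_function
import sys

def can_encode(text, encoding):
    """Return True if every character in text has a representation in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True

s = u'\u3053'  # HIRAGANA LETTER KO

# The encoding print uses for unicode objects (may be None if output is piped).
print(sys.stdout.encoding)

print(can_encode(s, 'utf-8'))    # UTF-8 can represent any code point
print(can_encode(s, 'latin-1'))  # Latin-1 only covers U+0000 to U+00FF
```

If the second check is the one that matches your terminal's encoding, that would explain the exception you are seeing.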


You ask:

u'\u3053\n'

Is it utf-16?

The answer is no: it's a Unicode string, not text in any specific encoding. UTF-16 is an encoding.
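
As a small illustration of the difference (just a sketch, with variable names of my own choosing): the same one-character Unicode string turns into different byte sequences depending on which encoding you pick.

```python
from __future__ import print_function

s = u'\u3053'  # a single code point, not tied to any encoding yet

utf8_bytes = s.encode('utf-8')       # three bytes: e3 81 93
utf16_bytes = s.encode('utf-16-le')  # two bytes: 53 30 (little-endian, no BOM)

print(len(s))             # one character
print(repr(utf8_bytes))
print(repr(utf16_bytes))

# Decoding with the matching encoding gives back the same Unicode string.
print(utf8_bytes.decode('utf-8') == utf16_bytes.decode('utf-16-le'))
```

The bytes only exist once you call encode; the string itself is neither UTF-8 nor UTF-16.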

To print a Unicode string effectively to your terminal, you need to find out what encoding that terminal is willing to accept and able to display. For example, the Terminal.app on my laptop is set to UTF-8 and uses a rich font, so:

screenshot
(source: aleax.it)

...the Hiragana letter displays correctly. On a Linux workstation I have a terminal program that keeps resetting to Latin-1, so it would mangle things somewhat like yours -- I can set it to utf-8, but it doesn't have a huge number of glyphs in the font, so it would display somewhat-useless placeholder glyphs instead.
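
If you would rather get a placeholder than an exception on a limited terminal, one possible approach (a sketch only; the function name printable_for is hypothetical) is to round-trip the text through the target encoding with the 'replace' error handler before printing:

```python
from __future__ import print_function
import sys

def printable_for(text, encoding=None):
    """Replace characters that encoding cannot represent with '?'.

    Falls back to the terminal's encoding, or ASCII if it is unknown.
    """
    encoding = encoding or getattr(sys.stdout, 'encoding', None) or 'ascii'
    return text.encode(encoding, 'replace').decode(encoding)

s = u'\u3053\n'

safe = printable_for(s)  # uses whatever the terminal reports
print(safe, end='')

print(repr(printable_for(s, 'latin-1')))  # the Hiragana letter becomes '?'
```

This only avoids the encoding error; whether a character that does survive is actually drawn still depends on the terminal's font, as described above.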