Some day, everyone will have their act together when it comes to encoding and decoding of strings that have meaning. Until then, we need hacks.

This is what I do in Python programs that print to the screen. For now I try to keep my Python code working in both Python 2 and Python 3. I look forward to being able to ditch Python 2, but not yet.

from __future__ import print_function, unicode_literals
import sys
PY2 = sys.version_info < (3,)

# This is a hack so that we have UTF-8 throughout in Python 2. The "correct" way is to
# encode strings at the point of output, or to switch to Python 3.
if PY2:

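    # Note: on Windows, run "chcp 65001" to switch the console to UTF-8 (the default is often chcp 437).
    # Python 2 has no 'cp65001' codec, so alias it to UTF-8 below.
    # Fixed in Python 3.6 and up, apparently.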
    if sys.platform == "win32":
        import codecs
        codecs.register(lambda name: codecs.lookup('utf-8') if name == 'cp65001' else None)

This way, my programs don’t croak when they encounter non-ASCII Unicode code points.

This is mostly better in Python 3.5 and up, but still not totally seamless.
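
For comparison, here is a rough sketch of the "correct" approach mentioned in the code comment: deal with the encoding at the point of output instead of changing global state. The names `to_printable` and `safe_print` are made up for illustration, and the sketch assumes Python 3 behaviour. On Python 2, piped output would still need an explicit encode to bytes.

```python
import sys


def to_printable(text, encoding, errors="replace"):
    """Round-trip text through `encoding`, substituting any characters
    that the encoding cannot represent (with "?" for errors="replace")."""
    return text.encode(encoding, errors).decode(encoding)


def safe_print(text, stream=None):
    """Print text, degrading unencodable characters instead of raising
    UnicodeEncodeError. Hypothetical helper, not part of any library."""
    if stream is None:
        stream = sys.stdout
    # stream.encoding can be None or missing (for example io.StringIO),
    # so fall back to UTF-8 in that case.
    encoding = getattr(stream, "encoding", None) or "utf-8"
    print(to_printable(text, encoding), file=stream)


if __name__ == "__main__":
    safe_print("caf\u00e9 \u2603")
```

The trade-off is that characters the console cannot show come out as `?` rather than crashing the program, which is usually what I want from a script that just prints to the screen.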


Windows cmd encoding change causes Python crash

Python, Unicode, and the Windows console

Python Helpers for String/Unicode Encoding, Decoding and Printing

Print to the console in Python without UnicodeEncodeErrors - note, this did not work for me in Python 2.7.9

Defining Python Source Code Encodings - mostly irrelevant now: Python 3 assumes that your source code is UTF-8. Python 2 assumes ASCII, though, so the declaration still matters there if a file contains non-ASCII characters.

Setting the correct encoding when piping stdout in Python

Writing unicode strings via sys.stdout in Python