How to check encoding of a CSV file

In Linux systems, you can use file command. It will give the correct encoding

Sample:

file blah.csv

Output:

blah.csv: ISO-8859 text, with very long lines

You can use Notepad++ to evaluate a file's encoding without needing to write code. The evaluated encoding of the open file will display on the bottom bar, far right side. The encodings supported can be seen by going to Settings -> Preferences -> New Document/Default Directory and looking in the drop down.


You can also use python chardet library

# install the chardet library
!pip install chardet

# import the chardet library
import chardet 

# use the detect method to find the encoding
# 'rb' means read in the file as binary
with open("test.csv", 'rb') as file:
    print(chardet.detect(file.read()))

If you use Python, just use a print() function to check the encoding of a csv file. For example:

with open('file_name.csv') as f:
    print(f)

The output is something like this:

<_io.TextIOWrapper name='file_name.csv' mode='r' encoding='utf8'>

Tags:

Csv

Encoding