How to determine encoding table of a text file

You can't reliably detect the encoding of a text file. What you can do is make an educated guess: look for non-ASCII bytes and try to determine whether they decode to a Unicode combination that makes sense in the language you are parsing.
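A minimal sketch of that kind of educated guess in Python; the candidate list and the plausibility check are illustrative assumptions, not a complete detector:

# guess_encoding.py -- sketch of the "educated guess" approach.
# The CANDIDATES list and the plausibility check are assumptions; a real
# detector would use language-specific statistics.

CANDIDATES = ["utf-8", "windows-1252", "iso-8859-1"]

def guess_encoding(path):
    data = open(path, "rb").read()
    for enc in CANDIDATES:
        try:
            text = data.decode(enc)
        except UnicodeDecodeError:
            continue  # these bytes are impossible in this encoding -- rule it out
        # Crude plausibility check: decoded non-ASCII characters should be
        # printable (letters, punctuation), not stray control characters.
        non_ascii = [ch for ch in text if ord(ch) > 127]
        if all(ch.isprintable() for ch in non_ascii):
            return enc
    return None  # nothing plausible found

if __name__ == "__main__":
    import sys
    print(guess_encoding(sys.argv[1]))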


See this question and the selected answer. There's no sure-fire way of doing it; at most, you can rule things out. You're unlikely to get false positives on the UTF encodings, but the 8-bit encodings are tough, especially if you don't know the source language. No single tool currently handles all the common 8-bit encodings from Mac, Windows, and Unix, but the selected answer provides an algorithmic approach that should work adequately for a certain subset of encodings.
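To illustrate the "ruling things out" part, here is a rough Python sketch that only eliminates UTF candidates, via BOM checks and a strict UTF-8 decode; identifying a specific 8-bit encoding still needs language-aware heuristics like those in the selected answer:

import codecs

def possible_utf_encodings(data: bytes):
    """Return the UTF encodings this byte string could still plausibly be."""
    possible = []
    # Byte-order marks are a strong (though optional) hint.
    if data.startswith(codecs.BOM_UTF8):
        possible.append("utf-8-sig")
    if data.startswith(codecs.BOM_UTF16_LE) or data.startswith(codecs.BOM_UTF16_BE):
        possible.append("utf-16")
    # UTF-8 is strict enough that a successful decode is rarely a false positive.
    try:
        data.decode("utf-8")
        possible.append("utf-8")
    except UnicodeDecodeError:
        pass  # definitely not UTF-8 -- ruled out
    return possible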


If you're on Linux, try file -i filename.txt.

$ file -i vol34.tex 
vol34.tex: text/x-tex; charset=us-ascii

For reference, here is my environment:

$ which file
/usr/bin/file
$ file --version
file-5.09
magic file from /etc/magic:/usr/share/misc/magic

Some file versions (e.g. file-5.04 on OS X/macOS) have slightly different command-line switches:

$ file -I vol34.tex 
vol34.tex: text/x-tex; charset=us-ascii
$ file --mime vol34.tex
vol34.tex: text/x-tex; charset=us-ascii
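
If you need the same information from a script, one option is to call file and parse its output. A small sketch, assuming a Unix-like system with file on the PATH:

import subprocess

def file_charset(path):
    # Ask file(1) for the MIME type and charset, e.g.
    # "vol34.tex: text/x-tex; charset=us-ascii", then pull out the charset value.
    out = subprocess.run(["file", "--mime", path],
                         capture_output=True, text=True, check=True).stdout
    for part in out.split(";"):
        part = part.strip()
        if part.startswith("charset="):
            return part[len("charset="):]
    return None

print(file_charset("vol34.tex"))  # e.g. "us-ascii"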

Also, have a look here.


Open the file with Notepad++ and you will see the name of the encoding in the bottom-right corner of the status bar. From the Encoding menu you can change the encoding and save the file.