How to view a binary file?

According to this answer by tyranid:

hexdump -C yourfile.bin 

That is enough unless you want to edit the file, of course. Most Linux distros ship hexdump by default (though not all).
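To make the layout of that output less mysterious, here is a minimal Python sketch that produces a similar listing: an offset, the bytes in hex, then the printable ASCII characters. The function name `hex_dump` and the sample bytes are made up for this illustration, and the spacing only roughly mimics what hexdump -C prints.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Return a hexdump-like listing: offset, hex bytes, printable ASCII."""
    pad = width * 3 - 1  # room for `width` two-digit values separated by spaces
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Show printable ASCII as-is and every other byte as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part.ljust(pad)}  |{text_part}|")
    return "\n".join(lines)


# Sample data (an assumption for the demo); use open(path, "rb").read() for a real file.
print(hex_dump(b"Hello, world!\n\x00\x01"))
```

The file has to be opened in binary mode (`"rb"`) so that Python hands back the raw bytes instead of decoding them as text.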


Update

According to this answer by Emilio Bool:

xxd does both binary and hexadecimal

For binary:

xxd -b file

For hex:

xxd file
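As a rough illustration of what the two views mean, this Python sketch renders each byte of some sample data as eight bits and as two hex digits. The name `bits_and_hex` and the sample bytes are invented for the example, and the layout is not the one xxd uses.

```python
def bits_and_hex(data: bytes):
    """Return a (binary, hex) pair of strings for every byte in data."""
    return [(f"{b:08b}", f"{b:02x}") for b in data]


for bits, hexa in bits_and_hex(b"Hi\n"):
    print(bits, hexa)
# 01001000 48
# 01101001 69
# 00001010 0a
```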

Various people have answered some aspects of the query, but not all.

All files on a computer are stored as 1's and 0's: images, text files, music, executable applications, object files, and so on.

They are all 0's and 1's. The only difference is that they are interpreted differently depending upon what opens them.

When you view a text file using cat, the executable (cat in this case) reads all the 1's and 0's and it presents them to you by converting them into characters from your relevant alphabet or language.

When you view a file using an image viewer, it takes all the 1's and 0's and turns them into an image, depending on the format of the file and some logic to work it all out.

Compiled binary files are no different; they are stored as 1's and 0's.
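To see how much the interpretation matters, here is a small Python sketch that reads the same four bytes three different ways. The byte values are arbitrary sample data, and the three readings (ASCII text, a little-endian 32-bit integer, an RGB pixel) are only examples of what a program might choose to do with them.

```python
import struct

raw = bytes([72, 105, 33, 0])  # four arbitrary sample bytes

as_text = raw[:3].decode("ascii")        # first three bytes read as ASCII characters
as_uint32 = struct.unpack("<I", raw)[0]  # all four bytes read as one little-endian integer
as_rgb = tuple(raw[:3])                  # first three bytes read as red, green, blue

print(as_text)    # Hi!
print(as_uint32)  # 2189640
print(as_rgb)     # (72, 105, 33)
```

Nothing in the bytes themselves says which reading is the "right" one; that is decided by whatever opens them.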

arzyfex's answer gives you the tools to view those files in different ways. Reading a file as binary works for any file on a computer, as does viewing it as octal, hex, or indeed ASCII; the output just might not make sense in each of those formats.

If you want to understand what an executable binary file does, you need to view it in a way that shows you the assembly language (as a start), which you can do using:

objdump -d /path/to/binary

objdump is a disassembler: it takes the binary content and converts it back into assembly language (a very low-level programming language). objdump is not always installed by default, so you may need to install it, depending on your Linux environment.

Some external reading.

  • https://en.wikipedia.org/wiki/Binary_number
  • https://en.wikipedia.org/wiki/Assembly_language

NB: as @Wildcard points out, it's important to note that the files don't contain the characters 1 and 0 (as you see them on the screen); they contain actual numeric data, individual bits of information which are either on (1) or off (0). Even that description is only an approximation of the truth. The key point is that if you do find a viewer which shows you the 1's and 0's, even that is still interpreting the data from the file and then showing you the ASCII characters for 0 and 1. The data is stored in a binary format (see the Binary number link above). Pierre-Olivier's community wiki entry covers this in more detail.
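The distinction between the character 1 and the bit 1 can be checked with a short Python sketch. The variable names are just for illustration; the point is that the character '1' is itself stored as a number (49 in ASCII), which is not the same as a byte holding the value 1.

```python
text = "1"
stored = text.encode("ascii")        # the byte that represents the character '1'
print(stored[0])                     # 49
print(format(stored[0], "08b"))      # 00110001

number_one = bytes([1])              # a byte whose numeric value is 1
print(format(number_one[0], "08b"))  # 00000001
```

Note that `format(..., "08b")` is itself a viewer in the sense described above: it turns the stored value into the printable characters 0 and 1.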


At a low level, a file is encoded as a sequence of 0's and 1's.

But even programmers rarely go there in practice.

First (and more important than this story of 0's and 1's), you have to understand that anything that the computer manipulates is encoded with numbers.

  • A character is coded with a number, using character set tables. For example, the letter 'A' has a value of 65 when coded using ASCII. See http://www.asciitable.com

  • A pixel is coded with one or more numbers (there are many graphics formats). For example, in the standard three-channel RGB format, a yellow pixel is encoded as 255 for red, 255 for green, and 0 for blue. See http://www.quackit.com/css/css_color_codes.cfm (choose a color and look at the R, G and B cells)

  • A binary executable file contains machine code, which assembly language represents in a human-readable form; each instruction is coded as numbers. For example, the assembly instruction MOVB $0x61,%al is coded by two numbers: 176, 97. See http://www.sparksandflames.com/files/x86InstructionChart.html (each instruction has an associated number from 00 to FF, because hexadecimal notation is used; see below)
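The three examples in this list can be reproduced in a few lines of Python. This is only a sketch using the values quoted above; the variable names are made up for the illustration.

```python
letter = ord("A")               # the number behind the character 'A' (65 in ASCII)
yellow = (255, 255, 0)          # red, green, blue
pixel_bytes = bytes(yellow)     # the same three numbers as raw bytes
mov_al_0x61 = bytes([176, 97])  # the two numbers quoted above for MOVB $0x61,%al

print(letter)             # 65
print(pixel_bytes.hex())  # ffff00
print(mov_al_0x61.hex())  # b061
```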

Secondly, each number can have multiple representations or notations.

Say I have 23 apples.

  • If I make groups of ten apples, I will get: 2 groups of ten and 3 lone apples. That's exactly what we mean when we write 23 : a 2 (tens), then a 3 (units).
  • But I can also make groups of 16 apples. Then I get one group of 16 and 7 lone apples. In hexadecimal notation (the name for radix 16), I write 17 (16 + 7). To distinguish it from decimal notation, hexadecimal is generally marked with a prefix or a suffix: 17h, #17 or $17. But how do we represent more than 9 groups of 16, or more than 9 lone apples? Simply, we use the letters A (10) to F (15). The number 31 (as in 31 apples) is written as #1F in hexadecimal.

  • Along the same lines, we can make groups of two apples (and groups of two groups of two apples, i.e. groups of 2x2 apples, and so on). Then 23 is 1 group of 2x2x2x2 apples, 0 groups of 2x2x2 apples, 1 group of 2x2 apples, 1 group of 2 apples, and 1 lone apple, which is written 10111 in binary.

(See https://en.wikipedia.org/wiki/Radix)
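The grouping of apples described above is repeated division by the radix, which can be sketched in Python as follows. The helper name `to_base` is invented for this example; Python's built-in `hex()` and `bin()` give the same digits with a `0x` or `0b` prefix.

```python
def to_base(n: int, base: int) -> str:
    """Write a non-negative integer in the given radix (2 to 16)."""
    digits = "0123456789ABCDEF"
    if n == 0:
        return "0"
    out = []
    while n > 0:
        # The remainder is the number of "lone" items; the quotient is the number of groups.
        n, remainder = divmod(n, base)
        out.append(digits[remainder])
    return "".join(reversed(out))


print(to_base(23, 10))  # 23
print(to_base(23, 16))  # 17
print(to_base(31, 16))  # 1F
print(to_base(23, 2))   # 10111
```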

Physically, mechanisms with two states (switches) are easy to build, on disk as well as in memory.

That's why data and programs, seen as numbers, are written and manipulated in their binary form.

They are then translated, depending on the data type, into their appropriate form (letter A, yellow pixel) or executed (MOV instruction).

hexdump lists the numbers coding the data (or the assembly program) in its hexadecimal form. You can then use a calculator to get the corresponding binary form.
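If no calculator is at hand, the conversion is easy to do one digit at a time, because each hexadecimal digit corresponds to exactly four bits. A minimal Python sketch follows; the function name `hex_to_binary` is made up here, and it assumes the input contains only hex digits with no spaces or prefixes.

```python
def hex_to_binary(hex_text: str) -> str:
    """Convert a string of hex digits to binary, four bits per hex digit."""
    return " ".join(f"{int(digit, 16):04b}" for digit in hex_text)


print(hex_to_binary("b061"))  # 1011 0000 0110 0001
```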