What does it mean for a program to be 32 or 64 bit?

Installers for software like Word or Excel offer a choice between a 32-bit and a 64-bit installation. What is the difference?

This depends on the CPU used:

On SPARC CPUs, the difference between "32-bit" and "64-bit" programs is exactly what you think:

64-bit programs use additional operations that are not supported by 32-bit SPARC CPUs. In addition, the Solaris or Linux operating system places the data accessed by 64-bit programs in memory areas that can only be reached with 64-bit instructions. This means that a 64-bit program not only may, but MUST, use instructions that 32-bit CPUs do not support.

For x86 CPUs this is different:

Modern x86 CPUs have different operating modes and they can execute different types of code. In the different modes, they can execute 16-, 32- or 64-bit code.

In 16-, 32- and 64-bit code, the CPU interprets the bytes differently:

The bytes (hexadecimal) b8 4e 61 bc 00 c3 would be interpreted as:

mov    eax,0xbc614e
ret

... in 32-bit code and as:

mov    ax,0x614e
mov    sp,0xc300

... in 16-bit code.

The bytes in the EXE file of the "64-bit installation" and of the "32-bit installation" must be interpreted differently by the CPU.

A 64-bit program, in turn, is compiled for 64-bit mode, where instructions can use 64-bit general-purpose registers that are not available in the other modes.

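The reverse restriction is weaker than you might expect: 16-bit code (see above) can access 32-bit registers by putting an operand-size prefix in front of an instruction, as long as the CPU is not a 16-bit CPU. So a "16-bit program" can use 32-bit registers on a 32- or 64-bit x86 CPU.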


Word size is a major difference, but it's not the only one. It tends to define the number of bits a CPU is "rated" for, but word size and overall capability are only loosely related. And overall capability is what matters.

On an Intel or AMD CPU, 32-bit vs. 64-bit software really refers to the mode in which the CPU operates when running it. 32-bit mode has fewer/smaller registers and instructions available, but the most important limitation is the amount of memory available. 32-bit software is generally limited to using between 2GB and just under 4GB of memory.

Each byte of memory has a unique address, which is not very different from each house having a unique postal address. A memory address is just a number that a program can use to find a piece of data again once it has saved it in memory, and each byte of memory has to have an address. If an address is 32 bits, then there are 2^32 possible addresses, and that means 2^32 addressable bytes of memory. On today's Intel/AMD CPUs, the size of a memory address is the same as the size of the registers (although this wasn't always true).

With 32-bit addresses, a program can address 4GB (2^32 bytes); however, up to half of that space is reserved by the OS. Program code, data, and often also the files being accessed must all fit into what remains. In today's PCs, with many gigabytes of RAM, this fails to take advantage of the available memory, and that is the main reason why 64-bit has become popular. 64-bit CPUs were available and widely used (typically in 32-bit mode) for several years before memory sizes larger than 2GB became common; only at that point did 64-bit mode start to offer real-world advantages. A 64-bit address space provides 16 exabytes of addressable memory (2^64, or about 18 quintillion, bytes), which is more than any current software can use, and certainly no PC has anywhere near that much RAM.

The majority of data used in typical applications, even in 64-bit mode, does not need to be 64-bit, and so most of it is still stored in 32-bit (or even smaller) formats. The common ASCII and UTF-8 representations of text use 8-bit data formats. If the program needs to move a large block of text from one place to another in memory, it may try to do it 64 bits at a time, but if it needs to interpret the text, it will probably do it 8 bits at a time. Similarly, 32 bits is a common size for integers (a signed range of -2^31 to 2^31 - 1, or approximately +/- 2.1 billion), and 2.1 billion is enough range for many uses. Graphics data is naturally represented pixel by pixel, and each pixel usually contains at most 32 bits of data.

There are disadvantages to using 64-bit data needlessly. 64-bit data takes up more space in memory, and more space in the CPU cache (very fast memory used by the CPU for short-term storage). Memory can only transfer data at a maximum rate, and 64-bit data is twice as big. This can reduce performance if used wastefully. And if it's necessary to support both 32-bit and 64-bit versions of software, using 32-bit values where possible can reduce the differences between the two versions and make development easier (doesn't always work out that way, though).

Prior to 32-bit, the address and word size were usually different (e.g. 16-bit 8086/88 with 20-bit memory addresses but 16-bit registers, or 8-bit 6502 with 16-bit memory addresses, or even early 32-bit ARM with 26-bit addresses). While no programmer ever turned up their nose at better registers, memory space was usually the real driving force for each advancing generation of technology. This is because most programmers rarely work directly with registers, but do work directly with memory, and memory limitations directly cause unpleasantness for the programmer, and in the 32-bit to 64-bit case, for the user as well.

To sum up, while there are real and important technological differences between the various bit sizes, what 32-bit or 64-bit (or 16-bit or 8-bit) really means is simply a collection of capabilities that tend to be associated with CPUs of a particular technological generation, and/or software that takes advantage of those capabilities. Word length is a part of that, but not the only, or necessarily the most important part.

Source: I have been a programmer through all these technological eras.


A program runs on top of a given architecture (arch, or ISA), which is implemented by processors. Typically, an architecture defines a "main" word size, which is the size of most of its registers and of the operations that run on them (although you can design architectures that work differently). This is usually called the "native" word size, although an architecture may also allow operations on registers of other sizes.

Further, processors use memory and need to address that memory somehow, which means operating on those addresses. Therefore, addresses can typically be stored and manipulated like any other number, which means there are registers capable of holding them. It is not required that those registers match the word size, nor that an address be computed from a single register, but in some architectures both are the case.

Throughout history, there have been many architectures of different word sizes, even weird ones. Nowadays, you can easily find processors around you that are not just 32-bit and 64-bit, but also e.g. 8-bit and 16-bit (typically in embedded devices). In the typical desktop computer, you are using x86 or x64, which are 32-bit and 64-bit respectively.

Therefore, when you say that a program is 32-bit or 64-bit, you are referring to a particular architecture. In the popular desktop scenario, you are referring to x86 vs. x64. There are many questions, articles and books discussing the differences between the two.

Now, a final note: for compatibility reasons, x64 processors can operate in different modes, one of which is capable of running the 32-bit code from x86. This means that if your computer is x64 (likely) and if your operating system has support for it (also likely, e.g. Windows 64-bit), it can still run programs compiled for x86.


The answer you reference describes benefits of 64-bit over 32-bit. As far as what's actually different about the program itself, it depends on your perspective.

Generally speaking, the program source code does not have to be different at all. Most programs can be written so that they compile perfectly well as either 32-bit or 64-bit programs, as controlled by appropriate choice of compiler and / or compiler options. There is often some impact on the source, however, in that a (C) compiler targeting 64-bit may choose to define its types differently. In particular, long int is ubiquitously 32 bits wide on 32-bit platforms, but it is 64 bits wide on many (but not all) 64-bit platforms. This can be a source of bugs in code that makes unwarranted assumptions about such details.

The main differences are all in the binary. 64-bit programs make use of the full instruction sets of their 64-bit target CPUs, which invariably contain instructions that 32-bit counterpart CPUs do not contain. They will use registers that 32-bit counterpart CPUs do not have. They will use function-call conventions appropriate for their target CPU, which often means passing more arguments in registers than 32-bit programs do. Use of these and other facilities of 64-bit CPUs affords functional advantages such as the ability to use more memory and (sometimes) improved performance.

Tags: C, 32Bit 64Bit