How are integers internally represented at a bit level in Java?

A Java int is 32 bits wide and always signed, so the most significant bit (MSB) acts as the sign bit. The value of an int is simply the weighted sum of its bits, with the weights assigned as follows:

Bit#    Weight
31      -2^31
30       2^30
29       2^29
...      ...
2        2^2
1        2^1
0        2^0

Note that the weight of the MSB is negative, and its magnitude (2^31) is larger than all the other weights combined (2^31 - 1), so whenever this bit is on, the whole number (the weighted sum) becomes negative.

Let's simulate it with 4-bit numbers:

Binary    Weighted sum            Integer value
0000       0 + 0 + 0 + 0           0
0001       0 + 0 + 0 + 2^0         1
0010       0 + 0 + 2^1 + 0         2
0011       0 + 0 + 2^1 + 2^0       3
0100       0 + 2^2 + 0 + 0         4
0101       0 + 2^2 + 0 + 2^0       5
0110       0 + 2^2 + 2^1 + 0       6
0111       0 + 2^2 + 2^1 + 2^0     7 -> the most positive value
1000      -2^3 + 0 + 0 + 0        -8 -> the most negative value
1001      -2^3 + 0 + 0 + 2^0      -7
1010      -2^3 + 0 + 2^1 + 0      -6
1011      -2^3 + 0 + 2^1 + 2^0    -5
1100      -2^3 + 2^2 + 0 + 0      -4
1101      -2^3 + 2^2 + 0 + 2^0    -3
1110      -2^3 + 2^2 + 2^1 + 0    -2
1111      -2^3 + 2^2 + 2^1 + 2^0  -1

So two's complement is not a special, separate scheme for negative integers. The binary representation of an integer is always read the same way; we just negate the weight of the most significant bit, and that bit determines the sign of the integer.
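To make the weighted-sum view concrete, here is a minimal sketch that recomputes an int from its bits using the weights in the table above. The class and method names (WeightedSum, weightedSum) are made up for illustration.

```java
class WeightedSum {
    // Sums the bit weights of a 32-bit pattern:
    // bit 31 weighs -2^31, every other bit k weighs +2^k.
    static long weightedSum(int bits) {
        long sum = 0;
        for (int k = 0; k < 32; k++) {
            if (((bits >>> k) & 1) == 1) {
                long weight = 1L << k;               // 2^k, computed as a long
                sum += (k == 31) ? -weight : weight; // only the MSB weight is negative
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 10, -10, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int x : samples) {
            // The two columns should always agree.
            System.out.println(x + " -> " + weightedSum(x));
        }
    }
}
```

For every sample the weighted sum should match the int itself, including Integer.MIN_VALUE, where only the MSB is set.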

In C, there is a keyword unsigned (not available in Java), which can be used to declare unsigned int x;. In an unsigned integer, the weight of the MSB is positive (2^31) rather than negative. Assuming a 32-bit int, the range of an unsigned int is then 0 to 2^32 - 1, while an int has range -2^31 to 2^31 - 1.
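Although Java has no unsigned keyword, the same 32 bits can still be read with a positive MSB weight. The sketch below assumes Java 8 or later, where Integer.toUnsignedLong and Integer.toUnsignedString exist; the helper asUnsigned is a hypothetical name that does the same thing by masking.

```java
class UnsignedView {
    // Reinterprets the 32 bits of an int with the MSB weighted +2^31
    // by widening to long and keeping only the low 32 bits.
    static long asUnsigned(int bits) {
        return bits & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        System.out.println(asUnsigned(-1));               // 4294967295, i.e. 2^32 - 1
        System.out.println(Integer.toUnsignedLong(-1));   // same value via the library
        System.out.println(Integer.toUnsignedString(-10)); // unsigned reading of -10's bits
    }
}
```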

From another point of view, if you consider the two's complement of x as ~x + 1 (NOT x plus one), here's the explanation:

For any x, ~x is just the bitwise inverse of x, so wherever x has a 1-bit, ~x will have a 0-bit there (and vice versa). So, if you add these up, there will be no carry in the addition and the sum will be just an integer every bit of which is 1.

For 32-bit integers:

x + ~x     =   1111 1111 1111 1111 1111 1111 1111 1111
x + ~x + 1 =   1111 1111 1111 1111 1111 1111 1111 1111 + 1
           = 1 0000 0000 0000 0000 0000 0000 0000 0000

The leftmost 1-bit is simply discarded because it doesn't fit in 32 bits (integer overflow). So,

x + ~x + 1 = 0
-x = ~x + 1

So you can see that the negative x can be represented by ~x + 1, which we call the two's complement of x.
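The identity is easy to check in Java, since ~ is the bitwise NOT operator and int arithmetic wraps around on overflow. This is just a quick sanity check, and the names NegateIdentity and twosComplement are illustrative only.

```java
class NegateIdentity {
    // Two's complement of x: invert every bit, then add one.
    static int twosComplement(int x) {
        return ~x + 1;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 10, -10, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int x : samples) {
            // x + ~x is always the all-ones pattern, which is -1 as an int.
            System.out.println(x + ": x + ~x = " + (x + ~x)
                    + ", ~x + 1 = " + twosComplement(x)
                    + ", -x = " + (-x));
        }
    }
}
```

Note that Integer.MIN_VALUE is its own negation here: both -x and ~x + 1 overflow back to Integer.MIN_VALUE, so the identity still holds bit for bit.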


I ran the following program to find out:

public class Negative {
    public static void main(String[] args) {
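        int i = 10;
        int j = -10;

        // toBinaryString omits leading zeros, so the positive value prints as 4 digits
        System.out.println(Integer.toBinaryString(i));
        // the negative value prints all 32 bits of its bit pattern
        System.out.println(Integer.toBinaryString(j));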
    }
}

Output is

1010
11111111111111111111111111110110

The output shows that Java stores negative ints in two's complement.
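Because Integer.toBinaryString drops leading zeros, the two outputs are hard to compare side by side. A small helper that pads to 32 digits makes the relationship easier to see; toBinary32 is a hypothetical name, not a library method.

```java
class BinaryPad {
    // Formats an int as exactly 32 binary digits, keeping leading zeros.
    static String toBinary32(int x) {
        String digits = Integer.toBinaryString(x);
        // Right-align in a 32-character field, then turn the padding into zeros.
        return String.format("%32s", digits).replace(' ', '0');
    }

    public static void main(String[] args) {
        System.out.println(toBinary32(10));
        System.out.println(toBinary32(~10));     // every bit inverted
        System.out.println(toBinary32(~10 + 1)); // same pattern as -10
        System.out.println(toBinary32(-10));
    }
}
```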


Let's start by summarizing Java primitive data types:

byte: The byte data type is an 8-bit signed two's complement integer.

short: The short data type is a 16-bit signed two's complement integer.

int: The int data type is a 32-bit signed two's complement integer.

long: The long data type is a 64-bit signed two's complement integer.

float: The float data type is a single-precision 32-bit IEEE 754 floating point.

double: The double data type is a double-precision 64-bit IEEE 754 floating point.

boolean: The boolean data type has only two possible values, true and false. It represents one bit of information, but its size is not precisely defined.

char: The char data type is a single 16-bit Unicode character (an unsigned UTF-16 code unit).

Source
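The widths above are exposed as SIZE constants on the wrapper classes, and the range of each signed type follows from its width: -2^(n-1) to 2^(n-1) - 1. Here is a rough sketch that checks this; minValue and maxValue are helper names made up for the example.

```java
class PrimitiveWidths {
    // Smallest value of an n-bit signed two's complement integer: -2^(n-1).
    // For n = 64 the intermediate result overflows and wraps around to the right answer.
    static long minValue(int bits) {
        return -(1L << (bits - 1));
    }

    // Largest value of an n-bit signed two's complement integer: 2^(n-1) - 1.
    static long maxValue(int bits) {
        return (1L << (bits - 1)) - 1;
    }

    public static void main(String[] args) {
        int[] widths = {Byte.SIZE, Short.SIZE, Integer.SIZE, Long.SIZE};
        for (int n : widths) {
            System.out.println(n + " bits: " + minValue(n) + " to " + maxValue(n));
        }
        // char is 16 bits wide as well, but unsigned.
        System.out.println("char: " + Character.SIZE + " bits, max " + (int) Character.MAX_VALUE);
    }
}
```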

Two's complement

A good example comes from Wikipedia: for 8-bit numbers, the relationship to two's complement is realized by noting that 256 = 255 + 1, and (255 − x) is the ones' complement of x, so 256 − x = (255 − x) + 1 is the two's complement of x.

0000 0111 = 7, and its two's complement is 1111 1001 = -7

The way it works is that the MSB (the most significant bit) receives a negative weight, so for the same value written with 4 bits:

-7 = 1001 = -8 + 0 + 0 + 1
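The 8-bit arithmetic can be reproduced with plain int math and a cast to byte. This is only a sketch; onesComplement8 and twosComplement8 are illustrative names and assume an input between 0 and 255.

```java
class EightBitComplement {
    // Ones' complement in 8 bits: 255 - x flips every bit of x (for 0 <= x <= 255).
    static int onesComplement8(int x) {
        return 255 - x;
    }

    // Two's complement in 8 bits: 256 - x, kept to the low 8 bits.
    static int twosComplement8(int x) {
        return (256 - x) & 0xFF;
    }

    public static void main(String[] args) {
        int x = 7;
        System.out.println(Integer.toBinaryString(onesComplement8(x))); // 11111000
        System.out.println(Integer.toBinaryString(twosComplement8(x))); // 11111001
        // Reading the same 8 bits as a signed byte gives the negative value.
        System.out.println((byte) twosComplement8(x));                  // -7
    }
}
```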

Positive integers are generally stored as simple binary numbers (1 is 1, 10 is 2, 11 is 3, and so on).

Negative integers are stored as the two's complement of their absolute value. In this notation, taking the two's complement of a positive number yields the corresponding negative number.

Source

Since I received a few points for this answer, I decided to add more information to it.

A more detailed answer:

Among others, there are four main approaches to representing positive and negative numbers in binary:

  1. Signed Magnitude
  2. One's Complement
  3. Two's Complement
  4. Bias

1. Signed Magnitude

This representation uses the most significant bit for the sign (0 for positive, 1 for negative) and the remaining bits for the absolute value. For example:

1011 = -3
0011 = +3

This representation is the simplest to read. However, ordinary binary addition does not give the right result when the operands have different signs, which makes arithmetic harder to implement at the hardware level. Moreover, this approach uses two bit patterns to represent zero: -0 (1000) and +0 (0000).
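As a rough illustration, a 4-bit signed-magnitude pattern can be decoded like this. Java itself does not use this format for integers, and the class and method names are invented for the example.

```java
class SignMagnitude4 {
    // Decodes the low 4 bits as signed magnitude: bit 3 is the sign, bits 2..0 the magnitude.
    static int decode(int bits) {
        int magnitude = bits & 0b0111;
        boolean negative = (bits & 0b1000) != 0;
        return negative ? -magnitude : magnitude;
    }

    public static void main(String[] args) {
        System.out.println(decode(0b1011)); // -3
        System.out.println(decode(0b0011)); // 3
        // Two different patterns decode to zero.
        System.out.println(decode(0b1000) + " and " + decode(0b0000));
        // Plain binary addition breaks: 0011 (+3) + 1011 (-3) = 1110, which decodes to -6.
        System.out.println(decode((0b0011 + 0b1011) & 0xF));
    }
}
```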

2. One's Complement

In this representation, we invert all the bits of a given number to find its complement, that is, its negative. For example:

010 = 2, so -2 = 101 (inverting all bits).

The problem with this representation is that there are still two bit patterns for zero: negative 0 (1111) and positive 0 (0000).
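A 4-bit version of this scheme can be sketched as follows, again purely as an illustration with made-up names. It shows both the negation by inversion and the duplicate zero.

```java
class OnesComplement4 {
    // Negates a 4-bit ones' complement pattern by inverting all four bits.
    static int negate(int bits) {
        return ~bits & 0xF;
    }

    // Decodes the low 4 bits as ones' complement.
    static int decode(int bits) {
        int pattern = bits & 0xF;
        boolean negative = (pattern & 0b1000) != 0;
        return negative ? -negate(pattern) : pattern;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toBinaryString(negate(0b0010))); // 1101
        System.out.println(decode(0b1101));                         // -2
        // Both 0000 and 1111 decode to zero.
        System.out.println(decode(0b0000) + " and " + decode(0b1111));
    }
}
```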

3. Two's Complement

To find the negative of a number in this representation, we invert all the bits and then add one. Adding one solves the problem of having two bit patterns for 0: in this representation there is only one pattern for 0 (0000).

For example, we want to find the binary negative representation of 4 (decimal) using 4 bits. First, we convert 4 to binary:

4 = 0100

then we invert all the bits

0100 -> 1011

finally, we add one

1011 + 1 = 1100.

So 1100 is equivalent to -4 in decimal if we are using a Two's Complement binary representation with 4 bits.

A faster way to find the complement is to scan from the right, keep everything up to and including the first bit that has the value 1, and invert all the bits to its left. In the above example it would be something like:

0100 -> 1100
^^ 
||-(fixing this value)
|--(inverting this one)
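Both methods can be compared on every 4-bit pattern. The sketch below is one possible way to write them; negateInvertAddOne and negateByScan are hypothetical names, and the scan runs from the least significant bit upwards.

```java
class TwosComplement4 {
    // Textbook method: invert all bits, add one, keep the low 4 bits.
    static int negateInvertAddOne(int bits) {
        return (~bits + 1) & 0xF;
    }

    // Shortcut: copy bits from the right up to and including the first 1,
    // then invert every bit to the left of it.
    static int negateByScan(int bits) {
        int result = 0;
        boolean seenOne = false;
        for (int k = 0; k < 4; k++) {
            int bit = (bits >> k) & 1;
            int out = seenOne ? bit ^ 1 : bit; // invert only after the first 1 was passed
            if (bit == 1) {
                seenOne = true;
            }
            result |= out << k;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toBinaryString(negateInvertAddOne(0b0100))); // 1100
        System.out.println(Integer.toBinaryString(negateByScan(0b0100)));       // 1100
    }
}
```

The two methods should agree on all 16 patterns, including 0000 (which stays 0000) and 1000 (which maps to itself because +8 does not fit in 4 bits).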

Besides having only one representation for 0, two's complement lets us add two binary values with ordinary binary addition, even when the numbers have different signs. Nevertheless, it is still necessary to check for overflow.
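Here is a small sketch of 4-bit addition with mixed signs and of an overflow check. The helper names are invented for the example; Math.addExact is the standard library method that throws on int overflow instead of wrapping silently.

```java
class FourBitAdd {
    // Reads the low 4 bits as a two's complement value by sign-extending bit 3.
    static int decode4(int bits) {
        return (bits << 28) >> 28;
    }

    // Plain binary addition, keeping only 4 bits (the carry out is discarded).
    static int add4(int a, int b) {
        return (a + b) & 0xF;
    }

    // True when the mathematical sum does not fit in the 4-bit range -8..7.
    static boolean overflows4(int a, int b) {
        int sum = decode4(a) + decode4(b);
        return sum < -8 || sum > 7;
    }

    public static void main(String[] args) {
        // 0011 (+3) + 1100 (-4) = 1111 (-1): no special handling for the signs.
        System.out.println(decode4(add4(0b0011, 0b1100)));
        // 0111 (+7) + 0001 (+1) wraps around to 1000 (-8).
        System.out.println(decode4(add4(0b0111, 0b0001)) + ", overflow: " + overflows4(0b0111, 0b0001));
        try {
            Math.addExact(Integer.MAX_VALUE, 1);
        } catch (ArithmeticException e) {
            System.out.println("int overflow detected");
        }
    }
}
```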

4. Bias

This representation is used for the exponent in the IEEE 754 floating-point standard. It has the advantage that the bit pattern with all bits set to 0 represents the smallest value, and the pattern with all bits set to 1 represents the largest. As the name indicates, a value (positive or negative) is encoded in n bits by adding a bias to it (normally 2^(n-1) or 2^(n-1)-1).

So if we are using 8 bits and a bias of 2^(n-1), the decimal value 1 is stored as:

+1 + bias = +1 + 2^(8-1) = 1 + 128 = 129
converting to binary
1000 0001
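The same calculation in Java, as a minimal sketch with made-up helper names. The last method looks at a real float: single-precision IEEE 754 uses an 8-bit exponent field with the other common bias, 2^(8-1) - 1 = 127, which can be read with Float.floatToIntBits.

```java
class BiasedExponent {
    // Stores a signed value by adding the bias.
    static int encode(int value, int bias) {
        return value + bias;
    }

    // Recovers the signed value by subtracting the bias.
    static int decode(int stored, int bias) {
        return stored - bias;
    }

    // Extracts the raw 8-bit exponent field (bits 30..23) of a float.
    static int floatExponentField(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    public static void main(String[] args) {
        int bias = 1 << (8 - 1); // 2^(n-1) = 128 for n = 8
        System.out.println(Integer.toBinaryString(encode(1, bias))); // 10000001
        System.out.println(decode(0b10000001, bias));                // 1
        // 2.0f = 1.0 * 2^1, so the stored exponent is 1 + 127 = 128.
        System.out.println(floatExponentField(2.0f));
    }
}
```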