Why does the C++ standard specify that signed integers be cast to unsigned in binary operations with mixed signedness?

Casting from unsigned to signed results in implementation-defined behaviour if the value cannot be represented. Casting from signed to unsigned is always modulo two to the power of the unsigned type's bit width, so it is always well-defined.
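A minimal sketch of both directions, assuming a typical implementation with 32-bit int (the unsigned-to-signed result is illustrative only; it is implementation-defined before C++20):

```cpp
#include <cstdio>
#include <limits>

int main() {
    // Signed to unsigned: always well-defined, reduced modulo 2^N.
    int neg = -1;
    unsigned int u = neg;   // u == UINT_MAX on every conforming implementation
    std::printf("%u\n", u);

    // Unsigned to signed: implementation-defined when the value does not fit
    // (C++20 finally mandates the modulo-2^N result in this direction too).
    unsigned int big = std::numeric_limits<unsigned int>::max();
    int s = static_cast<int>(big);   // typically -1 on two's-complement machines
    std::printf("%d\n", s);
}
```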

The usual arithmetic conversions choose the signed type if every possible value of the unsigned type is representable in it. Otherwise, the unsigned type is chosen. This guarantees that the conversion is always well-defined.
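A short sketch of that rule in action (the last assertion is commented out because it holds only on platforms, such as LP64, where long is wider than unsigned int):

```cpp
#include <iostream>
#include <type_traits>

int main() {
    // int and unsigned int have the same width, so unsigned int can never be
    // represented in int: both operands convert to unsigned int.
    static_assert(std::is_same_v<decltype(1 + 1u), unsigned int>);

    // Consequence: -1 converts to UINT_MAX, so this prints false.
    std::cout << std::boolalpha << (-1 < 1u) << '\n';

    // Where long is wider than unsigned int, every unsigned int value is
    // representable in long, so the signed type wins:
    // static_assert(std::is_same_v<decltype(1L + 1u), long>);  // platform-dependent
}
```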


Notes

  1. As indicated in the comments, the conversion algorithm for C++ was inherited from C to maintain compatibility, which is, technically, the reason it is so in C++.

  2. It has been suggested that the decision in the standard to define signed-to-unsigned conversions and not unsigned-to-signed conversions is somehow arbitrary, and that the other possible decision would be symmetric. However, the possible conversions are not symmetric.

    In both of the non-2's-complement representations contemplated by the standard, an n-bit signed representation can represent only 2^n − 1 values, whereas an n-bit unsigned representation can represent 2^n values. Consequently, a signed-to-unsigned conversion is lossless and can be reversed (although one unsigned value can never be produced). The unsigned-to-signed conversion, on the other hand, must collapse two different unsigned values onto the same signed result.

    In a comment, the formula sint = uint > sint_max ? uint - uint_max : uint is proposed. This coalesces the values uint_max and 0; both are mapped to 0. That's a little weird even for non-2's-complement representations, but for 2's-complement it's unnecessary and, worse, it requires the compiler to emit code to laboriously compute this unnecessary conflation. By contrast, the standard's signed-to-unsigned conversion is lossless, and in the common case (2's-complement architectures) it is a no-op, as the sketch after these notes illustrates.
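A sketch of that asymmetry, using a hypothetical helper proposed_uint_to_sint (not from the standard) and fixed 32-bit widths for portability:

```cpp
#include <cassert>
#include <cstdint>

// The comment's proposed conversion, computed in a 64-bit intermediate so the
// subtraction cannot wrap. sint_max = 2^31 - 1, uint_max = 2^32 - 1.
std::int32_t proposed_uint_to_sint(std::uint32_t u) {
    const std::int64_t sint_max = 0x7fffffffLL;
    const std::int64_t uint_max = 0xffffffffLL;
    const std::int64_t v = u;
    return static_cast<std::int32_t>(v > sint_max ? v - uint_max : v);
}

int main() {
    // Two distinct unsigned inputs collapse onto the same signed output,
    // and INT32_MIN is never produced: the mapping is lossy.
    assert(proposed_uint_to_sint(0u) == 0);
    assert(proposed_uint_to_sint(0xffffffffu) == 0);

    // The standard's signed-to-unsigned direction is lossless: distinct
    // signed inputs always yield distinct unsigned outputs (mod 2^32),
    // and on two's-complement hardware the conversion is a no-op.
    assert(static_cast<std::uint32_t>(-1) == 0xffffffffu);
    assert(static_cast<std::uint32_t>(INT32_MIN) == 0x80000000u);
}
```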


If conversion to the signed type had been chosen, then a simple a+1 would always result in a signed type (unless the constant were typed as 1U).

Assume a is an unsigned int; then this seemingly innocent increment a+1 could lead to undefined signed overflow, or to an "index out of bounds" in the case of arr[a+1].

Thus, "unsigned casting" seems like a safer approach because people probably don't even expect casting to be happening in the first place, when simply adding a constant.

Tags: C++, C, Casting