Why do unsigned "small" integers promote to signed int?

This is addressed in the ANSI C Rationale, section 3.2.1.1. It was, to some extent, an arbitrary choice that could have gone either way, but there are reasons for the choice that was made.

Since the publication of K&R, a serious divergence has occurred among implementations of C in the evolution of integral promotion rules. Implementations fall into two major camps, which may be characterized as unsigned preserving and value preserving. The difference between these approaches centers on the treatment of unsigned char and unsigned short, when widened by the integral promotions, but the decision has an impact on the typing of constants as well (see §3.1.3.2).

The unsigned preserving approach calls for promoting the two smaller unsigned types to unsigned int. This is a simple rule, and yields a type which is independent of execution environment.

The value preserving approach calls for promoting those types to signed int, if that type can properly represent all the values of the original type, and otherwise for promoting those types to unsigned int. Thus, if the execution environment represents short as something smaller than int, unsigned short becomes int; otherwise it becomes unsigned int.

[SNIP]

The unsigned preserving rules greatly increase the number of situations where unsigned int confronts signed int to yield a questionably signed result, whereas the value preserving rules minimize such confrontations. Thus, the value preserving rules were considered to be safer for the novice, or unwary, programmer. After much discussion, the Committee decided in favor of value preserving rules, despite the fact that the UNIX C compilers had evolved in the direction of unsigned preserving.

(I recommend reading the full section. I just didn't want to quote the whole thing here.)
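
To see why the Committee judged the value preserving rules safer for the unwary, here is a minimal sketch. It assumes a hosted implementation where int is wider than short (so unsigned short promotes to signed int), which is the common case the Rationale describes:

    #include <stdio.h>

    int main(void)
    {
        unsigned short us = 1;
        int i = -1;

        /* Value preserving (ISO C): us promotes to int, the comparison is
           performed in signed arithmetic, and 1 > -1 is true.  Under the
           older unsigned preserving rules, us would have promoted to
           unsigned int, i would then have been converted to UINT_MAX, and
           the test would be false. */
        if (us > i)
            printf("1 > -1: signed comparison, as the novice expects\n");
        else
            printf("surprise: the comparison was done in unsigned arithmetic\n");

        return 0;
    }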


An interesting portion of the Rationale, snipped ([SNIP] above) from the quotation in Keith Thompson's answer:

Both schemes give the same answer in the vast majority of cases, and both give the same effective result in even more cases in implementations with two's-complement arithmetic and quiet wraparound on signed overflow --- that is, in most current implementations. In such implementations, differences between the two only appear when these two conditions are both true:

  1. An expression involving an unsigned char or unsigned short produces an int-wide result in which the sign bit is set: i.e., either a unary operation on such a type, or a binary operation in which the other operand is an int or "narrower" type.

  2. The result of the preceding expression is used in a context in which its signedness is significant:

    • sizeof(int) < sizeof(long) and it is in a context where it must be widened to a long type, or
    • it is the left operand of the right-shift operator (in an implementation where this shift is defined as arithmetic), or
    • it is either operand of /, %, <, <=, >, or >=.
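
A small sketch can make those two conditions concrete. The sizes assumed here (16-bit unsigned short, 32-bit int) are illustrative assumptions of mine, not something the Rationale mandates; no signed overflow is involved, so the difference comes purely from the promotion rules:

    #include <stdio.h>

    int main(void)
    {
        unsigned short a = 0, b = 1;

        /* Condition 1: the subtraction involves only unsigned shorts, yet
           its int-wide result, -1 under the value preserving rules, has the
           sign bit set.
           Condition 2: that result is then an operand of '<' and of '/',
           contexts where its signedness matters. */
        printf("%d\n", (a - b) < 0);   /* prints 1 under ISO C */
        printf("%d\n", (a - b) / 2);   /* prints 0 under ISO C */

        /* Under the unsigned preserving rules, a - b would have been an
           unsigned int equal to UINT_MAX, so the comparison would yield 0
           and the division a large positive value. */
        return 0;
    }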

Note that the Standard imposes no requirements on how an implementation must handle any situation in which quiet-wraparound behavior would be relevant. The clear implication is that the authors of the Standard expected commonplace implementations for two's-complement platforms to behave as described above with or without a mandate, absent a compelling reason to do otherwise, and thus saw no need to mandate that they do so. It seems unlikely that they considered the possibility that a 32-bit implementation, given something like:

    unsigned mul(unsigned short x, unsigned short y) { return x*y; }

might aggressively exploit the fact that it was not required to accommodate values of x greater than 2147483647/y. Nevertheless, some compilers for modern platforms treat that lack of a requirement as an invitation to generate code that will malfunction in those cases.
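
The usual defence is to force the multiplication to be carried out in unsigned int, so that the promoted operands can never overflow a signed int. A minimal sketch of that rewrite (the name mul_safe is mine, not part of the original example):

    /* Cast one operand to unsigned int; the other is then converted to
       unsigned int as well, so the multiplication is performed in unsigned
       arithmetic, which is defined to wrap and can never overflow. */
    unsigned mul_safe(unsigned short x, unsigned short y)
    {
        return (unsigned)x * y;
    }

Assuming 16-bit unsigned short and 32-bit int, mul(0xFFFF, 0xFFFF) multiplies two promoted ints whose mathematical product, 4294836225, exceeds INT_MAX, so the original version has undefined behavior there; mul_safe performs the same multiplication in unsigned arithmetic and returns 4294836225.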

Tags: C, Standards