Is ((a + (b & 255)) & 255) the same as ((a + b) & 255)?

Lemma: a & 255 == a % 256 for unsigned a.

Unsigned a can be rewritten as m * 0x100 + b some unsigned m,b, 0 <= b < 0xff, 0 <= m <= 0xffffff. It follows from both definitions that a & 255 == b == a % 256.

Additionally, we need:

  • the distributive property: (a + b) mod n = [(a mod n) + (b mod n)] mod n
  • the definition of unsigned addition, mathematically: (a + b) ==> (a + b) % (2 ^ 32)

Thus:

(a + (b & 255)) & 255 = ((a + (b & 255)) % (2^32)) & 255      // def'n of addition
                      = ((a + (b % 256)) % (2^32)) % 256      // lemma
                      = (a + (b % 256)) % 256                 // because 256 divides (2^32)
                      = ((a % 256) + (b % 256 % 256)) % 256   // Distributive
                      = ((a % 256) + (b % 256)) % 256         // a mod n mod n = a mod n
                      = (a + b) % 256                         // Distributive again
                      = (a + b) & 255                         // lemma

So yes, it is true. For 32-bit unsigned integers.


What about other integer types?

  • For 64-bit unsigned integers, all of the above applies just as well, just substituting 2^64 for 2^32.
  • For 8- and 16-bit unsigned integers, addition involves promotion to int. This int will definitely neither overflow or be negative in any of these operations, so they all remain valid.
  • For signed integers, if either a+b or a+(b&255) overflow, it's undefined behavior. So the equality can't hold — there are cases where (a+b)&255 is undefined behavior but (a+(b&255))&255 isn't.

They are the same. Here's a proof:

First note the identity (A + B) mod C = (A mod C + B mod C) mod C

Let's restate the problem by regarding a & 255 as standing in for a % 256. This is true since a is unsigned.

So (a + (b & 255)) & 255 is (a + (b % 256)) % 256

This is the same as (a % 256 + b % 256 % 256) % 256 (I've applied the identity stated above: note that mod and % are equivalent for unsigned types.)

This simplifies to (a % 256 + b % 256) % 256 which becomes (a + b) % 256 (reapplying the identity). You can then put the bitwise operator back to give

(a + b) & 255

completing the proof.


Yes, (a + b) & 255 is fine.

Remember addition in school? You add numbers digit by digit, and add a carry value to the next column of digits. There is no way for a later (more significant) column of digits to influence an already processed column. Because of this, it does not make a difference if you zero-out the digits only in the result, or also first in an argument.


The above is not always true, the C++ standard allows an implementation that would break this.

Such a Deathstation 9000 :-) would have to use a 33-bit int, if the OP meant unsigned short with "32-bit unsigned integers". If unsigned int was meant, the DS9K would have to use a 32-bit int, and a 32-bit unsigned int with a padding bit. (The unsigned integers are required to have the same size as their signed counterparts as per §3.9.1/3, and padding bits are allowed in §3.9.1/1.) Other combinations of sizes and padding bits would work too.

As far as I can tell, this is the only way to break it, because:

  • The integer representation must use a "purely binary" encoding scheme (§3.9.1/7 and the footnote), all bits except padding bits and the sign bit must contribute a value of 2n
  • int promotion is allowed only if int can represent all the values of the source type (§4.5/1), so int must have at least 32 bits contributing to the value, plus a sign bit.
  • the int can not have more value bits (not counting the sign bit) than 32, because else an addition can not overflow.

In positional addition, subtraction and multiplication of unsigned numbers to produce unsigned results, more significant digits of the input don't affect less-significant digits of the result. This applies to binary arithmetic as much as it does to decimal arithmetic. It also applies to "twos complement" signed arithmetic, but not to sign-magnitude signed arithmetic.

However we have to be careful when taking rules from binary arithmetic and applying them to C (I beleive C++ has the same rules as C on this stuff but i'm not 100% sure) because C arithmetic has some arcane rules that can trip us up. Unsigned arithmetic in C follows simple binary wraparound rules but signed arithmetic overflow is undefined behaviour. Worse under some circumstances C will automatically "promote" an unsigned type to (signed) int.

Undefined behaviour in C can be especially insiduous. A dumb compiler (or a compiler on a low optimisation level) is likely to do what you expect based on your understanding of binary arithmetic while an optimising compiler may break your code in strange ways.


So getting back to the formula in the question the equivilence depends on the operand types.

If they are unsigned integers whose size is greater than or equal to the size of int then the overflow behaviour of the addition operator is well-defined as simple binary wraparound. Whether or not we mask off the high 24 bits of one operand before the addition operation has no impact on the low bits of the result.

If they are unsigned integers whose size is less than int then they will be promoted to (signed) int. Overflow of signed integers is undefined behaviour but at least on every platform I have encountered the difference in size between different integer types is large enough that a single addition of two promoted values will not cause overflow. So again we can fall back to the simply binary arithmetic argument to deem the statements equivalent.

If they are signed integers whose size is less than int then again overflow can't happen and on twos-complement implementations we can rely on the standard binary arithmetic argument to say they are equivilent. On sign-magnitude or ones complement implementations they would not be equivilent.

OTOH if a and b were signed integers whose size was greater than or equal to the size of int then even on twos complement implementations there are cases where one statement would be well-defined while the other would be undefined behaviour.

Tags:

C++

Binary

Logic