Can floating point equality and inequality tests be assumed to be consistent and repeatable?

Provided the x and y in the question are identifiers (rather than abbreviations for expressions generally, such as x standing for b + sqrt(c)), the C++ standard requires (x >= y) == (x > y || x == y) to be true.

C++ 2017 (draft N4659) clause 8, paragraph 13, allows floating-point expressions to be evaluated with greater precision and range than their nominal types require. For example, while evaluating an operator with float operands, the implementation may use double arithmetic. However, footnote 64 there refers us to 8.4, 8.2.9, and 8.18 to understand that the cast and assignment operators must perform their specific conversions, which produce a value representable in the nominal type.

Thus, once x and y have been assigned values, there is no excess precision, and they do not have different values in different uses. Then (x >= y) == (x > y || x == y) must be true because it is evaluated as it appears and is necessarily mathematically true.
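
For example, in a conforming implementation the assertion in the following minimal sketch can never fire, whatever values the caller passes, including NaNs (for which both sides of the outer == are false):

#include <cassert>

// x and y are plain double identifiers, so neither comparison may
// carry excess precision, and the identity must hold.
void check(double x, double y)
{
    assert((x >= y) == (x > y || x == y));
}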

The existence of GCC bug 323 means you cannot rely on this from GCC when compiling for i386, but that is a bug in GCC that violates the C++ standard; standard C++ does not permit the behavior.

If comparisons are made between expressions, as in:

double y = b + sqrt(c);
if (y != b + sqrt(c))
    std::cout << "Unequal\n";

then the value assigned to y may differ from the value computed for the right operand of !=, and the string may be printed, because b + sqrt(c) may have excess precision, whereas y must not.

Since casts are also required to remove excess precision, y != (double) (b + sqrt(c)) should always be false (given the definition of y above).
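
A minimal sketch putting both comparisons together, assuming b and c are ordinary double variables (the function and its arguments are illustrative):

#include <cmath>
#include <iostream>

void demo(double b, double c)
{
    double y = b + std::sqrt(c);
    if (y != b + std::sqrt(c))            // may print: the right operand may carry excess precision
        std::cout << "Unequal\n";
    if (y != (double) (b + std::sqrt(c))) // must not print: the cast discards the excess precision
        std::cout << "Unequal\n";
}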


Regardless of the C++ standard, such inconsistencies occur in practice in various settings.

There are two cases that are easy to trigger:

For 32-bit x86, things are not so nice. Welcome to GCC bug 323, due to which 32-bit applications do not adhere to the standard. What happens is that the floating-point registers of x86 hold 80 bits, regardless of the type in the program (C, C++, or Fortran). This means that the following usually compares 80-bit values, not 64-bit values:

bool foo(double x, double y)
{
    // comparing 80 bits, despite sizeof(double) == 8, i.e., 64 bits
    return x == y;
}

This would not be a big issue if gcc could guarantee that a double always occupies 80 bits. Unfortunately, the number of floating-point registers is finite, and sometimes a value is stored in (spilled to) memory, where it is truncated to 64 bits. So, for the same x and y, x == y might evaluate as true after spilling to memory, and false without spilling to memory. There is no guarantee regarding (lack of) spilling; the behavior changes seemingly at random, based on compilation flags and on seemingly irrelevant code changes.
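
As an illustrative sketch, a fragment like the following can print "unequal" when compiled for 32-bit x86 with the x87 FPU; whether it actually does depends on the optimization level and on register allocation:

#include <cstdio>

void demo(double x, double y)
{
    double z = x / y;   // the quotient may be spilled to a 64-bit memory slot
    if (z != x / y)     // the recomputed quotient may still carry 80 bits
        std::printf("unequal\n");
}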

So, even if x and y should be logically equal, and x got spilled, x == y may evaluate as false: y may contain a 1 in the least significant bits of its mantissa, while x had those bits truncated by the spill. The answer to your second question is therefore that x == y may return different results in different places, based on spilling or the lack of it, in a 32-bit x86 program.

Similarly, x >= y may return true even when y should be slightly bigger than x. This can happen if, after spilling to a 64-bit variable in memory, the two values become equal. In that case, if earlier in the code x > y || x == y was evaluated without spilling to memory, it will evaluate as false. To make things more confusing, replacing one expression with the other may cause the compiler to generate slightly different code, with different spilling to memory. The difference in spilling between the two expressions may end up giving inconsistent results.

The same problem may happen on any system where floating-point operations are executed at a different width (e.g., 80 bits on 32-bit x86) than the width the code asks for (64 bits). The only way to get around the inconsistency is to force spilling after each and every floating-point operation, truncating the excess precision. Most programmers don't do that, due to the performance degradation.
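
One way to force that truncation in source is to launder every result through a 64-bit memory slot; GCC also offers flags such as -ffloat-store (and -mfpmath=sse on x86) that attack the same problem. The helper name below is illustrative:

// Discard any excess x87 precision by forcing a round trip
// through a 64-bit memory slot. Costs a store and a load.
inline double force_double(double v)
{
    volatile double tmp = v;
    return tmp;
}

// e.g., compare force_double(x) == force_double(y)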

The second case that can trigger inconsistencies is unsafe compiler optimizations. Many commercial compilers throw FP consistency out the window by default, in order to gain a few percent of execution time. The compiler may decide to change the order of FP operations even though they are likely to produce different results. For example:

double v1 = (x + y) + z;
double v2 = x + (y + z);
bool b = (v1 == v2);

It is clear that, most likely, v1 != v2, due to different rounding. For example, if x == -y, y > 1e100, and z == 1, then v1 == 1 but v2 == 0. If the compiler is too aggressive, it might simply apply algebra and deduce that b should be true, without evaluating anything at all. This is what happens when compiling with gcc -ffast-math.

Here is a sketch of an example that shows it; the exact output depends on the compiler version and flags:
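
#include <iostream>

int main()
{
    // volatile keeps the compiler from folding the arithmetic at compile time
    volatile double x = -1e100, y = 1e100, z = 1;
    double v1 = (x + y) + z;  // (-1e100 + 1e100) + 1 == 1
    double v2 = x + (y + z);  // 1e100 + 1 rounds back to 1e100, so the sum is 0
    // With strict IEEE semantics this prints "1 0 unequal". Under
    // -ffast-math the compiler may assume associativity and treat
    // the two expressions as equal.
    std::cout << v1 << ' ' << v2 << ' '
              << (v1 == v2 ? "equal" : "unequal") << '\n';
}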

Such behavior can make x == y inconsistent; the result depends heavily on what the compiler manages to deduce in each specific piece of code.