How to properly compare an integer and a floating-point value?

(Restricting this answer to positive numbers; generalisation is trivial.)

  1. Get the number of bits in your exponent for the float on your platform along with the radix. If you have an IEEE754 32-bit float then this is a trivial step.

  2. Use (1) to compute the largest non-integer value that can be stored in your float. Annoyingly, std::numeric_limits doesn't provide this value, so you need to compute it yourself. For 32-bit IEEE754 you can take the easy option: 8388607.5 (that is, 2^23 - 0.5) is the largest non-integral float, because from 2^23 upwards every float is an integer.

  3. If your float is less than or equal to (2), check whether it's an integer. If it's not, round it in the direction that preserves the result of < (for example, i < f is equivalent to i < ceil(f) when f is not an integer).

  4. At this point, the float is an integer. Check if it's within the range of your long long. If it's out of range then the result of < is known.

  5. If you get this far, then you can safely cast your float to a long long, and make the comparison.
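
The five steps above can be sketched for a float and a long long as follows. This is only a minimal sketch under stated assumptions: an IEEE754 binary32 float, a 64-bit long long, and a finite, non-negative f (matching the restriction to positive numbers). The name int_less_than_float is made up for illustration.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: returns true when i < f.
// Precondition (assumed, not checked): f is finite and f >= 0.
bool int_less_than_float(long long i, float f)
{
    static_assert(std::numeric_limits<float>::is_iec559 &&
                  std::numeric_limits<float>::digits == 24,
                  "sketch assumes an IEEE754 binary32 float");
    static_assert(std::numeric_limits<long long>::digits == 63,
                  "sketch assumes a 64-bit long long");

    // Step 2: the largest float that is not an integer (2^23 - 0.5).
    const float largest_non_integer = 8388607.5f;

    // Step 3: a non-integer f is rounded up, because for such f
    // `i < f` holds exactly when `i < ceil(f)`.
    if (f <= largest_non_integer && f != std::floor(f))
        f = std::ceil(f);

    // Step 4: f is now an integer. 2^63 is the first value above LLONG_MAX
    // and is exactly representable as a float.
    const float two_pow_63 = 9223372036854775808.0f;
    if (f >= two_pow_63)
        return true;

    // Step 5: f fits in a long long, so the conversion is exact.
    return i < static_cast<long long>(f);
}
```

For example, int_less_than_float(LLONG_MAX, 9223372036854775808.0f) is true, whereas the built-in comparison first converts LLONG_MAX to float, which typically rounds it up to 2^63 and makes the two operands compare equal.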


Here's what I ended up with.

Credit for the algorithm goes to @chux; his approach appears to outperform the other suggestions. You can find some alternative implementations in the edit history.

If you can think of any improvements, suggestions are welcome.

#include <cmath>
#include <limits>
#include <type_traits>

enum partial_ordering {less, equal, greater, unordered};

template <typename I, typename F>
partial_ordering compare_int_float(I i, F f)
{
    if constexpr (std::is_integral_v<F> && std::is_floating_point_v<I>)
    {
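        // The arguments are swapped in this call, so `less` and `greater` must be swapped back.
        partial_ordering result = compare_int_float(f, i);
        return result == less ? greater : result == greater ? less : result;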
    }
    else
    {
        static_assert(std::is_integral_v<I> && std::is_floating_point_v<F>);
        static_assert(std::numeric_limits<F>::radix == 2);

        // This should be exactly representable as F due to being a power of two.
        constexpr F I_min_as_F = std::numeric_limits<I>::min();

        // The `numeric_limits<I>::max()` itself might not be representable as F, so we use this instead.
        constexpr F I_max_as_F_plus_1 = F(std::numeric_limits<I>::max()/2+1) * 2;

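        // Check if the constants above overflowed to infinity. Normally this shouldn't happen.
        // The `!= 0` guard keeps unsigned types (whose minimum is 0) from being flagged by mistake.
        constexpr bool limits_overflow = (I_min_as_F != 0 && I_min_as_F * 2 == I_min_as_F) || I_max_as_F_plus_1 * 2 == I_max_as_F_plus_1;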
        if constexpr (limits_overflow)
        {
            // Manually check for special floating-point values.
            if (std::isinf(f))
                return f > 0 ? less : greater;
            if (std::isnan(f))
                return unordered;
        }

        if (limits_overflow || f >= I_min_as_F)
        {
            // `f <= I_max_as_F_plus_1 - 1` would be problematic due to rounding, so we use this instead.
            if (limits_overflow || f - I_max_as_F_plus_1 <= -1)
            {
                I f_trunc = f;
                if (f_trunc < i)
                    return greater;
                if (f_trunc > i)
                    return less;

                F f_frac = f - f_trunc;
                if (f_frac < 0)
                    return greater;
                if (f_frac > 0)
                    return less;

                return equal;
            }

            return less;
        }

        if (f < 0)
            return greater;

        return unordered;
    }
}

If you want to experiment with it, here are a few test cases:

#include <cmath>
#include <iomanip>
#include <iostream> 

void compare_print(long long a, float b, int n = 0)
{
    if (n == 0)
    {
        auto result = compare_int_float(a,b);
        std::cout << a << ' ' << "<=>?"[int(result)] << ' ' << b << '\n';
    }
    else
    {
        for (int i = 0; i < n; i++)
            b = std::nextafter(b, -INFINITY);

        for (int i = 0; i <= n*2; i++)
        {
            compare_print(a, b);
            b = std::nextafter(b, INFINITY);
        }

        std::cout << '\n';
    }
}

int main()
{    
    std::cout << std::setprecision(1000);

    compare_print(999999984306749440,
                  999999984306749440.f, 2);

    compare_print(999999984306749439,
                  999999984306749440.f, 2);

    compare_print(100,
                  100.f, 2);

    compare_print(-100,
                  -100.f, 2);

    compare_print(0,
                  0.f, 2);

    compare_print((long long)0x8000'0000'0000'0000,
                  (long long)0x8000'0000'0000'0000, 2);

    compare_print(42, INFINITY);
    compare_print(42, -INFINITY);
    compare_print(42, NAN);
    std::cout << '\n';

    compare_print(1388608,
                  1388608.f, 2);

    compare_print(12388608,
                  12388608.f, 2);
}



To compare a FP f and integer i for equality:

(Code is representative and uses comparison of float and long long as an example)

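  1. If f is a NaN, infinity, or has a fractional part (perhaps use modf()), f is not equal to i.

    float ipart;
    // modf() returns the fractional part: non-zero for non-integers, NaN for NaN.
    // Infinities have a zero fraction; the range checks in step 3 reject them.
    // C++
    if (std::modf(f, &ipart) != 0) return not_equal;
    // C
    if (modff(f, &ipart) != 0) return not_equal;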
    
  2. Convert the numeric limits of i into exactly representable FP values (powers of 2) near those limits.** This is easy if we assume that FP is not a rare base-10 encoding and that the range of double exceeds the range of i. Take advantage of the fact that the magnitudes of the integer limits are at or near a Mersenne number. (Sorry, the example code is C-ish.)

    #define FP_INT_MAX_PLUS1 ((LLONG_MAX/2 + 1)*2.0)
    #define FP_INT_MIN (LLONG_MIN*1.0)
    
  3. Compare f to its limits

    if (f >= FP_INT_MAX_PLUS1) return not_equal;
    if (f < FP_INT_MIN) return not_equal;
    
  4. Convert f to integer and compare

    return (long long) f == i;
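
Put together, the equality test might look like the sketch below. It assumes a radix-2 floating-point type and a two's complement long long; the function name float_equals_int is hypothetical, and modf() is used for the fractional-part check.

```cpp
#include <climits>
#include <cmath>

// Hypothetical helper assembling the four steps for float vs. long long.
bool float_equals_int(float f, long long i)
{
    // Step 1: a non-zero fractional part (or NaN, whose fraction is NaN
    // and therefore compares != 0) means f cannot equal any integer.
    float ipart;
    if (std::modf(f, &ipart) != 0) return false;

    // Step 2: limits as exactly representable powers of two.
    constexpr double max_plus1 = (LLONG_MAX / 2 + 1) * 2.0; // 2^63
    constexpr double min_limit = LLONG_MIN * 1.0;           // -2^63

    // Step 3: reject values outside the range of long long.
    // This also rejects +/-infinity, which pass step 1.
    if (f >= max_plus1) return false;
    if (f < min_limit) return false;

    // Step 4: f is an in-range integer, so the conversion is exact.
    return static_cast<long long>(f) == i;
}
```

With this, float_equals_int(16777216.0f, 16777217) is false, while the built-in 16777216.0f == 16777217 converts the integer to float first and typically reports the two as equal.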
    

To compare a FP f and integer i for <, >, == or not comparable:

(Using above limits)

  1. Test f >= lower limit

    if (f >= FP_INT_MIN) {
    
  2. Test f <= upper limit

      // reform below to cope with effects of rounding
      // if (f <= FP_INT_MAX_PLUS1 - 1)
      if (f - FP_INT_MAX_PLUS1 <= -1.0) {
    
  3. Convert f to integer/fraction and compare

        // at this point `f` is in the range of `i`
        long long ipart = (long long) f;
        if (ipart < i) return f_less_than_i;
        if (ipart > i) return f_more_than_i;
    
        float frac = f - ipart;
        if (frac < 0) return f_less_than_i;
        if (frac > 0) return f_more_than_i;
        return equal;
      }
    
  4. Handle edge cases

      else return f_more_than_i;
    }
    if (f < 0.0) return f_less_than_i;
    return not_comparable;
    

Simplifications are possible, but I wanted to convey the algorithm.


** Additional conditional code is needed to cope with non-2's-complement integer encodings. It is quite similar to the MAX code.