How to optimize range checking for integer intervals symmetric around zero in C?

How about the following:

counter += (i < -threshold) | (i > threshold);

Assuming the original code was valid, this should work too, in a portable way. The standard says that relational operators (<, > and so on) yield an int equal to 1 if the relation holds, or 0 if it does not.

Update

To answer Sheen's comment below, the following code:

int main()
{
    short threshold = 10;
    short i = 20;
    short counter = 0;
    
    counter += (i < -threshold) | (i > threshold);
    
    return 0;
}

results in the following disassembly on x86 using GCC, with no optimisations:

  push   %rbp
  mov    %rsp,%rbp
  movw   $0xa,-6(%rbp)
  movw   $0x14,-4(%rbp)
  movw   $0x0,-2(%rbp)
  movswl -4(%rbp),%edx
  movswl -6(%rbp),%eax
  neg    %eax
  cmp    %eax,%edx
  setl   %dl
  movzwl -4(%rbp),%eax
  cmp    -6(%rbp),%ax
  setg   %al
  or     %edx,%eax
  movzbw %al,%dx
  movzwl -2(%rbp),%eax
  lea    (%rdx,%rax,1),%eax
  mov    %ax,-2(%rbp)
  mov    $0x0,%eax
  leaveq 
  retq  

There is a standard idiom for range-checking with a single comparison instruction. It goes like:

(unsigned)x - a <= (unsigned)b - a   /* a <= x <= b */
(unsigned)x - a < (unsigned)b - a    /* a <= x < b */

As a common example (this version of isdigit is guaranteed correct by the standard, since the characters '0' through '9' are required to be contiguous):

(unsigned)ch - '0' < 10

If your original type is larger than int (for instance long long) then you will need to use larger unsigned types (for instance unsigned long long). If a and b are constants or already have unsigned type, or if you know b-a will not overflow, you can omit the cast from b.

In order for this method to work, naturally you must have a<=b and the types/values must be such that the original expression (i.e. a <= x && x <= b or similar) behaves mathematically correctly. For instance if x were signed and b unsigned, x<=b could evaluate to false when x=-1 and b=UINT_MAX-1. As long as your original types are all signed or smaller than the unsigned type you cast to, this is not an issue.

As for how this "trick" works, it is purely determining, after reduction modulo UINT_MAX+1, whether x-a lies in the range 0 to b-a.

In your case, I think the following should work just fine:

(unsigned)i + threshold > 2U * threshold;

If threshold does not change between loop iterations, the compiler can probably keep both threshold and 2U*threshold in registers.

Speaking of optimizations, a good compiler should optimize your original range test to use unsigned arithmetic where it knows the constraints are met. I suspect many do so with a and b constant, but perhaps not with more complex expressions. Even if the compiler can optimize it, though, the (unsigned)x-a<b-a idiom is still extremely useful in macros where you want to ensure that x is evaluated exactly once.


Oh, too bad the question has already been answered. To paraphrase Oli's answer, the code

#include <stdint.h>
int main()
{
    int32_t threshold_square = 100;
    int16_t i = 20;
    int16_t counter = 0;

    counter += ( (int32_t) i * i > threshold_square);

    return 0;
}

yields the following x86 assembly using GCC without optimizations

pushq   %rbp
movq    %rsp, %rbp
movl    $100, -8(%rbp)
movw    $20, -2(%rbp)
movw    $0, -4(%rbp)
movswl  -2(%rbp),%edx
movswl  -2(%rbp),%eax
imull   %edx, %eax
cmpl    -8(%rbp), %eax
setg    %al
movzbl  %al, %edx
movzwl  -4(%rbp), %eax
leal    (%rdx,%rax), %eax
movw    %ax, -4(%rbp)
movl    $0, %eax
leave
ret

which is four instructions fewer than using (i < -threshold) | (i > threshold).

Whether this is actually better depends, of course, on the architecture.

(The use of stdint.h is for illustrative purposes, for strict C89 replace with whatever is relevant for the target system.)