What is faster than std::pow?

It looks like Martin Ankerl has a few of articles on this, Optimized Approximative pow() in C / C++ is one and it has two fast versions, one is as follows:

inline double fastPow(double a, double b) {
  union {
    double d;
    int x[2];
  } u = { a };
  u.x[1] = (int)(b * (u.x[1] - 1072632447) + 1072632447);
  u.x[0] = 0;
  return u.d;
}

which relies on type punning through a union which is undefined behavior in C++, from the draft standard section 9.5 [class.union]:

In a union, at most one of the non-static data members can be active at any time, that is, the value of at most one of the non-static data members can be stored in a union at any time. [...]

but most compilers including gcc support this with well defined behavior:

The practice of reading from a different union member than the one most recently written to (called “type-punning”) is common. Even with -fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type

but this is not universal as this article points out and as I point out in my answer here using memcpy should generate identical code and does not invoke undefined behavior.

He also links to a second one Optimized pow() approximation for Java, C / C++, and C#.

The first article also links to his microbenchmarks here

Depending on what you need to do, operating in the log domain might work — that is, you replace all of your values with their logarithms; multiplication becomes addition, division becomes subtraction, and exponentiation becomes multiplication. But now addition and subtraction become expensive and somewhat error-prone operations.

How big are your integers? Are they known at compile time? It's far better to compute x^2 as x*x as opposed to pow(x,2). Note: Almost all applications of pow() to an integer power involve raising some number to the second or third power (or the multiplicative inverse in the case of negative exponents). Using pow() is overkill in such cases. Use a template for these small integer powers, or just use x*x.

If the integers are small, but not known at compile time, say between -12 and +12, multiplication will still beat pow() and won't lose accuracy. You don't need eleven multiplications to compute x^12. Four will do. Use the fact that x^(2n) = (x^n)^2 and x^(2n+1) = x*((x^n)^2). For example, x^12 is ((x*x*x)^2)^2. Two multiplications to compute x^3 (x*x*x), one more to compute x^6, and one final one to compute x^12.

What is faster than std::pow?

Tags:

Performance

C++

Related

Recent Posts