Fast integer square-root

Sqrt uses exact methods in an effort to pull out "small" squares, and that takes time. A direct approach, as already noted, is to compute the square root numerically. For the purpose of extracting the Floor, it suffices to use a precision of half the number of digits.

floorSqrt[n_Integer] := 
 Module[{prec = RealExponent[1.001*n]/2}, 
  Floor[Sqrt[SetPrecision[n, prec]]]]

Test:

n = 10^1000000 - 3^2095903;
AbsoluteTiming[sn1 = Floor[Sqrt[n]];]
AbsoluteTiming[sn2 = floorSqrt[n];]
sn1 === sn2

(* Out[410]= {2.774283, Null}

Out[411]= {0.016654, Null}

Out[412]= True *)
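
Since the precision is cut as fine as possible, one might want to spot-check the result against the defining inequality s^2 <= n < (s + 1)^2. The following is only a sketch: isFloorSqrtQ, the sample range and the seed are names and values picked here for illustration, and floorSqrt is repeated so the snippet runs on its own.

```mathematica
(* floorSqrt as defined above, repeated so the snippet is self-contained *)
floorSqrt[n_Integer] := 
 Module[{prec = RealExponent[1.001*n]/2}, 
  Floor[Sqrt[SetPrecision[n, prec]]]]

(* hypothetical helper: True exactly when s is the floor of the square root of n *)
isFloorSqrtQ[n_Integer, s_Integer] := s^2 <= n < (s + 1)^2

SeedRandom[1];  (* arbitrary seed, only for reproducibility *)
samples = RandomInteger[{10^50, 10^60}, 1000];

(* collect every sample for which the numeric shortcut misses the floor *)
failures = Select[samples, ! isFloorSqrtQ[#, floorSqrt[#]] &];
Length[failures]
```

Any input that ends up in failures would be a case where half the digit count was not quite enough precision.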

For variety, here is a top-level implementation of an integer-based method. My guess is that it is similar to Zimmermann's, though he may well have had some extra efficiencies. The idea is to split the number into an upper and a lower part, using a power of 4 for the split size so that we can shift back by a power of 2. That is, write a = 4^n*b + c. Recursively compute the integer sqrt of b, multiply by 2^n, and use the usual Taylor approximation to get a correction that estimates sqrt(a). The last step is to iteratively repair that estimate. We use integer multiplication, squaring, and the integer Quotient function. The recursion has to bottom out, so we provide a base size below which we use the numeric approximation method noted already.
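
Spelled out, the correction is presumably the first-order expansion below. Here s stands for the recursively computed integer root of b and x for the running estimate; both symbols are introduced only for this illustration.

```latex
\sqrt{a} \;=\; \sqrt{4^n b + c}
        \;=\; 2^n \sqrt{b}\,\sqrt{1 + \frac{c}{4^n b}}
        \;\approx\; 2^n \sqrt{b} + \frac{c}{2 \cdot 2^n \sqrt{b}}
% integer version, with s = \lfloor \sqrt{b} \rfloor:
x_0 = 2^n s, \qquad
x_1 = x_0 + \left\lfloor \frac{c}{2 x_0} \right\rfloor
% repair step, repeated until 0 \le a - x^2 \le 2x,
% which is the same as x^2 \le a < (x+1)^2:
x \;\leftarrow\; x + \left\lfloor \frac{a - x^2}{2x} \right\rfloor
```

The two floor quotients correspond to the Quotient[aLower, 2*sqrt] and Quotient[diff, 2*sqrt] calls in the code.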

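iSqrt[a_, baselen_] := Catch[Module[
   {quarterscale = Ceiling[RealExponent[a, 2.]/4], aUpper, aLower, 
    sqrt, diff},
   (* base case: small inputs are handled numerically *)
   If[quarterscale < baselen, Throw[Floor[Sqrt[N[a, 2*baselen + 4]]]]];
   (* split a = 4^quarterscale*aUpper + aLower *)
   aUpper = BitShiftRight[a, 2*quarterscale];
   aLower = a - BitShiftLeft[aUpper, 2*quarterscale];
   (* recurse on the upper part and scale back by 2^quarterscale *)
   sqrt = BitShiftLeft[iSqrt[aUpper, baselen], quarterscale];
   (* first-order (Taylor) correction from the lower part *)
   sqrt += Quotient[aLower, (2*sqrt)];
   diff = a - sqrt^2;
   (* repair until sqrt^2 <= a < (sqrt + 1)^2 *)
   While[Abs[diff] >= 2 sqrt + 1 || diff < 0,
    sqrt += Quotient[diff, (2*sqrt)];
    diff = a - sqrt^2;
    ];
   sqrt
   ]]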

iSqrt is not as fast as the numerical approximation method, but it is not bad either. The timings below are from my slow laptop.

n = 10^1000000 - 3^2095903;
AbsoluteTiming[sn1 = Floor[Sqrt[n]];]
AbsoluteTiming[sn2 = floorSqrt[n];]
AbsoluteTiming[sn3 = iSqrt[n, 10];]
sn1 == sn2 == sn3

(* Out[642]= {7.20206, Null}

Out[643]= {0.0479314, Null}

Out[644]= {0.128822, Null}

Out[645]= True *)

So it is around 2.7 times slower than the numeric code. An internal implementation might avoid some of the overhead, and might possibly make up a good chunk of that factor.
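
To see the repair loop from iSqrt in isolation, here is a minimal sketch that starts from a deliberately crude power-of-2 overestimate instead of the recursive estimate. The sample input and the steps counter are illustrative choices, not part of the code above.

```mathematica
a = 3^150;  (* arbitrary sample input *)

(* crude starting point: a power of 2 that is at least Sqrt[a] *)
s = BitShiftLeft[1, Ceiling[BitLength[a]/2]];
diff = a - s^2;
steps = 0;

(* same loop condition and update as in iSqrt *)
While[Abs[diff] >= 2 s + 1 || diff < 0,
  s += Quotient[diff, 2 s];
  diff = a - s^2;
  steps++];

(* first entry checks the floor condition; second counts the iterations *)
{s^2 <= a < (s + 1)^2, steps}
```

The better the initial estimate, the fewer passes the loop needs, which is presumably why iSqrt invests in the recursive estimate first.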


This does not directly address the question about the GMP library, but you can get the square root much faster by starting with an extended-precision float.

n = 10^1000000 - 3^2095903;
(g = Floor[Sqrt[n]]) // AbsoluteTiming // First

6.22126

(h = Floor[Sqrt[N[n, Log[10, n]]]]) // AbsoluteTiming // First

0.086576

g == h

True

If you use this, you may want a validity check; (h + 1)^2 > n >= h^2 takes very little time.
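
One way to fold that check into the computation is sketched below. The name checkedFloorSqrt is made up for this illustration, and it uses IntegerLength[n] + 5 as the working precision (a variant of the Log[10, n] used above, chosen so that very small inputs also get a sensible precision).

```mathematica
(* hypothetical wrapper: numeric estimate, then an exact integer check *)
checkedFloorSqrt[n_Integer?NonNegative] := Module[{h},
  (* assumption: a few guard digits beyond the digit count of n *)
  h = Floor[Sqrt[N[n, IntegerLength[n] + 5]]];
  (* repair in the unlikely case that the estimate is off *)
  While[h^2 > n, h--];
  While[(h + 1)^2 <= n, h++];
  h]

checkedFloorSqrt /@ {0, 15, 16, 17, 10^20 - 1}
```

Both While loops only use exact integer arithmetic, so the returned h satisfies (h + 1)^2 > n >= h^2 regardless of how accurate the floating-point estimate was.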