Detecting perfect squares faster than by extracting square root

See the paper by Bernstein, Lenstra, and Pila: Detecting Perfect Powers by Factoring into Coprimes, Mathematics of Computation, Volume 76, #257, January 2007, pp. 385-388.

From the abstract: This paper presents an algorithm that, given an integer n>1, finds the largest k such that n is a kth power.

The algorithm runs in time $\log(n)(\log\log(n))^{O(1)}$.


Here is a probabilistic test for squaredness that can achieve an arbitrarily small error with time complexity very close to $\mathcal{O}$(log $n$) base $r$. It's sort of a generalization of the approach used in the GMP library.

Premise 1: If an integer $n$ is a perfect square and $P$ is an odd prime with $P < n$, then $n$ (mod $P$) is a quadratic residue modulo $P$.

Contrapositive: If $n$ (mod $P$) is a quadratic non-residue modulo $P$, then $n$ is not a perfect square.

Premise 2: Given an odd prime modulus $P$, there are $\frac{(P+1)}{2}$ quadratic residues (including 0) and $\frac{(P-1)}{2}$ quadratic non-residues. So there's about a 50% probability that a random integer $n$ is a quadratic residue (mod $P$).
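Premise 2 is easy to check by brute force for small primes. The following is just an illustrative sketch; the helper name `quadratic_residues` is my own and does not come from GMP or the linked implementation.

```python
def quadratic_residues(p):
    """Return the set of squares modulo p, with 0 included."""
    return {(x * x) % p for x in range(p)}


for p in (3, 5, 7, 11, 13):
    residues = quadratic_residues(p)
    non_residues = set(range(p)) - residues
    # Premise 2 predicts (p + 1) / 2 residues and (p - 1) / 2 non-residues.
    print(p, sorted(residues), len(residues), len(non_residues))
```

For each odd prime the two counts differ by exactly one, which is where the "about 50%" figure comes from.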

We can infer the following. Given an integer $n$:

  • If $n$ is a perfect square, then it will pass Euler's criterion for 100% of odd primes coprime to $n$ (see footnote [1]).
  • If $n$ is not a perfect square, then it will fail Euler's criterion for roughly 50% of odd primes.

So we can construct a robust probabilistic test in the same style as the Fermat primality test. We pick a prime $P_{0}$ and check Euler's criterion for $n$ (mod $P_{0}$).

  • If $n$ is a quadratic non-residue (mod $P_{0}$), then we have our answer definitively, i.e. $n$ is not a perfect square.
  • On the other hand, if $n$ is a quadratic residue (mod $P_{0}$), then $n$ may or may not be a perfect square. But we pick more primes $P_{i}$ and perform the test repeatedly.

Each time $n$ passes as a quadratic residue, the probability that a non-square $n$ would have survived every test so far is roughly halved, so it takes only 10 tests to conclude that $n$ is a perfect square with >99.9% confidence[2] (since $2^{-10} < 0.001$), regardless of its size. In other words, if $n$ is not a perfect square, then at least one of the tests will have failed before that point (with >99.9% probability).

The key here is that non-squares always fail the criterion at a ~50% rate (tested against various primes). So if after a dozen-or-so such tests $n$ hasn't failed the criterion even once, there is a very high probability that it is a perfect square. I realize I'm seriously lacking the proper Bayesian terminology here, but this works and you can try it. The error probability can be made arbitrarily small by testing against more primes.
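Here is a minimal Python sketch of the test as described, not a transcription of the C code in the gist. It assumes Python's built-in three-argument `pow` for modular exponentiation; the names `euler_criterion`, `is_probable_square`, and `ODD_PRIMES` are hypothetical, and the prime list is simply the first 20 odd primes.

```python
# First 20 odd primes (an arbitrary illustrative choice, ending at 73).
ODD_PRIMES = (3, 5, 7, 11, 13, 17, 19, 23, 29, 31,
              37, 41, 43, 47, 53, 59, 61, 67, 71, 73)


def euler_criterion(n, p):
    """Return n^((p-1)/2) mod p for an odd prime p.

    The result is 1 for a nonzero residue, p - 1 for a non-residue,
    and 0 when p divides n.
    """
    return pow(n, (p - 1) // 2, p)


def is_probable_square(n, primes=ODD_PRIMES):
    """False means n is definitely not a square; True means 'probably'."""
    if n < 0:
        return False
    for p in primes:
        if euler_criterion(n, p) == p - 1:
            return False  # non-residue: n cannot be a perfect square
        # A result of 0 (p divides n) is inconclusive and is skipped.
    return True


print(is_probable_square(111111111 ** 2))  # a genuine square
print(is_probable_square(55125))           # rejected at p = 13
```

A `True` result can still be a false positive, so when certainty is needed it should be followed by an exact check.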

Time complexity: Since the choice of primes $P_{i}$ need not depend on $n$, determining the exponents for the tests takes $\mathcal{O}(1)$ I think. And since the desired confidence level is also independent of $n$, the overall time complexity seems equivalent to that of modular exponentiation. Using the method of exponentiation by squaring with an efficient multiplication algorithm such as k-way Toom-Cook or Schönhage-Strassen gives an overall time complexity very close to $\mathcal{O}$(log $n$) base $r$, depending on parameters chosen. See the linked Wikipedia articles for details.

[1]: Euler's criterion requires that $n$ and $P$ are coprime; if they are not, then $n^{\frac{P-1}{2}}\equiv 0$ (mod $P$), and the test result is discarded.

[2]: A 0.1% error rate on 1 billion integers (of any magnitude) represents around 1 million false positives, which is really bad. Theoretically, 30 primes should be used in order to yield fewer than 1 false positive for every billion integers tested (since $2^{-30} < 10^{-9}$), but in practice I've found that just the first 18-20 primes are sufficient to yield none.
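The arithmetic in footnote [2] can be reproduced with a couple of lines. This sketch assumes the idealized model in which each prime independently passes a non-square with probability exactly 1/2; the function name `primes_needed` is made up for illustration.

```python
import math


def primes_needed(max_error):
    """Smallest k with 2**-k <= max_error, under the idealized 50% model."""
    return math.ceil(math.log2(1 / max_error))


print(primes_needed(1e-3))  # target: at most 0.1% false positives
print(primes_needed(1e-9))  # target: under 1 false positive per billion
```

Real false-positive rates depend on which primes are used, so treat these counts as a rule of thumb rather than a guarantee.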


Update: Here's a working implementation in C with libgmp:

https://gist.github.com/jrodatus/e66d6f6b2f014f6b69543019edd23982

Example Run 1 - Residue statistics:


Test Mode

  0. Statistics for N > P being a quadratic residue mod P
  1. Probable perfect square algorithm

Enter test [0|1] 0
Number of primes: 30  
Number of tests (T): 1000000
Upper bound for N: 100000000000000000

Generating 30 primes...
Done: P_max=127

================================================================================
We will test T=1000000 random integers N, where:

    P_max = 127 < N <= 100000000000000000

Each row P shows the number of these N's that were quadratic residues (mod P),
written as a fraction of T.

If ~50% of randomly-chosen N's were residues (mod P),
that suggests a probability of 50% that a given N > P will be a residue.

Consistent with the Law of Large Numbers,
a large T (e.g. >1000) is needed to converge to this result.
================================================================================
  P   fraction of N's that were residues
----------------------------------------
  3   0.332975000 (332975/1000000)
  5   0.400004000 (400004/1000000)
  7   0.429508000 (429508/1000000)
 11   0.455161000 (455161/1000000)
 13   0.461268000 (461268/1000000)
 17   0.469941000 (469941/1000000)
 19   0.473873000 (473873/1000000)
 23   0.478612000 (478612/1000000)
 29   0.482535000 (482535/1000000)
 31   0.483060000 (483060/1000000)
 37   0.487377000 (487377/1000000)
 41   0.487871000 (487871/1000000)
 43   0.489587000 (489587/1000000)
 47   0.489378000 (489378/1000000)
 53   0.490486000 (490486/1000000)
 59   0.491197000 (491197/1000000)
 61   0.491372000 (491372/1000000)
 67   0.492079000 (492079/1000000)
 71   0.493632000 (493632/1000000)
 73   0.494255000 (494255/1000000)
 79   0.493439000 (493439/1000000)
 83   0.494264000 (494264/1000000)
 89   0.495205000 (495205/1000000)
 97   0.494529000 (494529/1000000)
101   0.494480000 (494480/1000000)
103   0.494896000 (494896/1000000)
107   0.494841000 (494841/1000000)
109   0.496449000 (496449/1000000)
113   0.494854000 (494854/1000000)
127   0.495656000 (495656/1000000)


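The ~33% statistic for P=3 is simply because 3 has only 1 nonzero quadratic residue (namely, 1), so only one of its three residue classes counts. More generally, the rows are consistent with a fraction of $\frac{P-1}{2P}$, i.e. with multiples of $P$ not being counted as residues, which approaches 50% as $P$ grows.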


Example Run 2 - Determining perfect squares:


Test Mode

  0. Statistics for N > P being a quadratic residue mod P
  1. Probable perfect square algorithm

Enter test [0|1] 1
Number of primes: 20
Number of tests (T): 10000000
Upper bound for N: 10000000000000000

Generating 20 primes...
Done: P_max=73

Testing 10000000 random values of N, 73 < N <= 10000000000000000...
# primes          : 20
# N's tested      : 10000000
# false positives : 23
success rate      : 0.999997700

Since this is on the math forum and not one of the many programming forums, I am going to give a pure mathematical answer. If you want my answer to a programming version of this question, in Python, see the Stack Overflow site and/or visit the comments below.

One thing you could do to save time and effort is to eliminate the number from consideration as a perfect square by verifying quickly that it isn't one. What I mean is, I'm not going to extract the root, nor am I going to verify that a number is a perfect square, but I am going to verify that a number is NOT a perfect square. Some of these hints are almost effortless, so you can run them as a precursor to any more complicated algorithm. After all, it makes no sense to waste time and effort on a complicated algorithm when you can prove a number is not a perfect square with a simpler one.

So, you need to know some of the cool and interesting properties of perfect squares, if you don't already.

Firstly, any perfect square ending in 0, or a string of zeros, must end in an even number of zeros. So if an integer ends in an odd number of zeros, it is not a perfect square. For example, 57,000 is not a perfect square. If there is an even number of zeros, you can ignore them entirely and reduce your test number to the digits that precede the string of zeros: we can test 640,000 and 820,000 for perfect squareness by testing just the 64 and the 82.

A similar thing can be done in binary. If you are programming, you can easily test for factors of $4=2^2$ and scale down using bit-wise operations. In Python I prefer the n&3 == 0 conditional test, and the n >> 2 operation. This is effectively the same thing as the previous test, except in digital/binary logic where the base is 2 instead of 10.
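As a sketch of the first rule in both bases (the function names are mine, and positive integers are assumed), this strips zeros in pairs and then checks whether a lone zero is left over:

```python
def strip_pairs_of_zeros(n):
    """Divide by 100 while possible, i.e. drop trailing decimal zeros in pairs."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    while n % 100 == 0:
        n //= 100
    return n


def strip_factors_of_four(n):
    """Divide by 4 while possible, using the n & 3 test and the n >> 2 shift."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    while n & 3 == 0:
        n >>= 2
    return n


def fails_trailing_zero_tests(n):
    """True if n has an odd number of trailing zeros in base 10 or in base 2."""
    leftover_decimal_zero = strip_pairs_of_zeros(n) % 10 == 0
    leftover_binary_zero = strip_factors_of_four(n) & 1 == 0
    return leftover_decimal_zero or leftover_binary_zero


print(strip_pairs_of_zeros(640000), fails_trailing_zero_tests(57000))
```

A `False` result proves nothing; the reduced number still has to go through the remaining tests.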

Secondly, all perfect squares end in the digits 0, 1, 4, 5, 6, or 9. You can ignore the 0 if you observe rule number 1 first. This is a necessary condition. Recognize this is a mod 10 operation. So, if the units digit of your test number is a 2, 3, 7, or 8, this is sufficient to say that the number is not a perfect square. For example, the number 934,523 is obviously not a perfect square. See that? With this rule we've already eliminated two-fifths of all possible numbers.

The last two digits of a test number cannot both be odd. 34,833,879 is not a perfect square because both 7 and 9 are odd.

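If the test number ends in a 1 or a 9, the number formed by the digits preceding it HAS to be a multiple of 4, which you can check from just the two digits immediately before the final one. Examples include 81 (8 is a multiple of 4), and the number 57,121 (because 5,712 is a multiple of 4, as its last two digits, 12, show). Numbers like 24,621 are not perfect squares because 62 is not a multiple of 4.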

If the test number ends in a 4, the digit preceding it has to be even. If not even then not a perfect square. 23,074 is not a perfect square.

If the test number ends in a 6, the digit preceding it has to be odd. If not odd then not a perfect square. 56,846 is not a perfect square.

If the test number ends in a 5, the digit preceding it has to be a 2. Furthermore, the digit preceding that 2 has to be a 0, a 2, or a 6, and if it is a 6 the number must end in 0625 or 5625. The number 331,625 is not a perfect square, because it ends in 1625.
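On a computer you don't need to code these digit rules one by one. A substitute technique, which is not how you would do it by hand, is to precompute every possible ending of a square and look the test number up. The sketch below tabulates the last four decimal digits, which is enough to cover all of the digit rules above; the names `SQUARE_ENDINGS` and `passes_digit_rules` are hypothetical.

```python
# x*x mod 10**4 depends only on x mod 10**4, so this table is complete.
SQUARE_ENDINGS = {(x * x) % 10_000 for x in range(10_000)}


def passes_digit_rules(n):
    """False if the last four digits of n cannot be the ending of a square."""
    return n % 10_000 in SQUARE_ENDINGS


for n in (934523, 34833879, 24621, 23074, 56846, 331625, 57121):
    print(n, passes_digit_rules(n))
```

Only 57,121 survives in this list; passing the lookup is necessary, not sufficient.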

Now for some modulo arithmetic. These can be done on paper or in your head and so don't require computers.

A perfect square must be equivalent to 0, 1, or 4 in mod 8. If not, you know you don't have a perfect square. But if you do have a perfect square, the 0, 1, or 4 provides you with useful information about the square root. If you get a 1 in mod 8 then your root is odd, if 0 in mod 8 the root is a multiple of 4, if 4 in mod 8 then the root is just even but not a multiple of 4.

Speaking of mod 8 tests, in binary logic we can employ bit-wise operations. Once all factors of 4 have been divided out (the binary variation of test 1), a nonzero perfect square is odd, so its lowest three bits are always 001, i.e. n&7 == 1. Without that reduction, n&3 == 0 could be true too.

There is also the mod 9 test. The result has to be 0, 1, 4, or 7 in mod 9. If not (2, 3, 5, 6, 8), then you don't have a perfect square. The number 56,430,143 is not a perfect square; I know because 56,430,143 % 9 = 8. Alternatively, look up "digital root" (the sum of the digits, repeated until a single digit remains), which is essentially the same as the mod 9 test. If your digital root is 2, 3, 5, 6, or 8 then the number is not a perfect square, but it could be one if you get 1, 4, 7, or 9.

In mod 13, all perfect squares are equivalent to 0, 1, 3, 4, 9, 10, or 12; and in mod 7 they must be equivalent to 0, 1, 2, or 4. FYI.
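If you do want the modular tests in code, the allowed remainders can be generated rather than memorized. This is a small sketch with made-up names (`MODULI`, `SQUARES_MOD`, `may_be_square`), using the moduli 8, 9, 7, and 13 discussed here:

```python
MODULI = (8, 9, 7, 13)

# For each modulus m, the set of remainders that a perfect square can leave.
SQUARES_MOD = {m: {(x * x) % m for x in range(m)} for m in MODULI}


def may_be_square(n):
    """False if n leaves an impossible remainder for any modulus in MODULI."""
    return all(n % m in SQUARES_MOD[m] for m in MODULI)


for m in MODULI:
    print(m, sorted(SQUARES_MOD[m]))
print(may_be_square(56430143), may_be_square(99225))
```

Each extra modulus filters out more non-squares, but a `True` result still only means "not ruled out yet".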

Also, observe that $(n+1)^2 = n^2 + 2n + 1$. Clearly, if your test number satisfies $N\in (n^2,n^2+2n+1)$, then it lies strictly between two consecutive perfect squares and cannot be one itself. So if $n$ and $n^2$ are known for the largest $n$ with $n^2<N$, it is pretty easy to reject any value in this interval.
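In code this check is a single chained comparison, assuming a candidate $n$ is already known from elsewhere (the function name is hypothetical):

```python
def strictly_between_consecutive_squares(N, n):
    """True if n^2 < N < (n+1)^2, in which case N cannot be a perfect square."""
    return n * n < N < (n + 1) * (n + 1)


print(strictly_between_consecutive_squares(50, 7))  # 49 < 50 < 64
print(strictly_between_consecutive_squares(49, 7))  # 49 is itself a square
```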

At this point, if your test number has not failed any of the tests, then and only then would I put the resources into root extraction or other complicated algorithms. These tests above use little more than comparison, conditionals, counting, and single-digit additions, some can even be done with bit-logic.

I hope this information helps you and others wanting guidance on this.

I'd also like to point out that any test number with a prime factor of odd multiplicity is not a perfect square. This is a more time-consuming approach, though. You need only check primes between 2 and sqrt(n). If you find a prime that divides into n, but does so only an odd number of times, you do not have a square number. Take the perfect square 99,225. Its prime factorization is [3,3,3,3,5,5,7,7], with an even number of each. Whereas the non-square 55,125 has prime factorization [3,3,5,5,5,7,7], with an odd number of 5's.

Also, it may be reasonable to scale down your number by factoring out any square factor you find. You can have a pre-computed list of primes and their squares waiting. Doing this could reduce the size of your test number, improve speed, etc. Every time you successfully reduce a number, you arrive at something smaller to test. Take the aforementioned 99,225 and 55,125. If you know that 3 goes into each, try to divide out 9. You get 11,025 and 6,125. Attempt to reduce by a factor of 9 again and you get 1,225 and 6,125, respectively, and 9 won't go into either any more. Next try 5. Attempt to reduce by 25 and you get 49 and 245, respectively. And so on. We need only test 49 and 245 for squareness. The former clearly is; the latter is not, because it can be divided by 5 once but not twice.
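The reduction just walked through for 99,225 and 55,125 might look like this in Python. It is only a sketch: the short prime list is an arbitrary choice, the name `strip_square_factors` is mine, and the verdict is `None` whenever the small primes do not settle the question.

```python
def strip_square_factors(n, primes=(2, 3, 5, 7, 11, 13)):
    """Divide out p*p for each small prime p.

    Returns (reduced_n, verdict): verdict is False if some p divides an odd
    number of times, True if n reduces all the way to 1, otherwise None.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    for p in primes:
        while n % (p * p) == 0:
            n //= p * p
        if n % p == 0:
            return n, False  # p is left over exactly once: not a square
    return n, (True if n == 1 else None)


print(strip_square_factors(99225))
print(strip_square_factors(55125))
```

When the verdict is `None`, the reduced number is smaller than the original and can be passed on to the other tests.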

On a side note, once you factor out a value, it may be worthwhile to revisit some of the prior rules and tricks, as some of the patterns may emerge and reveal information.

Another tidbit: every positive integer has a list of divisors, including 1 and itself. List each divisor only once and the following rule applies. Non-squares have an even number of divisors because they come in pairs, one on either side of the square root. But perfect squares have an odd number of divisors, since the square root pairs with itself and is counted once. Unfortunately this test is a bit useless, since it entails finding a list of divisors which includes the square root itself. Take the perfect square 9, for example. Its divisors are [1,3,9]. The 3 is the square root, and there are an odd number of terms in the list. The non-square 10, however, has divisors [1,2,5,10].
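For completeness, here is a brute-force illustration of the divisor-count parity. It is deliberately naive (trial division over every candidate, with a made-up helper name) and is meant to show the pattern, not to be used as a test.

```python
def divisors(n):
    """All positive divisors of n, each listed once, by naive trial division."""
    return [d for d in range(1, n + 1) if n % d == 0]


for n in (9, 10, 16, 18):
    ds = divisors(n)
    print(n, ds, "odd count" if len(ds) % 2 else "even count")
```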