Probability that a number passing the Fermat test is prime

Suppose a number $n$ is composite, and not Carmichael. That means there is some integer $x$ coprime to $n$ with the property that $x^{n-1}\not\equiv 1\pmod n$.

Now we can split all the congruence classes coprime to $n$ in the following way.

$$\{y_1,xy_1,x^2y_1,\ldots,x^{r-1}y_1\},\ \{y_2,xy_2,x^2y_2,\ldots,x^{r-1}y_2\},\ \ldots$$

Here each set is an equivalence class of numbers under the relation that you can get from one to the other (mod $n$) by multiplying by a power of $x$; $r$ is the multiplicative order of $x$. (Since $x$ is coprime to $n$, $x^r\equiv 1\pmod n$ for some minimum $r>0$.)

Now in each set, at most half the numbers are bases for which $n$ passes the Fermat test. This is because $(x^{k+1}y_i)^{n-1}=x^{n-1}(x^ky_i)^{n-1}\not\equiv(x^ky_i)^{n-1}\pmod n$: by definition of $x$ we have $x^{n-1}\not\equiv 1\pmod n$, and $(x^ky_i)^{n-1}$ is invertible mod $n$, so multiplying it by $x^{n-1}$ must change it. Hence of any two consecutive numbers in a set (with wraparound), at most one works.
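To make the partition concrete, here is a minimal Python sketch for the small case $n=15$, $x=2$ (my own choice of example; the function names `cosets_of_powers` and `passes_fermat` are just illustrative). It builds the sets $\{y, xy, x^2y, \ldots\}$ and lists which members pass the Fermat test.

```python
from math import gcd

def cosets_of_powers(n, x):
    """Partition the units mod n into sets {y, x*y, x^2*y, ...} (mod n)."""
    units = [b for b in range(1, n) if gcd(b, n) == 1]
    seen, cosets = set(), []
    for y in units:
        if y in seen:
            continue
        coset, z = [], y
        # Keep multiplying by x until the cycle returns to its start.
        while z not in seen:
            coset.append(z)
            seen.add(z)
            z = (z * x) % n
        cosets.append(coset)
    return cosets

def passes_fermat(n, b):
    """True if n passes the Fermat test to base b, i.e. b^(n-1) ≡ 1 (mod n)."""
    return pow(b, n - 1, n) == 1

n, x = 15, 2  # 2^14 ≡ 4 (mod 15), so x = 2 fails the test for n = 15
for coset in cosets_of_powers(n, x):
    passing = [b for b in coset if passes_fermat(n, b)]
    print(coset, "passing bases:", passing)
```

In this example each set has four elements, and the passing bases are never adjacent in the cyclic order, which is exactly what the argument predicts.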

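Therefore at most half of the possible bases $b$ coprime to $n$ have the property that $n$ passes the Fermat test base $b$ (and a base sharing a factor with $n$ never passes, since then $b^{n-1}$ cannot be $\equiv 1\pmod n$). If you choose a random base, you have at most a $50\%$ chance of passing; if you choose two independent random bases, at most $25\%$; and so on.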
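As a sanity check on the "at most half" bound, the following sketch counts passing bases by brute force. The helper name `fermat_liar_counts` is hypothetical, and the three values of $n$ are only examples: $15$ and $91$ are composite and not Carmichael, while $561$ is the smallest Carmichael number and is included to show where the bound fails.

```python
from math import gcd

def fermat_liar_counts(n):
    """Return (liars, units) for n.

    units: number of bases b in [1, n-1] coprime to n
    liars: number of those with b^(n-1) ≡ 1 (mod n)
    """
    units = [b for b in range(1, n) if gcd(b, n) == 1]
    liars = [b for b in units if pow(b, n - 1, n) == 1]
    return len(liars), len(units)

for n in (15, 91, 561):
    liars, units = fermat_liar_counts(n)
    print(f"n={n}: {liars} of {units} coprime bases pass")
```

For $15$ and $91$ exactly half of the coprime bases pass, so the bound cannot be improved in general; for $561$ every coprime base passes.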

Of course, this assumes you know $n$ is not a Carmichael number, which in general you don't. There are better tests (e.g. Solovay-Strassen) which have this property for any composite $n$.


Could the confusion be coming from the textbook's informal use of the word "chance"?

Define these events:

  • $A: n$ is prime

  • $A^c: n$ is composite (i.e. complement of $A$)

  • $T_k:$ passes $k$ tests

Then these are true statements:

  • $P(T_k \mid A) = 1$

  • $P(T_k \mid A^c) \le 2^{-k}$

But using the normal rules of probability, you cannot conclude $P(A \mid T_k) =$ anything at all, much less the textbook's informal claim that $P(A \mid T_k) \ge 1 - 2^{-k}$.

This is a classic case of confusing probability and likelihood. To the textbook authors' credit, they did not use the word "probability" (nor "likelihood") but instead used the word "chance".

Anyway, under the normal rules of probability, you can say nothing(?) about $P(A \mid T)$ (writing $T$ for $T_k$) unless you have a prior (unconditional) probability $P(A)$. If you had a prior, you could use Bayes' rule, together with $P(T \mid A) = 1$ and $P(T \mid A^c) \le 2^{-k}$:

$$P(A\mid T) = {P(A \cap T) \over P(T)} = {P(A \cap T) \over P(A \cap T) + P(A^c \cap T)} \ge {P(A) \over P(A) + P(A^c)2^{-k}}$$

We can see that, if the prior were $P(A) = P(A^c) = 1/2$ (akin to: "I have no idea if it will rain next Christmas so let's say it's 50-50"), then:

$$P(A \mid T) \approx {\frac12 \over \frac12 + \frac12 \times 2^{-k}} = {1 \over 1 + 2^{-k}} \approx 1 - 2^{-k}$$

However, for any other prior, the above would not be true. In particular, if $P(A) = 0$ or $1$, then $P(A \mid T) = 0$ or $1$ (which should be obvious anyway).
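To see how strongly the answer depends on the prior, here is a small sketch of the Bayes computation. The function `posterior_prime` and its parameter `false_pass` are my own illustrative names, and the code assumes the worst case in which a composite passes each test with probability exactly `false_pass` (so $P(T \mid A^c) = 2^{-k}$ when `false_pass` is $1/2$), rather than merely at most that.

```python
def posterior_prime(prior, k, false_pass=0.5):
    """P(A | T_k) by Bayes' rule.

    Assumes P(T_k | A) = 1 and, as a worst case,
    P(T_k | A^c) = false_pass ** k exactly.
    """
    p_pass_if_composite = false_pass ** k
    return prior / (prior + (1 - prior) * p_pass_if_composite)

k = 10
for prior in (0.0, 0.001, 0.5, 0.999, 1.0):
    print(f"prior={prior}: posterior after {k} passes = {posterior_prime(prior, k):.6f}")
print("1 - 2^-k =", 1 - 2 ** -k)
```

Only the 50-50 prior lands close to $1 - 2^{-k}$; a small prior gives a noticeably smaller posterior for the same number of passed tests.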

This also means something more practical. Suppose you pick a random $1000$-digit integer (so $P(A)$ is tiny) and run a single test, and let's say the test is so poor that $P(T \mid A^c) = 1/2$ in this range of integers. Then even if you pass the test, you absolutely should not conclude there is a $1/2$ chance you picked a prime, because it is much more probable that you picked a composite and just got a deceptive test result. (This is very similar to a person getting a positive disease test result when the disease is known to affect only a tiny fraction of the population: the result is probably a false positive.)

So the question is: what is $P(A)$? There is no good answer to that, because there is no (uniform) distribution over all integers. (And if you use density, then the density of primes is $0$.) That's why (IMHO) the textbook authors use the word "chance" and probably (pun intended) meant "likelihood".