Number theory in symmetric cryptography

The non-linearity in the block cipher AES comes from the pseudo-inversion function on the finite field $\mathbb{F}_{2^8}$, defined by

$$ p(x) = \begin{cases} x^{-1} & \text{if $x \not=0$} \\ 0 & \text{if $x=0$.}\end{cases} $$

It is a nice exercise to show that $p$ is as strong as possible against differential cryptanalysis. That is, given any non-zero $\Delta \in \mathbb{F}_{2^8}$, the function $Dp_\Delta : \mathbb{F}_{2^8} \rightarrow \mathbb{F}_{2^8}$ defined by

$$Dp_\Delta(x) = p(x) + p(x + \Delta) $$

takes $2^7-1$ different values, and is $2$ to $1$, except for an exceptional set of size $4$, namely $\{0,\Delta,\beta\Delta,(1+\beta)\Delta\}$ where $\beta$ is a solution to $\beta^2+\beta+1 = 0$, all of whose elements are sent to $\Delta^{-1}$. Not especially deep, but it's a nice application of the theory of quadratic equations in fields of characteristic two, so arguably number-theoretic. (Anyway I like it, because I discovered it for myself when asked to lecture undergraduate cryptography.)
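
The claim is small enough to check exhaustively. Here is a minimal sketch, assuming the usual AES representation of $\mathbb{F}_{2^8}$ modulo $x^8+x^4+x^3+x+1$ (the statement itself does not depend on the representation); the helper names are my own.

```python
from collections import Counter

AES_POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1 (assumed representation of F_{2^8})

def gf_mul(a, b):
    """Multiply two elements of F_{2^8}, reducing modulo AES_POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= AES_POLY
        b >>= 1
    return result

def gf_pow(a, e):
    """Square-and-multiply exponentiation in F_{2^8}."""
    result = 1
    while e:
        if e & 1:
            result = gf_mul(result, a)
        a = gf_mul(a, a)
        e >>= 1
    return result

# x^254 equals x^{-1} for x != 0 and 0 for x = 0, so this table is p.
INV = [gf_pow(x, 254) for x in range(256)]

def p(x):
    """Pseudo-inversion on F_{2^8}."""
    return INV[x]

def derivative_profile(delta):
    """Map each value taken by Dp_delta to its number of preimages."""
    return Counter(p(x) ^ p(x ^ delta) for x in range(256))

def exceptional_set(delta):
    """All x that Dp_delta sends to delta^{-1}."""
    target = p(delta)
    return sorted(x for x in range(256) if p(x) ^ p(x ^ delta) == target)

def check_all_deltas():
    """True if every non-zero delta gives 127 values: 126 hit twice, one hit 4 times."""
    for delta in range(1, 256):
        profile = derivative_profile(delta)
        if sorted(profile.values()) != [2] * 126 + [4]:
            return False
        if profile[p(delta)] != 4:
            return False
    return True
```

Calling `check_all_deltas()` runs over all 255 non-zero differences, and `exceptional_set(delta)` lists the four inputs sent to $\Delta^{-1}$.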

The linear cryptanalysis of AES, by approximating the AES functions with $\mathbb{F}_2$-linear maps suggested by the Discrete Fourier Transform, seems to be somewhat trickier: see for instance this paper by Kenichi Sakamura, Wang Xiao Dong and Hirofumi Ishikawa.


Here are a few interesting examples of symmetric primitives whose claimed security is/was based on number-theoretic problems:

  1. From the 1980s: the famous Blum-Blum-Shub deterministic random bit generator is a classic example. Let $N = pq$ be the product of two large safe primes, and consider the sequence defined by $x_{i+1} = x_i^2 \pmod{N}$, where $x_0$ is the random seed (which can be any value in $(\mathbb{Z}/N\mathbb{Z})^\times\setminus\{1\}$). After each squaring, you extract some of the bits of $x_i$ to form the pseudorandom stream. The security of the bit generator - that is, the indistinguishability from a uniform random stream - can be related to number-theoretic problems. The idea is that if you only take the least significant bit of $x_i$ (or up to $O(\log\log N)$ bits) at each iteration, then breaking this generator is at least as hard as solving the Quadratic Residuosity Problem modulo $N$.

  2. A second classic example (this time from the 1990s): the KN cipher (Knudsen-Nyberg) was a number-theoretic block cipher designed specifically to resist differential cryptanalysis. The cipher was applied to 64-bit blocks, and the round function was defined as follows: choose a basis of $\mathbb{F}_{2^{37}}$ where the operation $x \mapsto x^3$ is particularly efficient. Let $E: \mathbb{F}_{2}^{32} \to \mathbb{F}_{2^{37}}$ be some affine map, and let $F: \mathbb{F}_{2^{37}} \to \mathbb{F}_{2}^{32}$ be the map defined by cubing in $\mathbb{F}_{2^{37}}$, followed by throwing away five coefficients of the polynomial representation (w.r.t. the "nice cubing" basis). Now, dividing the 64-bit cipher state into two 32-bit values $L$ and $R$ in $\mathbb{F}_2^{32}$, the round function is $(L,R) \mapsto (R,L+F(E(R)+K))$, where $K \in \mathbb{F}_{2^{37}}$ is the secret key. The nonlinearity of the cubing permutation is important. The KN cipher was subsequently broken using higher-order differential cryptanalysis, but its ideas have proven influential: the more recent MiMC cipher, for example, revisits the KN cipher, targeting applications in multi-party computation and zero-knowledge proofs.

  3. An example from the 2000s using "deeper" results in number theory: the Charles-Goren-Lauter hash function. Here we consider the $2$-isogeny graph of supersingular $j$-invariants over a suitably large $\mathbb{F}_{p^2}$: this is an important example of a Ramanujan graph, and this is key to the construction. The bits of the message $(m_1,\ldots,m_n)$ drive a non-backtracking walk of length $n$ in the isogeny graph (which is $(2+1)$-regular, so at each step you have $2$ choices: "low" or "high" w.r.t. some ordering on $\mathbb{F}_{p^2}$, and you go "low" if $m_i = 0$ and "high" if $m_i = 1$). The final hash value is a projection of the ending point $j_n$ of your walk into $\mathbb{F}_p$. The security of the hash function reduces to problems connected with finding cycles in the isogeny graph, which are provably large.

  4. Edit (I forgot one of my favourites): Wegman-Carter authenticators, which give high-performance MACs (message authentication codes) with information-theoretic security. Here, take a finite field $\mathbb{F}_q$ with $q \ge 2^k$ and fix an inclusion $\iota: \{0,1\}^k \to \mathbb{F}_q$ (everything will operate on $k$-bit chunks of data) and a mapping $\pi: \mathbb{F}_{q} \to \{0,1\}^t$ (this will produce a $t$-bit MAC). For each $n > 0$, we can define a map $(\{0,1\}^k)^n \to \mathbb{F}_q[X]$ by $$M = (M_1,\ldots,M_n) \mapsto f_M(X) := \iota(M_n)X^n + \cdots + \iota(M_1)X.$$ Now to produce (and verify) an authenticator for a message $M$ given a shared secret $(R \in \{0,1\}^k, S \in \{0,1\}^t)$, we compute $T = \pi(f_M(\iota(R)))\oplus S$ (where $\oplus$ denotes XOR in $\{0,1\}^t$). A crucial part of the security argument depends on the distribution of evaluations of polynomials over finite fields (see e.g. Bernstein 2005 for an up-to-date description and analysis of this).
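
To make the Wegman-Carter recipe in the last item concrete, here is a toy sketch. The parameters are my own illustrative assumptions: the prime field with $q = 2^{130}-5$, $k = t = 128$, $\iota$ the obvious integer embedding, and $\pi$ reduction modulo $2^t$. I also assume the polynomial has zero constant term, so that $M_i$ multiplies $X^i$. This is not Poly1305 or GHASH, which differ in padding, key handling and how $S$ is combined.

```python
import secrets

Q = 2**130 - 5  # a prime, so the integers modulo Q form a field (illustrative choice)
K = 128         # chunk size in bits; we need 2**K <= Q
T = 128         # tag size in bits

def iota(chunk):
    """Embed a K-bit chunk (given as an int) into F_Q."""
    return chunk % Q

def pi(x):
    """Project a field element to a T-bit string (given as an int)."""
    return x % (1 << T)

def poly_eval(chunks, r):
    """Evaluate f_M(r) = sum over i = 1..n of iota(M_i) * r^i by Horner's rule."""
    acc = 0
    for chunk in reversed(chunks):
        acc = (acc + iota(chunk)) * r % Q
    return acc

def tag(chunks, r, s):
    """Authenticator T = pi(f_M(iota(R))) XOR S."""
    return pi(poly_eval(chunks, iota(r))) ^ s

def verify(chunks, r, s, t):
    """Recompute the tag and compare (real code should compare in constant time)."""
    return tag(chunks, r, s) == t

def keygen():
    """Sample a fresh shared secret (R, S)."""
    return secrets.randbits(K), secrets.randbits(T)
```

For example, `r, s = keygen()` followed by `t = tag([1, 2, 3], r, s)` authenticates a three-chunk message. The information-theoretic guarantee needs a fresh $S$ for every message; in practice $S$ is typically derived from a cipher and a nonce.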

In all four examples, number-theoretic arguments are used to give strong justifications for the security of the primitive. But the last example is important because it is also used in practice: the Wegman-Carter construction can be seen in GHASH, which is used in AES-GCM (in this case, $q$ is a power of $2$), and it is also the basis of Poly1305, a high-speed software authenticator. AES-GCM and ChaCha20-Poly1305 are two state-of-the-art algorithms for Authenticated Encryption that are widely used on the internet today.


The book Stream Ciphers and Number Theory by Cusick, Ding and Renvall is devoted to this topic, stream ciphers being one kind of symmetric cipher. I give some examples from there that are not that well known.

One security measure for a keystream output by a stream cipher is its linear complexity, i.e., the order of the shortest linear recurrence that it satisfies. This is usually computed by applying the Berlekamp-Massey algorithm to the output, and it must be high relative to the period of the sequence, since Berlekamp-Massey is an efficient algorithm that recovers the recurrence from only twice as many output symbols as the linear complexity.
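
For concreteness, here is a minimal sketch of the Berlekamp-Massey algorithm over $GF(2)$, returning the linear complexity of a finite bit string together with a connection polynomial. The function name and the list-of-bits interface are my own choices.

```python
def berlekamp_massey_gf2(bits):
    """Return (L, c) for a list of 0/1 values.

    L is the linear complexity and c = [c_0, ..., c_L] with c_0 = 1 is a
    connection polynomial: s_i = c_1 s_{i-1} + ... + c_L s_{i-L} over GF(2)
    for all L <= i < len(bits).
    """
    n = len(bits)
    c = [0] * (n + 1)  # current connection polynomial
    b = [0] * (n + 1)  # connection polynomial before the last length change
    c[0] = b[0] = 1
    L, m = 0, -1       # current complexity, index of the last length change
    for i in range(n):
        # Discrepancy between the predicted and the actual bit at position i.
        d = bits[i]
        for j in range(1, L + 1):
            d ^= c[j] & bits[i - j]
        if d:
            previous = c[:]
            shift = i - m
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]
            if 2 * L <= i:
                L = i + 1 - L
                m = i
                b = previous
    return L, c[:L + 1]
```

For instance, two periods of the period-7 sequence `1001011` give complexity 3, with the recurrence $s_i = s_{i-2} + s_{i-3}$.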

The sphere complexity of a sequence is a generalization: it is the minimal linear complexity that can be reached if an adversary may flip up to $k$ bits of the sequence.

A basic result that is used in this text is the following.

Let $N=p_1^{e_1}\cdots p_t^{e_t},$ where the $p_i$ are $t$ pairwise distinct primes, and let $q$ be a prime power such that $\gcd(q,N)=1.$ Then for each nonconstant sequence $s$ of period $N$ over $GF(q)$, $$ L(s)\geq \min\{\operatorname{ord}_{p_1}(q),\ldots,\operatorname{ord}_{p_t}(q)\} $$ and $$ SC_k(s)\geq \min \{\operatorname{ord}_{p_1}(q),\ldots,\operatorname{ord}_{p_t}(q)\}, $$ if $k<\min\{WH(s),N-WH(s)\}.$ Here $WH(s)$ is the Hamming weight of the sequence $s$, $L(s)$ is its linear complexity, $SC_k(s)$ is its sphere complexity under $k$ bitflips, and $\operatorname{ord}_{p}(q)$ denotes the multiplicative order of $q$ modulo $p$.
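
The first inequality can be checked by brute force for small odd $N$ and $q = 2$. The sketch below is only an illustration under my own assumptions: it uses the standard identity $L(s) = N - \deg\gcd(x^N - 1, s_0 + s_1 x + \cdots + s_{N-1}x^{N-1})$ for a sequence of period $N$, stores polynomials over $GF(2)$ as Python integers (bit $i$ is the coefficient of $x^i$), and does not touch the sphere complexity bound.

```python
def poly_mod(a, b):
    """Remainder of a modulo b in GF(2)[x]; b must be non-zero."""
    db = b.bit_length()
    while a.bit_length() >= db:
        a ^= b << (a.bit_length() - db)
    return a

def poly_gcd(a, b):
    """Greatest common divisor in GF(2)[x] by the Euclidean algorithm."""
    while b:
        a, b = b, poly_mod(a, b)
    return a

def linear_complexity_periodic(period_bits):
    """Linear complexity of the infinite binary sequence repeating period_bits."""
    n = len(period_bits)
    s = sum(bit << i for i, bit in enumerate(period_bits))
    g = poly_gcd((1 << n) | 1, s)  # x^n - 1 equals x^n + 1 over GF(2)
    return n - (g.bit_length() - 1)

def prime_factors(n):
    """Distinct prime factors of n by trial division."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def mult_order(q, p):
    """Multiplicative order of q modulo p, assuming gcd(q, p) = 1."""
    order, power = 1, q % p
    while power != 1:
        power = power * q % p
        order += 1
    return order

def order_bound(n, q=2):
    """The lower bound min over primes p dividing n of ord_p(q)."""
    return min(mult_order(q, p) for p in prime_factors(n))

def min_linear_complexity(n):
    """Smallest L(s) over all nonconstant binary sequences of period n."""
    best = n
    for value in range(1, (1 << n) - 1):  # skips all-zeros and all-ones
        bits = [(value >> i) & 1 for i in range(n)]
        best = min(best, linear_complexity_periodic(bits))
    return best
```

For $N = 7$ the bound is $\operatorname{ord}_7(2) = 3$ and it is attained, for example by the sequence `1001011`.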

Also note that one can define a power generator in $\mathbb{Z}_{N}$, with $N = pq$, by choosing an initial value $a_0 \in \mathbb{Z}_{N}$ and letting $a_{t+1} = a_t^d \pmod N.$ For $d=2,$ this is the Blum-Blum-Shub generator, which has some nice security properties if $p$ and $q$ are both congruent to $3$ modulo $4$, though it is a bit slow to be used directly as a keystream in modern symmetric cryptography. One can prove that if we output only the least significant $k$ bits of each $a_t$, with $k\leq \log\log N,$ then breaking this keystream (determining the initial loading) is as hard as factoring $N.$
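
Here is a toy sketch of the squaring generator ($d = 2$). The function name, the interface and the tiny primes in the example are my own and only illustrate the iteration; real parameters would be large secret primes and a random seed.

```python
from math import gcd

def bbs_bits(p, q, seed, steps, bits_per_step=1):
    """Blum-Blum-Shub style generator: square modulo N = p*q, keep the low bits.

    Returns a list of bits; within each step the least significant bit of the
    state comes first.
    """
    if p % 4 != 3 or q % 4 != 3:
        raise ValueError("p and q should both be congruent to 3 modulo 4")
    n = p * q
    if seed in (0, 1) or gcd(seed, n) != 1:
        raise ValueError("seed must be a unit modulo N other than 1")
    out = []
    x = seed % n
    for _ in range(steps):
        x = x * x % n  # a_{t+1} = a_t^2 mod N
        for i in range(bits_per_step):
            out.append((x >> i) & 1)
    return out
```

With the toy values `p = 11`, `q = 23`, `seed = 3`, the states are 9, 81, 236, 36, 31, 202, so `bbs_bits(11, 23, 3, 6)` returns `[1, 1, 0, 0, 1, 0]`.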

The classical theory of binary linear feedback shift register sequences and their nonlinear filterings, as pioneered by Golomb in his book Shift Register Sequences and extended further since, is another example. However, in my opinion it is not explicitly or deeply number-theoretic in nature.