Representing mathematical statements as SAT instances

The closest thing to what you're looking for may be the work of Adam Yedidia and Scott Aaronson, who have, for example, constructed an explicit 5372-state Turing machine that halts if and only if the Riemann Hypothesis is false.

This doesn't quite satisfy your requirements because Yedidia and Aaronson were interested in the halting problem rather than SAT, and it's not completely trivial to convert what they've done into the SAT-generator you're asking about. In their preliminary discussion of the problem of constructing a Turing machine whose halting behavior is unprovable in ZFC (assuming ZFC is consistent), they make the following remarks.

One way to build this machine would be to start with the axioms of ZFC and apply the inference rules of first-order logic repeatedly in each possible way so as to enumerate every statement ZFC could prove, and to halt if ever a contradiction was found. While this method is conceptually simple, to actually construct such a machine would lead to a huge number of states, because it would require writing a program to manipulate the axioms of ZFC and the inference rules of first-order logic, and then compiling that program all the way down to Turing machine states.

Though they don't state so explicitly in their paper, they asked around and found that nobody seemed to have actually implemented such a machine. For example, the Mizar proof assistant does implement the axioms of set theory and the inference rules of first-order logic, but it is set up to streamline the process of formalizing proofs, not to produce Turing machines or SAT instances of the type you're interested in. I think it would not be too difficult for someone proficient in Mizar to produce the code you're looking for, but your question was not whether this could be done in principle, but whether it had already been done, and I believe that the answer is no—not because it is difficult, but because nobody so far has been motivated to do so.

Another natural place to look is the SAT competitions, but nothing like what you're asking for appears among the benchmarks.


Improving on Brumleve's answer, I have a method for encoding a length-$n$ proof with a quasilinear $\tilde {O} (n)$-bit 3SAT instance.

In most formal systems, proofs and the objects that appear in proofs (propositions, formulas, etc.) have a tree-like structure, in that each such object can be built from other such objects in an enumerated list of ways. For example, the proposition $P \to Q$ is built out of the two propositions $P$ and $Q$, and a proof of $Q$ can be made with modus ponens out of a proof of $P \to Q$ and a proof of $P$. Such a proof-tree can be encoded with a pointer architecture, as some sequence $x_0, \dots, x_{n-1}$ where each $x_i$ encodes a formula or a proposition or a proof of a proposition, possibly including references to earlier objects $x_j$ with $j < i$. For example, if $x_k$ encodes the proposition $P \to Q$, which I will denote with the notation $k \mapsto (P \to Q)$, then $x_k$ might have information of the form $(\to, i, j)$ where $i \mapsto P$ and $j \mapsto Q$. Similarly, writing $k \vdash Q$ when $x_k$ encodes a proof of $Q$, a proof $k \vdash Q$ by modus ponens might be given in the form $(\mathtt {ModusPonens}, i_0, i_1, i_2, i_3, i_4)$ where $i_0 \mapsto P$, $i_1 \mapsto Q$, $i_2 \mapsto (P \to Q)$, $i_3 \vdash P \to Q$, and $i_4 \vdash P$.

In addition, if $x_j$ includes references $i_0, \dots, i_{k-1}$ to earlier objects, define $y_j = (x_{i_0}, \dots, x_{i_{k-1}})$ as a description of all the objects referenced by $x_j$.

To check whether the objects $(x_0, \dots, x_{n-1})$ with the supplementary information $(y_0, \dots, y_{n-1})$ encode a proof of the Riemann Hypothesis, you must check the following conditions:

  • (Local validity) Each $x_i$ represents a valid construction of a proof object out of earlier proof objects. For example, if $x_i = (\to, j, k)$ then it is necessary to check that $x_j$ and $x_k$ encode propositions and not some other kind of object. For steps such as modus ponens, where it is necessary to check the syntactic equality of two subexpressions, check for pointer equality of the corresponding objects instead. This change does not affect which statements we can prove in a given number of steps, since it is always possible to deduplicate so that each syntactic expression appears at most once in the list $(x_0, \dots, x_{n-1})$, and this can only shorten the proof. If $x_i$ is constructed with enough references, then checking the local validity of $x_i$ only requires $O (\log n)$ bit-operations on the inputs $(x_i, y_i)$, for a total of $O (n \log n)$.

  • (Acyclicity) If $x_j$ contains references $i_0, \dots, i_{k-1}$ then we have $i_0, \dots, i_{k-1} < j$.

  • (Right conclusion) $x_{n-1}$ encodes a proof of the Riemann Hypothesis. One way to check this is to fix $x_0, \dots, x_{k-1}$ to hardwired constant values for some $k = O (1)$ so that $x_{k-1}$ formulates a statement of the Riemann Hypothesis, and ask that $x_{n-1}$ encodes a proof of $x_{k-1}$.

  • (Correctness of $(y_j)$) The values $y_k$ correctly dereference the references given in $x_k$. The rest of my answer explains how to check this.
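As a rough illustration of the pointer encoding and the first three checks, here is a minimal Python sketch for a toy system with only atoms, implications, and modus ponens. The names `Node`, `is_prop`, `proves`, `locally_valid`, `check_proof`, and the `"hyp"` tag (a stand-in for an axiom asserting a proposition) are hypothetical and not part of the construction above. The sketch dereferences pointers directly by indexing into the list, whereas in the SAT instance the dereferenced values would be supplied by the $y_i$.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Node:
    tag: str                     # "atom", "imp", "hyp" or "mp" (toy tags, assumed here)
    refs: Tuple[int, ...] = ()   # pointers to earlier entries of the list
    name: str = ""               # label, used by atoms only


def is_prop(xs: Sequence[Node], i: int) -> bool:
    return xs[i].tag in ("atom", "imp")


def proves(xs: Sequence[Node], i: int) -> Optional[int]:
    """Index of the proposition that xs[i] proves, or None if xs[i] is not a proof."""
    node = xs[i]
    if node.tag == "hyp" and len(node.refs) == 1:   # (hyp, p) asserts p
        return node.refs[0]
    if node.tag == "mp" and len(node.refs) == 5:    # (mp, p, q, pq, proof_pq, proof_p) proves q
        return node.refs[1]
    return None


def locally_valid(xs: Sequence[Node], i: int) -> bool:
    node = xs[i]
    if not all(0 <= r < i for r in node.refs):      # every reference points backwards
        return False
    if node.tag == "atom":
        return not node.refs
    if node.tag == "imp":
        return len(node.refs) == 2 and all(is_prop(xs, r) for r in node.refs)
    if node.tag == "hyp":
        return len(node.refs) == 1 and is_prop(xs, node.refs[0])
    if node.tag == "mp":
        if len(node.refs) != 5:
            return False
        p, q, pq, proof_pq, proof_p = node.refs
        return (xs[pq].tag == "imp" and xs[pq].refs == (p, q)
                and proves(xs, proof_pq) == pq      # pointer equality, not syntactic equality
                and proves(xs, proof_p) == p)
    return False


def check_proof(xs: Sequence[Node], goal: int) -> bool:
    """Local validity of every entry, plus: the last entry proves xs[goal]."""
    return (len(xs) > 0
            and all(locally_valid(xs, i) for i in range(len(xs)))
            and proves(xs, len(xs) - 1) == goal)


# Toy usage: from P -> Q and P, derive Q.
example = [
    Node("atom", name="P"),          # 0 |-> P
    Node("atom", name="Q"),          # 1 |-> Q
    Node("imp", (0, 1)),             # 2 |-> (P -> Q)
    Node("hyp", (2,)),               # 3 |- P -> Q
    Node("hyp", (0,)),               # 4 |- P
    Node("mp", (0, 1, 2, 3, 4)),     # 5 |- Q
]
```

Each call to `locally_valid` looks only at one entry and the entries it points to, which is what makes the per-entry cost small once the referenced values are available.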

We've reduced the problem of checking proof validity to the following reference-checking problem: Given an array $x_0, \dots, x_{n-1}$ of $\ell$-bit values and a list $(k_0, y_0), (k_1, y_1), \dots, (k_{m-1}, y_{m-1})$ where each $k_j$ has $\lceil \log_2 n \rceil$ bits and each $y_j$ has $\ell$ bits, check that $x_{k_j} = y_j$ for all $j < m$. I claim that this can be done with $\tilde {O} ((n + m) \ell)$ additional variables and constraints.

First of all, we may append $(0, x_0), (1, x_1), \dots, (n-1, x_{n-1})$ to the list $((k_j, y_j))$. Therefore we may assume without loss of generality that $m \geq n$ and $(k_i, y_i) = (i, x_i)$ for $i < n$. Moreover, it is sufficient to check that $((k_i, y_i))$ is self-consistent: that if $k_i = k_j$ then $y_i = y_j$. This is checked by sorting $((k_i, y_i))$: let $(\tilde {k}_0, \tilde {y}_0), \dots, (\tilde {k}_{m-1}, \tilde {y}_{m-1})$ be a permutation of $(k_0, y_0), \dots, (k_{m-1}, y_{m-1})$ with $\tilde {k}_0 \leq \tilde {k}_1 \leq \dots \leq \tilde {k}_{m-1}$ (this is checkable in $\tilde {O} (m \ell)$ variables and constraints). Then it suffices to check that if $\tilde {k}_{i+1} = \tilde {k}_i$ then $\tilde {y}_{i+1} = \tilde {y}_i$. This requires $O (m (\ell + \log n))$ bits and constraints.
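The sorting argument can be sketched in Python as follows. This is only an illustration of the logic, not a SAT encoding: the function name `references_consistent` is hypothetical, Python's built-in sort stands in for the sorted permutation that the instance would have to guess and verify, and the explicit range check on keys is an extra guard added for the sketch.

```python
from typing import List, Sequence, Tuple


def references_consistent(xs: Sequence[int], lookups: Sequence[Tuple[int, int]]) -> bool:
    """Check that every claim (k, y) in `lookups` satisfies xs[k] == y.

    Done the way the text describes: append the ground-truth pairs (i, xs[i]),
    sort by key, and compare only neighbouring pairs.
    """
    n = len(xs)
    # Guard added for this sketch: keys must name an actual array position.
    if any(not (0 <= k < n) for k, _ in lookups):
        return False

    pairs: List[Tuple[int, int]] = [(i, x) for i, x in enumerate(xs)]
    pairs.extend(lookups)
    pairs.sort(key=lambda p: p[0])   # sort by key only

    # Equal keys are now adjacent, so checking neighbours is enough:
    # equality of values is transitive along each run of equal keys.
    return all(k0 != k1 or y0 == y1
               for (k0, y0), (k1, y1) in zip(pairs, pairs[1:]))
```

Because each key $i < n$ appears at least once with the true value $x_i$, any claimed pair with a wrong value ends up in the same run of equal keys as a correct one, and some adjacent comparison in that run fails.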


I'll try to answer part 2 with a reasonable exponent.

I claim that a proof can be verified in $\tilde{O}(n^2)$ time on a multi-tape Turing machine: define a proof to consist of a list of statements such that each statement is an axiom or a tautology recognizable in essentially quadratic time (including instances of all the usual rules of inference expressed as implications), a conjunction $A \land B$ of two earlier statements $A$ and $B$, or the conclusion $B$ of an earlier instance of modus ponens $A \land (A \rightarrow B)$. I believe the adequacy of this definition follows from the completeness theorem, but I've left open exactly what constitutes a recognizable tautology or axiom schema instance and how to recognize it. This certainly depends on the theory, since simply being effectively axiomatized doesn't mean we can recognize the axioms quickly enough, but I believe this can be accomplished for theories like $\text{PA}$ and $\text{ZFC}$ on a multi-tape Turing machine in essentially quadratic time.

Without going into details, the verification proceeds one statement at a time from beginning to end, scanning backwards to find the earlier statements, which should make for an essentially quadratic time algorithm. Translated to a single-tape Turing machine, this algorithm runs in time $\tilde{O}(n^4)$. Now we can rephrase the search for a proof as a bounded halting problem on a non-deterministic single-tape Turing machine, yielding an $\tilde{O}(n^8)$-bit instance of $\text{SAT}$ according to the linked paper.
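Spelling out the exponent bookkeeping behind these bounds, as a sketch: it assumes the standard quadratic overhead for simulating a multi-tape machine on a single tape, and it assumes that the reduction in the linked paper produces an instance whose size is quadratic in the running time (up to logarithmic factors), which is what the stated $\tilde{O}(n^8)$ bound suggests. The symbols $T_{\text{multi}}$, $T_{\text{single}}$ and $|\varphi|$ are notation introduced here for the two running times and the size of the resulting instance.

$$T_{\text{multi}}(n) = \tilde{O}(n^2) \;\Longrightarrow\; T_{\text{single}}(n) = O\big(T_{\text{multi}}(n)^2\big) = \tilde{O}(n^4) \;\Longrightarrow\; |\varphi| = \tilde{O}\big(T_{\text{single}}(n)^2\big) = \tilde{O}(n^8).$$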

There is a quadratic lower bound on the time required to verify a proof on a single-tape Turing machine, which can be proved by a crossing-sequence argument applied to the problem of recognizing any formula that substitutes the same formula more than once, for example $A \lor \neg A$. So we won't be able to get better than $O(n^4)$ bits using the same reduction to $\text{BHNTM}$. It's unclear to me whether or not it's possible to achieve essentially quadratic verification with one tape, but with multiple tapes it seems straightforward enough, and that suffices for the $\tilde{O}(n^8)$-bit upper bound.

Assuming $\text{P} \ne \text{NP}$, no polynomial-time reduction to an $o(n)$-bit $\text{SAT}$ instance is possible. That's because there's a polynomial-time reduction in the other direction, from $\text{SAT}$ instances to $O(n)$-bit proof searches. If a reduction to $o(n)$ bits existed, we could make a sufficiently large $\text{SAT}$ instance at least one bit smaller in polynomial time by transforming it into a proof search and back again, and repeating this would decide $\text{SAT}$ in polynomial time. Likewise, $\text{ETH}$ implies there's no $2^{o(n)}$-time reduction to an $o(n)$-bit $\text{SAT}$ instance.

For part 3, I know that RH has been reduced to a $\Pi_1$ sentence, and the instances of this sentence have essentially linear proofs, so a refutation won't be much longer than a counterexample and we can use the above method. Of course, unlike in the discrepancy case, this is not at all a practical approach, unless the refutation turns out to be much shorter than the counterexample.