How to think about theories that prove their own inconsistency?

When we think about theories like ZFC or PA, we often view them foundationally: in particular, we often suppose that they are true. Truth is very strong. Although it's difficult to say exactly what it means for ZFC to be "true" (on the face of it we have to commit to the actual existence of a universe of sets!), some consequences of being true are easy to figure out: true things are consistent, and - since their consistency is true - don't prove that they are inconsistent.

However, this makes things like PA + $\neg$Con(PA) seem mysterious. So how are we to understand these?

The key is to remember that - assuming we work in some appropriate meta-theory - a theory is to be thought of as its class of models. A theory is consistent iff it has a model. So when we say PA + $\neg$Con(PA) is consistent, what we mean is that there are ordered semirings (= models of PA without induction) with some very strong properties.

One of these strong properties is the induction scheme, which can be rephrased model-theoretically as saying that these ordered semirings have no definable proper cuts.
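
To spell that rephrasing out (a sketch; the symbols $M$, $I$ and $\varphi$ are notation chosen here, not taken from the text above): a cut in such an ordered semiring $M$ is a subset $I \subseteq M$ that contains $0$, is closed downwards, and is closed under successor,

$$0 \in I, \qquad (x \in I \wedge y < x) \rightarrow y \in I, \qquad x \in I \rightarrow x+1 \in I,$$

and it is proper if $I \neq M$. The induction axiom for a formula $\varphi$ (possibly with parameters from $M$) reads

$$\big(\varphi(0) \wedge \forall x\,(\varphi(x) \rightarrow \varphi(x+1))\big) \rightarrow \forall x\,\varphi(x).$$

If $\varphi$ satisfies the hypothesis, then $\{x \in M : \forall y \le x\ \varphi(y)\}$ is a definable cut (using that the order is discrete, as it is in any model of PA without induction), so having no definable proper cuts forces $\forall x\,\varphi(x)$. Conversely, a definable cut satisfies the hypothesis for its own defining formula, so induction makes it all of $M$.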

It's very useful down the road to get a good feel for nonstandard models of PA as structures in their own right, as opposed to "incorrect" interpretations of the theory; Kaye's book is a very good source here.

The other is that they satisfy $\neg$Con(PA). This one seems mysterious since we think of $\neg$Con(PA) as asserting a fact on the meta-level. However, remember that the whole point of Goedel's incompleteness theorem in this context is that we can write down a sentence in the language of arithmetic which we externally prove is true iff PA is inconsistent. Post-Goedel, the MRDP theorem showed that we may take this sentence to be of the form "$\mathcal{E}$ has a solution" where $\mathcal{E}$ is a specific Diophantine equation. So $\neg$Con(PA) just means that a certain algebraic behavior occurs.

So models of PA+$\neg$Con(PA) are just ordered semirings with some interesting properties - they have no proper definable cuts, and they have solutions to some Diophantine equations which don't have solutions in $\mathbb{N}$. This demystifies them a lot!


So now let's return to the meaning of the arithmetic sentence we call "$\neg$Con(PA)." In the metatheory, we have some object we call "$\mathbb{N}$" and we prove:

If $T$ is a recursively axiomatizable theory, then $T$ is consistent iff $\mathbb{N}\models$ "$\mathcal{E}_T$ has no solutions."

(Here $\mathcal{E}_T$ is the analogue of $\mathcal{E}$ for $T$; remember that by the MRDP theorem, we're expressing "Con(T)" as "$\mathcal{E}_T$ has no solutions" for simplicity.) Note that this claim is specific to $\mathbb{N}$: other ordered semirings, even nice ones, need not work in place of $\mathbb{N}$. In particular, there will be lots of ordered semirings which our metatheory proves satisfy PA, but for which the claim analogous to the one above fails.
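
To make the shape of "this equation has a solution in $\mathbb{N}$" concrete, here is a minimal sketch of a bounded brute-force search. The two polynomials are hypothetical toy stand-ins chosen for illustration; the actual $\mathcal{E}_T$ produced by the MRDP theorem is vastly larger, and the function names are my own.

```python
from itertools import product


def find_solution(poly, arity, bound):
    """Return the first tuple in {0, ..., bound-1}^arity with poly(*tuple) == 0, else None."""
    for candidate in product(range(bound), repeat=arity):
        if poly(*candidate) == 0:
            return candidate
    return None


def pell(x, y):
    # Toy equation x^2 - 2y^2 - 1 = 0, which does have solutions in N.
    return x * x - 2 * y * y - 1


def no_root(x, y):
    # Toy equation x^2 = 2(y+1)^2, which has no solutions in N
    # because the square root of 2 is irrational.
    return x * x - 2 * (y + 1) ** 2


print(find_solution(pell, 2, 50))     # a witness is found
print(find_solution(no_root, 2, 50))  # no witness below the bound
```

A witness in $\mathbb{N}$ is always found by some finite search, whereas the absence of one is never confirmed by any single bound. A solution living in a nonstandard model is not reachable by this kind of search at all, which is why its existence says nothing about $\mathbb{N}$.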

It's worth thinking of an analogous situation in non-foundationally-flavored mathematics. Take a topological space $T$, and let $\pi_1(T)$ and $H_1(T)$ be the fundamental group and the first homology group (with coefficients in $\mathbb{Z}$, say) respectively. Don't pay too much attention to what these are; the point is just that they're both groups coding the behavior of $T$ which are closely related in many ways. I'm thinking of $\pi_1(T)$ as the analogue of $\mathbb{N}$ and $H_1(T)$ as the analogue of a nonstandard model satisfying $\neg$Con(PA), respectively.

Now, the statement "$\pi_1(T)$ is abelian" (here, my analogue of $\neg$Con(PA)) tells us a lot about $T$ (take my word for it). But the statement "$H_1(T)$ is abelian" does not tell us the same things (actually it tells us nothing: $H_1(T)$ is always abelian :P).

We have a group $G$, and some other group $H$ similar to $G$ in lots of ways, and a property $p$; and if $G$ has $p$, we learn something, but if $H$ has $p$ we don't learn that thing. This is exactly what's going on here. It's not the property by itself that carries any meaning, it's the statement that the property holds of a specific object that carries meaning useful to us. We often conflate these two, since there's a clear notion of "truth" for arithmetic sentences, but thinking about it in these terms should demystify theories like PA+$\neg$Con(PA) a bit.


If I understand your problem correctly, the key to solving it is to think carefully about the concept of encoding.

For simplicity allow me to consider the case where $T'$ is PA (Peano Arithmetic).

The internalization of the syntactic properties of PA in itself uses an encoding, which is roughly a mapping that associates constant terms (their encodings) to formulas and proofs, and formulas in the language of PA to meta-theoretical properties ("$x$ is a proof of $y$ in PA", "$x$ is provable in PA", etc.), in such a way that the following holds:

if $RS$ is a syntactic (meta-theoretic) property and $O_1,\dots,O_n$ are syntactic objects (formulas or proofs) then $RS(O_1,\dots,O_n)$ holds if and only if $PA \vdash Enc(RS)(Enc(O_1),\dots,Enc(O_n))$, where $Enc$ is the mapping that associates to syntactic objects their encodings in $PA$'s language.

The important thing to keep in mind is that this encoding-condition is required to hold only for encodings.
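
As a rough illustration of what the syntactic-object part of such a mapping can look like, here is a minimal sketch assuming a toy alphabet and a simple positional numbering. The alphabet and the names `encode` and `decode` are hypothetical; this is not the specific encoding used for PA.

```python
# Hypothetical toy alphabet for arithmetic formulas.
ALPHABET = "0S+*=()~&>Ax'"
BASE = len(ALPHABET) + 1  # digits 1..len(ALPHABET); digit 0 is never used


def encode(formula: str) -> int:
    """Read the string as a numeral in base BASE whose digits are symbol positions."""
    code = 0
    for symbol in formula:
        code = code * BASE + ALPHABET.index(symbol) + 1
    return code


def decode(code: int) -> str:
    """Recover the string from its code; fail if the number codes no string."""
    symbols = []
    while code > 0:
        code, digit = divmod(code, BASE)
        if digit == 0:
            raise ValueError("not the code of any string")
        symbols.append(ALPHABET[digit - 1])
    return "".join(reversed(symbols))


print(decode(encode("S0+0=S0")))
```

Decoding works here because every code is an actual natural number with finitely many digits. An arbitrary element of an arbitrary model of PA has no such guarantee, so there may be no actual finite string to recover from it.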

Now let us consider the theory $T=PA+\neg Enc(Con(PA))$ in the language of arithmetic.

Clearly $T \vdash \neg Enc(Con(PA))$, but what does this mean? By soundness and completeness, this is equivalent to saying that every arithmetic structure $M$ which is a model of $T$ satisfies $M \models \neg Enc(Con(PA))$. We have that $$Enc(Con(PA))\equiv \neg \exists x\ Enc(\text{*is a proof of*})(x,Enc(\bot))$$ hence $$\neg Enc(Con(PA)) \equiv \exists x\ Enc(\text{*is a proof of*})(x,Enc(\bot))$$ so in each model $M$ of $T$ there is an element $m \in M$ such that $$M \models Enc(\text{*is a proof of*})(m,Enc(\bot)).$$ The problem is that this $m$ is not an encoding; it is not even required to be the interpretation of a constant term. Hence there is no way to decode it into a proof (in PA) of $\bot$.

The point is that the formula $Enc(\text{*is a proof of*})$ defines a relation in each arithmetic structure, but it has its intended meaning only when applied to encodings: $Enc(\text{*is a proof of*})(m,n)$ expresses that $m$ is the encoding of a proof of the formula encoded by $n$ only when $m$ and $n$ are encodings.

The argument shown here should be easy to adapt to other kinds of theories, such as the ones you described.

I hope this helps.