Is there a mistake in the SEP article about Gödel's incompleteness theorems?

You are correct. The Stanford article is hopelessly confused and contains major errors.

The principle of Gödel numbering is that one can have a relationship between numbers that corresponds to a relationship between symbol strings of a formal system, provided that the numbers are Gödel numbers of the given symbol strings. Similarly, one can have a function that corresponds to an operation on symbol strings of the formal system.

For the substitution of the free variable of a symbol string X by a symbol string Y, there is a corresponding number-theoretic function of the variables x and y, where x = ⌈X⌉ and y = ⌈Y⌉, provided X is a valid formula of the system with one free variable. This function can also be expressed in the formal system (provided certain conditions are met).

But the crucial point that the Stanford article misses is that Y can be any string of symbols of the formal system.

If you examine Gödel's own proof, the number-theoretic relation that he uses, and which corresponds to the substitution of a free variable of a formula of the formal system, is Sb(x, v, y) (Gödel's relation/function 31). If v refers to a free variable, x is the Gödel number of a formula X, and y is the Gödel number of some symbol string Y of the formal system, then Sb gives the Gödel number of the symbol string that results from substituting Y for the free variable of X. There are online step-by-step guides to Gödel's proof that explain this in detail.
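As a rough illustration of this correspondence, here is a minimal Python sketch. It is a toy under stated assumptions, not Gödel's own construction: the seven-symbol alphabet, the positional (base-8) coding used in place of Gödel's prime-power coding, and the names `godel`, `ungodel`, `digit_count` and `sb` are all hypothetical, and every occurrence of `x` is treated as free because the toy alphabet has no quantifiers. The function `sb` works only on numbers, yet it mirrors the substitution of an arbitrary symbol string Y for the variable of X.

```python
# Toy alphabet (an assumption for illustration). Symbol codes are 1..7, used
# as base-8 digits, so no coded string ever contains the digit 0.
ALPHABET = "0S+=()x"
BASE = len(ALPHABET) + 1
VAR_CODE = ALPHABET.index("x") + 1  # digit that codes the variable x


def godel(s: str) -> int:
    """Code a symbol string as a number (positional coding, not prime powers)."""
    n = 0
    for ch in s:
        n = n * BASE + ALPHABET.index(ch) + 1
    return n


def ungodel(n: int) -> str:
    """Recover the symbol string coded by n; fail if n codes no string."""
    symbols = []
    while n > 0:
        n, digit = divmod(n, BASE)
        if digit == 0:
            raise ValueError("number does not code a symbol string")
        symbols.append(ALPHABET[digit - 1])
    return "".join(reversed(symbols))


def digit_count(n: int) -> int:
    """Number of base-8 digits of n (0 for n == 0, the empty string)."""
    count = 0
    while n > 0:
        n //= BASE
        count += 1
    return count


def sb(x: int, y: int) -> int:
    """Purely arithmetical substitution: replace each digit of x that codes
    the variable by the whole digit sequence of y."""
    result, scale = 0, 1
    shift = BASE ** digit_count(y)
    while x > 0:
        x, digit = divmod(x, BASE)
        if digit == VAR_CODE:
            result += y * scale   # splice in all digits of y
            scale *= shift
        else:
            result += digit * scale
            scale *= BASE
    return result


X = "x+0=x"
# Y is a numeral here, so the result is again a formula.
print(ungodel(sb(godel(X), godel("S0"))))   # S0+0=S0
# Y may be any symbol string; the result need not be a formula.
print(ungodel(sb(godel(X), godel(")="))))   # )=+0=)=
```

The second call shows that the arithmetical function is defined for the code of any symbol string Y, whether or not the resulting string is a formula.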

So, when the Stanford article states that "nothing prevents us from considering subst(⌈A(x)⌉, ⌈A(x)⌉)... ", that is correct, but it corresponds to substituting the symbol string that is the formula A(x) itself for the free variable x of the formula A(x). This is not a valid substitution within the formal system, and it results (in general) in a string of symbols that is not a valid formula of the formal system. This renders the remainder of the Stanford article, with its subsequent use of the subst relation and the S(x, x, y) relation, illogical and nonsensical.

Further down, the SEP article uses the terminology $S(\underline k, \underline k, y)$. Since S is supposedly a purely number-theoretic relation, its variables all have the same domain, so there is no logical reason for two of the variables to have the domain of numbers in some specific format while the other is not confined to such a format. In any case, the claim is that the formal system can express the S function. Now, while the domain of the numeral function is natural numbers, the range of the function is symbol strings of the formal system that represent numbers. But in the formal system, all natural numbers are in the same format, so that in the formal system the numeral function would simply be the identity function.

Later in the SEP article, it is stated that the formal system can prove the formula:

$$ ∀y[S(\underline k,\underline k, y) ↔ y = ⌈B(\underline k)⌉] $$

But the domain of the free variable of the Gödel numbering function is all symbol strings of the formal system, while this is not the case for any variable of the formal system. What we have here is an unproven assumption that the formal system can prove a formula that includes the Gödel numbering function. A valid mathematical proof does not rely on unproven assumptions, which means that the SEP article is not a valid mathematical proof.

With regard to your suggestion for a correction, you state (as many others do) that Z(n) = ⌈n⌉. But if you are attempting to prove a result dependent on this assertion, you need to prove Z(n) = ⌈n⌉ as a lemma, rather than simply asserting it. Not only that, you need to prove that the formal system itself can prove that lemma, independently of any interpretation of the formal system.

But, while the Z function is a purely number-theoretic function and therefore can be expressed in the formal system, the Gödel numbering function is a function of the meta-language and is not a number-theoretic function, and there is no formula of the formal system that can express it.

This presents an insurmountable difficulty which is commonly ignored. By side-stepping this problem, many authors are simply imposing an interpretation on strings of the formal system to the effect that they self-reference, but the self-reference is not inherent in the formal system itself.

#####################

I have added the following since there seems to be some confusion regarding the numeral function.

The fundamentals of numeral functions are as follows. In a formal system, numbers are represented in a clearly defined specific format, e.g., SSSS...0, whereas in conventional arithmetic the format is implicitly assumed: normally base ten with the symbols 0, 1, 2, 3, ... 9, along with other conventions such as scientific notation. One could, of course, use a base other than ten, but in that case one would have to state so explicitly.

A numeral function normally has the domain of numbers in conventional format, and a range of numbers in some specifically defined format. Obviously, since there can be many different formats, there can be many different numeral functions. But the numeral function we are concerned with here is that in the SEP article, where the numeral $\underline n$ is defined as the value that can be substituted for a variable of the formal system, i.e., the range of the variable is (as I stated) symbol strings of the formal system that represent numbers. So I do not follow the claim that I am incorrect in stating so. One could say that the range is natural numbers with the proviso that they must be in the format defined for the formal system, but it amounts to the same thing (though note that one could communicate the entire information that defines the formal system without ever mentioning the concept of number at all).
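To make the domain and range concrete, here is a minimal Python sketch of one numeral function, assuming the SSSS...0 format mentioned above. The names `numeral` and `value` are hypothetical, and Python's `int` stands in for a number in conventional format.

```python
def numeral(n: int) -> str:
    """Map a natural number to the symbol string that represents it."""
    if n < 0:
        raise ValueError("natural numbers only")
    return "S" * n + "0"


def value(s: str) -> int:
    """Inverse direction: read a numeral string back as a natural number."""
    if not s.endswith("0") or set(s[:-1]) - {"S"}:
        raise ValueError("not a numeral of the assumed format")
    return len(s) - 1


print(numeral(3))         # SSS0
print(value("SSSSS0"))    # 5
```

A different target format (binary strings, say) would give a different numeral function with the same domain.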

When, as in the SEP article, one is referring to a formula of the formal system, the numeral function is often used to indicate that the substitution of a variable of the formal system must be by that specific format, as in $A(\underline n)$. But as regards the subst function and the S relation, the point here is that these are arithmetical functions/relations where, in the meta-language, a correspondence has been established between them and operations/relationships of symbol strings of the formal system. That correspondence does not alter the fact that the subst function and the relation S are purely arithmetical, and can be treated as such independently of any mention of a formal system. Hence there is no reason to treat them differently than we would any other arithmetical relation. In conventional arithmetic, we don't specify the format of the numbers we use unless we are using a non-conventional format for every number, in which case we would explicitly define that format. But we would certainly not mix and match different formats as is done in the SEP article. Such mixing and matching is illogical and contrary to convention, and only serves to confuse the distinction between formulas of the formal system and conventional arithmetical functions/relations.


I am also having trouble determining exactly what was intended in the article. The source of my confusion is the mixture of brackets and underlining on some of the numbers (and then boldface, later on).

In the end, the function "subst" takes actual numbers as inputs. So I believe that the following revision of the SEP article works, provided we remember that all the inputs to subst are numbers. In my text below, $\underline n$ is the term of the form $1 + 1 + \cdots + 1$ that represents $n$, and $\lceil A(x)\rceil$ is the Gödel number of $A(x)$.

The proof of the Diagonalization Lemma centers on the operation of substitution (of a numeral for a variable in a formula): If a formula with one free variable, $A(x)$, and a number $n$ are given, the operation of constructing the formula where the numeral for that number has been substituted for the (free occurrences of the) variable $x$, that is, $A(\underline n)$, is purely mechanical. So is the analogous arithmetical operation which produces, given the Gödel number of a formula (with one free variable) $⌈A(x)⌉$ and a number $n$, the Gödel number of the formula in which the numeral for that number has been substituted for the variable in the original formula, that is, $⌈A(\underline n)⌉$. The latter operation can be expressed in the language of arithmetic. Note, in particular, that nothing prevents $n$ from being the Gödel number of $A(x)$ itself, that is, $⌈A(x)⌉$ (in the usual coding schemes, though, $n$ cannot be $⌈A(\underline n)⌉$). This operation of substitution is applied here again and again.

Let us refer to the arithmetized substitution function as subst($⌈A(x)⌉$, $n$) = $⌈A(\underline n)⌉$, and let $S(x, y, z)$ be a formula which strongly represents this operation, as a relation, in the language of our theory $F$. In other words, $S$ is true of $x$, $y$, and $z$, if and only if there is a formula $A(r)$ with one free variable such that: $$ x = ⌈A(r)⌉ \text{ and } z = ⌈A(\underline y)⌉. $$ Again, nothing prevents us from considering subst($⌈A(x)⌉$, $⌈A(x)⌉$), or, analogously, $S(x, x, y)$.
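A small Python sketch may help check that the inputs and outputs of subst are all numbers. It is only an illustration under assumptions: the five-symbol alphabet, the base-6 positional coding, and the helper names are hypothetical, numerals follow the $1 + 1 + \cdots + 1$ convention (with `0` for zero), and `subst` is computed by decoding and re-encoding rather than by pure arithmetic.

```python
# Toy alphabet (an assumption). Symbol codes are 1..5, used as base-6 digits.
ALPHABET = "01+=x"
BASE = len(ALPHABET) + 1


def godel(s: str) -> int:
    """Code a symbol string as a number (positional coding)."""
    n = 0
    for ch in s:
        n = n * BASE + ALPHABET.index(ch) + 1
    return n


def ungodel(n: int) -> str:
    """Recover the symbol string coded by n; fail if n codes no string."""
    symbols = []
    while n > 0:
        n, digit = divmod(n, BASE)
        if digit == 0:
            raise ValueError("number does not code a symbol string")
        symbols.append(ALPHABET[digit - 1])
    return "".join(reversed(symbols))


def numeral(n: int) -> str:
    """The term 1+1+...+1 (n ones) that represents n; '0' represents zero."""
    if n < 0:
        raise ValueError("natural numbers only")
    return "0" if n == 0 else "+".join("1" * n)


def subst(a: int, n: int) -> int:
    """Given a = Goedel number of A(x) and a number n, return the Goedel
    number of A with the numeral for n substituted for x."""
    return godel(ungodel(a).replace("x", numeral(n)))


a = godel("x=x")                 # Goedel number of the toy formula A(x)
print(a)                         # 209
print(ungodel(subst(a, 2)))      # 1+1=1+1

# Diagonal case: the second input is the Goedel number of A(x) itself.
d = subst(a, a)
print(ungodel(d) == numeral(a) + "=" + numeral(a))   # True
```

In the diagonal case the string that gets substituted is the numeral for the number 209, a term made of 209 ones, so the result is still the code of a formula.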

I have read through the lower part of the SEP article a couple times, and it seems as if this change to the beginning of the SEP article is sufficient for the lower part to go through. This does require some changes, e.g. $$ F ⊢ ∀y[S(\underline k,\underline k, y) ↔ y = ⌈B(\underline k)⌉] $$ should be $$ F ⊢ ∀y[S(\underline k,\underline k, y) ↔ y = \underline {⌈B(\underline k)⌉}] $$ with similar changes in the remainder of the argument.

Of course (as the question shows), it is easy to miss fine details.