How to translate mathematical intuition into a rigorous proof?

You have proved that these binary strings are symmetric so it suffices to prove the result for only the "half-strings." Imagine cutting your triangle down the middle. Let $x_n$ be the $n^{th}$ bit string, which is length $n+1$.

We have $x_0 = 1$, $x_1 = 11$, $x_2 = 101$, etc.

Now you can address the $j^{th}$ bit inside $x_n$ as $x_{n,j}$.

This addressing system will make it easier to talk more concretely about the action of the automaton.

You can pin down a definition of "locality of action" by saying that there is some function $f(a, b, c)$ such that $\forall n, j: x_{n,j} = f(x_{n-1, j-1}, x_{n-1,j}, x_{n-1,j+1})$ (with some additional nuisance conditions about the boundary). This means that the $j^{th}$ bit in the current string can be computed by neary-by bits in the previous string. Here $f$ represents the action of this rule 150 you mention.

Now use $f$ again to show $ x_{n,j} = f(f(x_{n-2, j-2}, x_{n-2,j-1}, x_{n-2, j}), f(x_{n-2, j-1}, x_{n-2,j}, x_{n-2, j+1}), f(x_{n-2, j}, x_{n-2,j+1}, x_{n-2, j+2}))$

Now you can see when we repeat this procedure $k$ times we will end up with a trinary abstract syntax tree depth $k$. To avoid having to talk about the exact structure of the expression at the $k^{th}$ level, we can reason about this substitution process purely formally as a process operating on the (countably infinite) set of symbols $\{ \underline{f}, \underline{(}, \underline{)} \} \cup \{ \underline{x_{i,j}} \}_{(i,j) \in \mathbb{N}^2}$ (here the commas are not meant to be part of the symbol set, they are just there to separate symbols from each other). You can prove by induction, that for any $k$, that $x_{n,j}$ is computed by an expression using the symbols $\{ \underline{f}, \underline{(}, \underline{)} \} \cup \{\underline{ x_{n-k, j-k}},\underline{ x_{n-k, j-k+1}}, …,\underline{ x_{n-k, j+k-1}},\underline{ x_{n-k, j+k} }\}$ (the $2k+1$ bits in the $(n-k)^{th}$ string centered on the $j^{th}$ bit) intermixed with function applications of $f$.

Fix $n$ and let $k_n$ be the distance to the previous power of $2$. (So if $n=13$, $k_n=5$ since the previous power of $2$ is $8$). And define $j_n \equiv (n-k_n)/2$. This is the index of the central bit of the $(n-k_n)^{th}$ bit string.

You have demonstrated that those bits at indices $\{ (n-k_n, m) \}_{1 \le m \lt n }$ are all zero since $n-k_n$ is a power of $2$. So any well-formed expression using only the symbols $\{ \underline {f}, \underline{(}, \underline{)}\} \cup \{ \underline{x_{n-k_n, j_n-k_n}}, \underline{x_{n-k_n, j_n-k_n+1}}, …,\underline{x_{n-k_n, j_n+k_n-1}}, \underline{x_{n-k_n, j_n+k_n}}\}$ will evaluate to zero by $f(0,0,0) = 0$, and in particular the expression representing $x_{n,j_n}$ (created from the $k_n$ iterations of the formal symbol substitution procedure) evaluates to $0$.

So as $n$ increments, $x_{n,j_n}$ traces out the path of the central bits you were referring to, all of which are zero.

General thoughts about formalization: There are certain technical devices which are work horses of the formalization process. Two of them I have used in this proof:

  • indexing time and space, with a numeric coordinate system
  • encoding a time evolving system as function application

A third device which I used is not as common, and comes from computer science and logic:

Something that I have been realizing is that these devices are not obvious (at least to me). Humanity was around for a long time before the Cartesian coordinate system was discovered and used to formalize geometrical intuition. It took even longer to develop the idea that almost every mathematical object could be encoded as an intricate tower of sets (ZFC).

I guess for me it's a matter of building up a library of these devices and binding them to the correct intuitions by repeated use.

In certain areas like computer programming, there are very few tools for converting intuition into proof. There were a couple of very good devices created like

  • loop invariants
  • Hoare logic

but multithreaded and heap-allocating programs are still an active area of research.


Here is one answer to the question. If $\, n\ge 2\, $ then the first $4$ entries in each row, namely, $\, 1 0 1 0, 1 1 0 1, 1 0 0 0, 1 1 1 0, \dots, \,$ repeat in a period of $4$. There is a $0$ in each of these first $4$ entries. This was not the proof you wanted because it is too simple.

Your proof is essentially the following. You noticed something about a $0$ in row $\, 2^n \,$ and that this $0$ continues up to the row $\, 2^{n+1}. \,$ We formalize that observation by working with polynomials over the field with two elements. Define the Laurent polynomials $\, y(n) := x^n + x^{-n} \,$ and since the characteristic of the field is $2$, $\, y(0) = 1 + 1 = 0. \,$ Using simple algebra of exponents we get that $\, y(n)y(m) = y(n+m) + y(n-m), \,$ and the special cases $\, y(n)^2 = y(2n) \,$ and $\, y(2^n) = y(1)^{2^n}. $

Define the row polynomials $\, r(m) := (1 + y(1))^m. \,$ The general row is $\, r(m) = r(2^n+k) \,$ where $\, 0\le k < 2^n. \,$ Notice that $\, r(2^n) = 1 + y(1)^{2^n} = 1 + y(2^n). \,$ Also $\, r(m) = 1 + \sum_{i=1}^m c_i y(i), \,$ where $\, c_i \,$ is the coefficient of $\, y(i). \,$ You noticed that $\, c_{2^{n-1}} = 0 \,$ but how to prove it? We calculate that $$\, r(m) = (1 \!+\! y(2^n)) r(k) = (1 \!+\! y(2^n))(1 \!+\! cy(2^{n-1}) \!+\! t) = (1 \!+\! y(2^n))(1 \!+\! t) \!+\! c y(3\,2^{n-1}) \,$$ where $\, t \,$ is all the terms with $\, y(i), \,$ and $\, i\ne 2^{n-1}, \,i < 2^n. \,$ But $\, y(2^n)y(i) = y(2^n-i) \!+\! y(2^n+i) \,$ and since $\, i \ne 2^{n-1} \,$ we get no $\, y(2^{n-1}) \,$ terms in the product $\, (1 \!+\! y(2^n))(1 \!+\! t). \,$ That proves it.

This answers your particular question, but what about intuition versus rigor? I think there are no good answer to that question. You have to always look for any patterns or regularities and simple or edge cases. You have to be lucky and it helps to have a lot of background experience to draw upon.