What does proving the Collatz Conjecture entail?

Adding to @Adam B.'s answer: examinig the conditions of possible cycles leads to the relations of powers of 3 and powers of 2, focusing on problems which are still not solved either.

One can look at it in terms of approximation : what is the smallest difference between perfect powers of 3 and perfect powers of 2, relative to the magnitude of one of them or of the rational approximation of log(3)/log(2) where we find an unsolved detail in the Waring-problem (see mathworld, "power fractional parts"). Some first steps to the proof of nonexistence of cycles (in the positive integers, in the negative integers we have at least 3 additional cycles) were provided by Ray Steiner 1996 and later by John Simons and Benne de Weger who proved the nonexistence of a certain type of cycles using that rational approximation approach.

Or one can look at the problem of cycles in terms of modular conditions, and arrive at other unsolved properties of the relation of powers of 3 to powers of 2. There is, for instance, the formulation in terms of "z-numbers" done by Kurt Mahler.

Unfortunately, even if that 3/2-problems were solved, that would not mean that the Collatz-problem was also solved and vice versa; for instance the solution of the Waring-problem-detail would only solve the "1-cycle" problem but not the general "m-cycle" problem with m bigger than roughly 70 (using the notation of Simons/De Weger): the mentioned conditions are not including each other. (The Steiner/Simons/De Weger articles are linked in the wikipedia, a more basic, amateurish article of mine adressing these aspects a bit less cryptic can be found here )


I think, for the question "which approach should be proposed to attack the problem", the best answer is "read Lagarias". Lagarias has a very good overview about many various attempts to the handling of the problem and gives also comments.


Just a simple one, but which can at least lead to the disproof of the so-called "1-cycle","2-cycle" ... up to "68-cycle" as named by John Simons/Benne de Weger, so this approach at least showed to be not completely useless:

Let's denote a step $a_{k+1}= (3a_k+1)/2 $ as "u" and one step $a_{k+1} = a_k/2$ as "d" for simpler symbolical notation. If at a certain first number $a_0$ such steps follow as "uududdud...dud" and if the resulting $a_n$ is then equal to $a_0$ then we have a cycle.

Let's now denote a sequence of arbitrarily many equal operations "uuuu...u" as "U" and accordingly "D" . So beginning at $a_0=1$ followed by a "ud" results in $a_2=1$ forming a cycle, and this is also "UD" with $\small \text{length}(U)=\text{length}(D) = 1$ . First Ray Steiner (see references at the wikipedia-entry) looked at the question whether cycles of the form "UD" exist besides that $1 \to 1$ cycle (which he calls the "trivial cycle" or "circuit") . He could prove that indeed there is no non-trivial cycle of this form - but although the attempt is simple, the tool to prove the nonexistence needed heavy machinery from transcendence number theory... John Simons and Benne de Weger could extend the method to disprove also the existence of "UD UD", "UD UD UD" cycles, and to introduce names they gave that general forms the names "1-cycle" for "UD", "2-cycle" for "UD UD" and "m-cycle" for the general case. Of course, any cycle of arbitrary length has the form "UD UD ... UD" , but the technical aspect behind that approach is powerful enough to disprove the existence of "m-cycles" for m up to 68 (and possibly now a bit more). Simons/deWeger concluded their 2006-work on this with the remark, that despite the remarkable success of this method it cannot be extended much and a complete new idea needs to be introduced to achieve any more progress with this ansatz.

Now despite that this ansatz did not finally lead to success I think it is an answer to your question "what is a good starting point" because first Steiner, later Simons/deWeger believed it were "a good starting point" and looked deeper at the mathematical properties, implications and consequences of when the analysis of the problem was attacked with the according formulae.


To study the Collatz-problem with this ansatz it seems to me, the reformulation of the transformation rule as $$ a_{k+1} = {3a_k +1\over 2^{A_k} } \qquad \qquad \small \text{on odd numbers $a_{k+1}$ and $a_k$ }$$ is much useful. For the introductory discussion I also like it more to replace the indexing of the argument $a_k$ by simply taking the consecutive letters a,b,c..., with z for the last one (if needed) and $N$ the number of transformations and $S$ the sum of exponents $A_k$, so for one transformation I write $ b= {3a+1 \over 2^A} $ or $ c = {3b+1 \over 2^B} $ and so on.

So let's look whether a cycle with one transformation exists. It needed the form $b = (3a+1)/2^A$ and $b=a$ so we could look for solutions $$ a= {3a + 1 \over 2^A} \\ 1= {3 + 1/a \over 2^A} \\ 2^A = 3 + 1/a $$ and we see immediately, that $|a|$ cannot be different from $1$ because otherwise the result were fractional, and also that $a=1$ gives a solution for $A=2$
(and of course $a=-1$, but originally L. Collatz considered only positive $a$ here) .

Now to extend this to the question, whether a cycle of two transformations can exist, we write first $$ b= {3a + 1 \over 2^A} \qquad \qquad a= {3b + 1 \over 2^B} $$ and assuming that a solution exists in $a$ and $b$ we could write their product $$ b \cdot a = {3a + 1 \over 2^A} \cdot {3b + 1 \over 2^B} \\ 2^A \cdot 2^B = \left(3+{ 1 \over a}\right) \cdot \left(3+{ 1 \over b}\right) $$ and we see,
* if a and b assume the smallest possible value 1 we get $$ 2^{A + B} \overset{?}{=} (3+1)(3+1) = 4^2 = 16 $$
which has of course the solution of the trivial cycle as before $A=B=2$.
* If now a and b are greater, let's insert the largest values or even let's look at the limit when they go to infinity we'll have $$ 2^{A + B} = 2^S \overset{?}{=} (3+0 )(3+0) = 3^2 =3^N = 9 $$ Of course, there is no natural A,B allowing this, so if there were any solution at all we needed a perfect power of 2 between $3^2=9$ and $4^2=16$ and this doesn't exist, and so a cycle of $N=2$ transformations does not exist.

We could now proceed and look for a cycle of $N=3$ transformations with different positive odd numbers a,b,c and would find the question, whether there exist a perfect power of 2 between $3^N=27$ and $4^N=64$. Here we find, that there is one, namely $2^5=32$ and so we have to do more.
The equation is simple enough to enumerate possible solutions and disprove it this way; but we still can argue with a certain generality:
- none of a,b,c can be divisible by 3 (because they all result of a transformation $(3x+1)/2^X$),
- they must be odd and
- none can equal 1 (otherwise we had the trivial cycle) , and
- they must all pairwise be different (because otherwise the cycle would be shorter).

Thus the minimal values were $a=5,b=7,c=11$. If we insert them into the product-formula we get $$ 2^5 \overset{?}{=} \left(3+{1 \over 5} \right) \cdot \left(3+{1 \over 7} \right) \cdot \left(3+{1 \over 11} \right) = {16 \cdot 22 \cdot 34 \over 5 \cdot 7 \cdot 11}=32 \cdot { 34 \over 35 } \lt 2^5 $$ and we find two things:

1) the given minimal values for a,b,c allow no solution

2) and because increasing the numbers a,b,c would decrease the rhs there cannot be any more solution, because already with the minimal values attempted, the rhs is smaller than the lhs.

While the type of transformation in the above examples are of the typee "uD", "uD uD" and "uD uD uD" where always one ascending step is followed by one or more descending steps, it shows more interesting structure if we look at the "UD" transformation (or "1-cycle"). One "UD" transformation can be characterized by the number N of "u" steps and the number S for the number of "d" steps plus N. Let for notational convenience write L for N-1 then we get then for $$ b = {3\left({ 3^L \over 2^L}a+{ 3^L - 2^L \over 2^L}\right)+1 \over 2^A } \\ b = {3\left({ 3^L }a+{ 3^L - 2^L }\right)+2^L \over 2^{L+A} } \\ b = {{ 3^N }a+{ 3^N - 3 \cdot 2^L }+2^L \over 2^{L+A} } \\ b = {{ 3^N }a+ ( 3^N - 2^N) \over 2^{N+A-1} } \\ $$ and for the question of cycles we can derive: $$ a \overset{?}{=} {{ 3^N }a+ ( 3^N - 2^N) \over 2^{N+A-1} } \\ \to 2^{N+A-1} =2^S \overset{?}{=} \left( 3^N + { 3^N - 2^N \over a} \right) $$ and for the 1-cycle with $\text{length }(U)=N$ and $\text{length }(D)+N = S$ we get the "critical formula" which must be solvable in natural numbers
$$ a = {3^N-2^N \over 2^S-3^N } \\ \qquad \qquad \text{with $S$ satisfying } 2^{S-1} \lt 3^N \lt 2^S \to S= \lceil N \log(3) / \log(2) \rceil $$

This gives of course the small-number-solution $\small N=1,a=1,S=2$ (the trivial cycle) but also solutions in the negative numbers: for $a$ we find $\small N=1,S=1 \to a=-1$ and $\small N=2,S=3 \to a=-5$ by $\small a={3^2-2^2 \over 2^3-3^2 } = {5 \over -1} =-5$ (Another cycle in the negative numbers is a "2-cycle", including $a=-13$ I think).

This formula is the problem of the 1-cycle which was solved by Ray Steiner and (extended to the m-cycles) by John Simons and Benne de Weger up to $m=68$ or $m=72$. The possibility of existence of solutions depends on the smallest distances of perfect powers $2^S-3^N$ which has not yet a fluent formula and which makes it a challenging attempt on its own.


The reason for why I show here this considerations is that it made me myself stepping into the deeper formalisms of number theory with that powers of 2 and 3 and their modular conditions, approximations etc which I found much interesting for the self study. The first three examples above are of the form "uD" "uDuD" and so on. Looking at "UD", "UDUD" in the fourth example seems even more interesting - and I think, that there is still something more interesting general numbertheoretic structure in that way of calculations - and that's why I propose such an attempt here in trying to answer your question.


Proving this conjecture indirectly would entail two things:

  1. Proving that there is no number n which increases indefinitely

  2. Proving there is no number n which loops indefinitely (besides the 4, 2, 1) loop

If one does these things then you have an answer to the collatz conjecture (and if you find a case of either of these things you have disproven the collatz conjecture obviously)

Of course this is just one approach that comes to mind, there are other possible methods which are beyond my own knowledge