Diagonalization via the Toda flow

$\def\Tr{\mathrm{Tr}}$This proof is short enough that I thought I'd just write it out. On a skim, this looks like the same proof that Christian Remling pointed you to, and which Deift-Li-Tomei say is the same as the proof of Moser.

Disclaimer: all signs in this argument have at best a 55% chance of being right.

First of all, let $C$ be any continuous function at all from $n \times n$ matrices to $n \times n$ matrices and define $Y(t)$ by the ODE $$\frac{dY}{dt} = C(Y) Y - Y C(Y).$$ Then $$\frac{d \Tr(Y^m)}{dt} = \Tr \left( \frac{dY}{dt} Y^{m-1} + Y \frac{dY}{dt} Y^{m-2} + \cdots + Y^{m-1} \frac{dY}{dt} \right)$$ $$=\Tr \left( C(Y) Y^m - Y C(Y) Y^{m-1} + Y C(Y) Y^{m-1} - Y^2 C(Y) Y^{m-2} + \cdots + Y^{m-1} C(Y) Y - Y^m C(Y) \right) = \Tr\left( C(Y) Y^m - Y^m C(Y) \right)=0.$$ So $\Tr(Y^m)$ is constant for every $m$. Since the power sums $\Tr(Y^m)$, $1 \leq m \leq n$, determine the characteristic polynomial (Newton's identities), all the $Y(t)$'s have the same spectrum.

Also, if $Y$ is symmetric and $C(Y)$ is skew-symmetric, then $C(Y) Y - Y C(Y)$ is symmetric, so symmetric matrices stay symmetric.
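Here is a rough numerical sanity check of the two facts above, in plain Python. It is only a sketch: the particular `C` (strict upper part minus strict lower part), the starting matrix `Y0`, and the RK4 step size are all my own choices for illustration, and the helper names are made up. It integrates the ODE for a while and measures how far $\Tr(Y^m)$ drifts for $m = 1, 2, 3$ and how far $Y$ is from symmetric.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def C(Y):
    # Assumed choice of C: +Y_ij above the diagonal, -Y_ij below, 0 on it.
    # This is skew-symmetric whenever Y is symmetric.
    n = len(Y)
    return [[Y[i][j] if i < j else -Y[i][j] if i > j else 0.0
             for j in range(n)] for i in range(n)]

def rhs(Y):
    # Right-hand side C(Y) Y - Y C(Y) of the ODE.
    n = len(Y)
    c = C(Y)
    cy, yc = matmul(c, Y), matmul(Y, c)
    return [[cy[i][j] - yc[i][j] for j in range(n)] for i in range(n)]

def shifted(Y, K, h):
    # Y + h * K, entry by entry.
    n = len(Y)
    return [[Y[i][j] + h * K[i][j] for j in range(n)] for i in range(n)]

def rk4_step(Y, h):
    # One classical fourth-order Runge-Kutta step.
    k1 = rhs(Y)
    k2 = rhs(shifted(Y, k1, h / 2))
    k3 = rhs(shifted(Y, k2, h / 2))
    k4 = rhs(shifted(Y, k3, h))
    n = len(Y)
    return [[Y[i][j] + h / 6 * (k1[i][j] + 2 * k2[i][j] + 2 * k3[i][j] + k4[i][j])
             for j in range(n)] for i in range(n)]

def trace_power(Y, m):
    # Tr(Y^m) for m >= 1.
    P = Y
    for _ in range(m - 1):
        P = matmul(P, Y)
    return sum(P[i][i] for i in range(len(Y)))

# Arbitrary symmetric starting matrix (illustration only).
Y0 = [[1.0, 0.5, 0.2],
      [0.5, 0.0, 0.3],
      [0.2, 0.3, -1.0]]

Y = Y0
for _ in range(1000):            # integrate to t = 1 with step 0.001
    Y = rk4_step(Y, 0.001)

trace_drift = max(abs(trace_power(Y, m) - trace_power(Y0, m)) for m in (1, 2, 3))
asymmetry = max(abs(Y[i][j] - Y[j][i]) for i in range(3) for j in range(3))
print("max drift in Tr(Y^m):", trace_drift)
print("max |Y_ij - Y_ji|:", asymmetry)
```

Both printed numbers should be tiny. RK4 does not conserve $\Tr(Y^2)$ and $\Tr(Y^3)$ exactly, so a small nonzero drift at the level of the integration error is expected.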

Now, we specialize to the case of Toda flow. Let $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$ be the spectrum of $X$. A quick computation shows that (possibly up to a positive constant factor, depending on how $B(X)$ is normalized) $$\frac{d X_{kk}}{dt} = - \sum_{i< k} X_{ik}^2 + \sum_{i>k} X_{ik}^2.$$ Summing over the first $k$ diagonal entries, the terms with both indices at most $k$ cancel in pairs, so $$\frac{d (X_{11}+X_{22} + \cdots + X_{kk})}{dt} = \sum_{i \leq k,\ j > k} X_{ij}^2.$$ So all the quantities $X_{11}+X_{22} + \cdots + X_{kk}$ are nondecreasing.
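The "quick computation" can be checked mechanically. The sketch below assumes the flow is $\frac{dX}{dt} = B(X) X - X B(X)$ with $B(X)$ equal to the strict upper part of $X$ minus the strict lower part; that normalization is my assumption, chosen to match the signs in the text. Under it the diagonal derivative comes out as exactly $2$ times the displayed expression, which does not affect the monotonicity argument. The test matrix is arbitrary.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def toda_B(X):
    # Assumed normalization: B_ij = X_ij for i < j, -X_ij for i > j, 0 on the diagonal.
    n = len(X)
    return [[X[i][j] if i < j else -X[i][j] if i > j else 0.0
             for j in range(n)] for i in range(n)]

def rhs(X):
    # dX/dt = B(X) X - X B(X).
    n = len(X)
    b = toda_B(X)
    bx, xb = matmul(b, X), matmul(X, b)
    return [[bx[i][j] - xb[i][j] for j in range(n)] for i in range(n)]

# Arbitrary symmetric 4 x 4 matrix (illustration only).
X = [[1.0, 0.5, -0.3, 0.2],
     [0.5, -2.0, 0.7, 0.1],
     [-0.3, 0.7, 0.4, -0.6],
     [0.2, 0.1, -0.6, 1.5]]
n = len(X)
dX = rhs(X)

# Compare dX_kk/dt with 2 * (-sum_{i<k} X_ik^2 + sum_{i>k} X_ik^2), 0-based indices.
diag_formula = [2 * (-sum(X[i][k] ** 2 for i in range(k))
                     + sum(X[i][k] ** 2 for i in range(k + 1, n)))
                for k in range(n)]
diag_error = max(abs(dX[k][k] - diag_formula[k]) for k in range(n))

# Compare d(X_11 + ... + X_kk)/dt with 2 * sum_{i <= k < j} X_ij^2.
partial_error = 0.0
for k in range(1, n):
    lhs = sum(dX[m][m] for m in range(k))
    block = 2 * sum(X[i][j] ** 2 for i in range(k) for j in range(k, n))
    partial_error = max(partial_error, abs(lhs - block))

print("diagonal formula error:", diag_error)
print("partial sum formula error:", partial_error)
```

Both errors should be at the level of floating-point rounding.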

Since $X$ is symmetric we have $X_{ii} \leq \lambda_1$ (an inequality of Schur), so $X_{11} + \cdots + X_{kk}$ is bounded above by $k \lambda_1$ and we conclude that $\lim_{t \to \infty} X_{11} + \cdots + X_{kk}$ exists for each $k$. Taking differences of consecutive partial sums, $\lim_{t \to \infty} X_{kk}$ exists; call it $\mu_k$.

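Also, we see that $\lim_{t \to \infty} \sum_{i \leq k,\ j > k} X_{ij}^2 =0$. (This sum is the derivative of a bounded monotone function, so its integral over $[0, \infty)$ is finite. It is also uniformly continuous in $t$: the entries of $X$ stay bounded because $\Tr(X^2) = \sum_{i,j} X_{ij}^2$ is conserved, and hence $dX/dt$ is bounded too. A nonnegative, uniformly continuous function with finite integral tends to $0$.) We thus deduce that $\lim_{t \to \infty} X_{ij} =0$ for each $i \neq j$. So $\lim_{t \to \infty} X$ is a diagonal matrix, with diagonal entries $\mu_i$, and the same spectrum as $X$. So the $\mu$'s are a permutation of the $\lambda$'s.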

Finally, we want to know in what order the $\lambda$'s occur. We can't answer this in general: all the diagonal matrices are fixed points of the flow. However, I claim that $\mu_1 \geq \mu_2 \geq \cdots \geq \mu_n$ is the only stable fixed point. Proof: If $\mu_i < \mu_{i+1}$, then a tiny perturbation in direction $e_{i,i+1} + e_{i+1, i}$ is magnified, where $e_{i,j}$ is the matrix whose unique nonzero entry is a $1$ in position $(i,j)$. So almost all matrices flow to $\mu_1 \geq \mu_2 \geq \cdots \geq \mu_n$.
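To see the convergence and the ordering in action, here is a small experiment in plain Python. Everything specific in it is an assumption of mine for illustration: the normalization $\frac{dX}{dt} = B(X) X - X B(X)$ with $B(X)$ the strict upper part minus the strict lower part, the RK4 integrator, the step size, and the starting matrix. The starting matrix is tridiagonal with diagonal $(1, 2, 3)$, deliberately in the "wrong" order; its characteristic polynomial is $(\lambda - 2)(\lambda^2 - 4 \lambda + 1)$, so its eigenvalues are $2 + \sqrt{3}$, $2$, $2 - \sqrt{3}$.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def toda_B(X):
    # Assumed normalization: B_ij = X_ij for i < j, -X_ij for i > j, 0 on the diagonal.
    n = len(X)
    return [[X[i][j] if i < j else -X[i][j] if i > j else 0.0
             for j in range(n)] for i in range(n)]

def rhs(X):
    # dX/dt = B(X) X - X B(X).
    n = len(X)
    b = toda_B(X)
    bx, xb = matmul(b, X), matmul(X, b)
    return [[bx[i][j] - xb[i][j] for j in range(n)] for i in range(n)]

def shifted(X, K, h):
    n = len(X)
    return [[X[i][j] + h * K[i][j] for j in range(n)] for i in range(n)]

def rk4_step(X, h):
    # One classical fourth-order Runge-Kutta step.
    k1 = rhs(X)
    k2 = rhs(shifted(X, k1, h / 2))
    k3 = rhs(shifted(X, k2, h / 2))
    k4 = rhs(shifted(X, k3, h))
    n = len(X)
    return [[X[i][j] + h / 6 * (k1[i][j] + 2 * k2[i][j] + 2 * k3[i][j] + k4[i][j])
             for j in range(n)] for i in range(n)]

# Starting matrix with eigenvalues 2 + sqrt(3), 2, 2 - sqrt(3).
X = [[1.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 3.0]]

h, steps = 0.005, 4000           # integrate to t = 20
for _ in range(steps):
    X = rk4_step(X, h)

diag = [X[i][i] for i in range(3)]
offdiag = max(abs(X[i][j]) for i in range(3) for j in range(3) if i != j)
expected = [2 + math.sqrt(3), 2.0, 2 - math.sqrt(3)]

print("diagonal at t = 20:", diag)
print("eigenvalues, sorted:", expected)
print("largest off-diagonal entry:", offdiag)
```

The diagonal should end up close to the eigenvalues in decreasing order, with the off-diagonal entries negligible, even though the initial diagonal was increasing.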


On Toda flow and Morse flow

The exact same proof works if $$B(X)_{ij} = c_{ij} X_{ij}$$ for any skew-symmetric matrix $c$ with positive entries above the diagonal. In another answer, I work out that the Morse flow for the function $\psi(X) = \sum a_i X_{ii}$ is given by this equation with $c_{ij} = a_i - a_j$. (The metric on the set of matrices with fixed spectrum is induced by the $SO(n)$ action, and the inner product on $\mathfrak{so}(n)$ is the standard one.) So Toda flow would be Morse flow if we could arrange that $a_i -a_j = 1$ for all $i<j$. This is possible for tridiagonal matrices, where only $j = i+1$ matters (a very cool lemma is that Toda flow preserves the property of having $X_{ij} = 0$ for $|i-j|>k$), but not in general. Still, I can imagine a fake history where Toda flow was discovered by writing down Morse flow for $\psi$ and then noticing that it still worked for any $c_{ij}$.
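As a small consistency check on the Morse flow picture, the sketch below takes $c_{ij} = a_i - a_j$ and verifies at one arbitrary symmetric matrix that $\frac{d \psi}{dt} = \sum_{i,j} c_{ij}^2 X_{ij}^2 \geq 0$ along $\frac{dX}{dt} = B(X) X - X B(X)$, so $\psi$ can only go up, as one would expect for a gradient-type flow. The weights `a`, the test matrix, and the sign convention of the flow are my own assumptions for illustration.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical weights with a_1 > a_2 > ... > a_n, so c_ij = a_i - a_j > 0 for i < j.
a = [3.0, 1.0, 0.0, -2.0]

# Arbitrary symmetric 4 x 4 matrix (illustration only).
X = [[1.0, 0.5, -0.3, 0.2],
     [0.5, -2.0, 0.7, 0.1],
     [-0.3, 0.7, 0.4, -0.6],
     [0.2, 0.1, -0.6, 1.5]]
n = len(X)

# B(X)_ij = c_ij X_ij with c_ij = a_i - a_j; c is skew-symmetric.
c = [[a[i] - a[j] for j in range(n)] for i in range(n)]
B = [[c[i][j] * X[i][j] for j in range(n)] for i in range(n)]

# dX/dt = B X - X B (assumed sign convention).
BX, XB = matmul(B, X), matmul(X, B)
dX = [[BX[i][j] - XB[i][j] for j in range(n)] for i in range(n)]

# psi(X) = sum_k a_k X_kk, so d(psi)/dt = sum_k a_k dX_kk/dt.
dpsi = sum(a[k] * dX[k][k] for k in range(n))

# Squared Frobenius norm of B, i.e. sum of c_ij^2 X_ij^2 over all i, j.
norm_sq = sum(B[i][j] ** 2 for i in range(n) for j in range(n))

print("d(psi)/dt:", dpsi)
print("sum of c_ij^2 X_ij^2:", norm_sq)
```

The two printed numbers should agree up to rounding. The reason is that $d\psi/dt = 2 \sum_{k,i} a_k (a_k - a_i) X_{ik}^2$, and symmetrizing in $k$ and $i$ turns this into $\sum_{k,i} (a_k - a_i)^2 X_{ik}^2$.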


A rather readable reference for this is Deift, Li, Tomei, Toda flows with infinitely many variables, JFA 64 (1985), 358-402 (who attribute the result to Moser).