Frobenius normal form of a doubly stochastic matrix

The matrix $P^\top AP$ is doubly stochastic. Suppose that $A_{11}$ is $m_{1} \times m_{1}$. Since the normal form is block upper triangular, the blocks below $A_{11}$ vanish, so the sum of the entries of each column of $A_{11}$ is $1$, and the sum of the entries in each row of $A_{11}$ is at most $1$ (using non-negativity of all entries of $P^\top AP$). Summing over the columns, the sum of all entries of $A_{11}$ is $m_{1}$. Hence, the sum of the entries of each row of $A_{11}$ is in fact $1$ (if any of the row sums were strictly less than $1$, then the sum of the row sums, which equals the sum of the entries of $A_{11}$, would be strictly less than $m_1$, a contradiction), i.e., $A_{11}$ is doubly stochastic. Since each of the first $m_1$ rows of $P^\top AP$ also sums to $1$ and all entries are non-negative, $A_{12},A_{13},\ldots,A_{1k}$ are all zero. Likewise, $A_{22}$ must be doubly stochastic and $A_{23}, \ldots , A_{2k}$ must all be zero, and so on.
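
The counting step above can be written as a single chain. The symbol $r_s$ for the $s$-th row sum of $A_{11}$ is notation introduced here only for illustration:

$$\sum_{s=1}^{m_1} r_s \;=\; \sum_{s=1}^{m_1}\sum_{t=1}^{m_1} (A_{11})_{st} \;=\; \sum_{t=1}^{m_1} \underbrace{\sum_{s=1}^{m_1} (A_{11})_{st}}_{=\,1} \;=\; m_1, \qquad 0 \le r_s \le 1 \ \text{ for all } s.$$

A sum of $m_1$ numbers, each at most $1$, can only equal $m_1$ if every one of them equals $1$, so $r_s = 1$ for all $s$.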


Here is a proof of the fact claimed by Perfect and Mirsky.

Proof. Let $A$ be doubly stochastic and let $B = P^TAP$ be a Frobenius normal form of $A$, which is given as in the question. Then $B$ is doubly stochastic, too. Fix $j \in \{2, \dots, k\}$. We show that all entries of $B$ that are located above the block $A_{jj}$ are $0$.

Let $i, i+1,\dots,i+m-1$ denote the $m$ consecutive indices in $\{1,\dots,n\}$ at which the block $A_{jj}$ is located in $B$.

We need the following four objects:

  • $I = \{x \in \mathbb{C}^n: \, x_\ell = 0 \text{ for } \ell \ge i \}$.

  • $J = \{x \in \mathbb{C}^n: \, x_\ell = 0 \text{ for } \ell < i \}$.

  • $e_I = (1,\dots,1,0,\dots,0)$, where the ones are located exactly at the positions $1,\dots,i-1$ (i.e. $e_I \in I$ and $I$ is the so-called ideal generated by $e_I$ in the lattice sense).

  • $e_J = (0,\dots,0,1,\dots,1)$, where the ones are located exactly at the positions $i,\dots,n$ (i.e. $e_J \in J$ and $J$ is the ideal generated by $e_J$).

Now we make a couple of observations:

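(a) $I$ is invariant under $B$, as follows from the block triangular form of $B$. This implies that $Be_I \in I$. On the other hand, writing $e = (1,\dots,1)$ for the all-ones vector and reading inequalities componentwise, we have $Be_I \le Be = e$, so we conclude that actually $Be_I \le e_I$.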

(b) Now we observe that $e^T Be_I = e^T e_I$, since the columns of $B$ sum to $1$, i.e. $e^T B = e^T$. Together with $Be_I \le e_I$ from (a), this implies that no component of $Be_I$ can be strictly smaller than the corresponding component of $e_I$; so we actually have $Be_I = e_I$.

(c) Since $Be = e$ and $e_J = e - e_I$, the equality $Be_I = e_I$ readily implies that we also have $Be_J = e_J$. Consequently, $B$ also leaves $J$ invariant. This in turn implies that all entries of $B$ above the block $A_{jj}$ are $0$.
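
For readers who like to see the observations (a) to (c) on a concrete matrix, here is a minimal sketch in plain Python with exact fractions. The example matrices `B` and `C`, the block boundary `i = 3`, and the helper names `matvec`, `is_doubly_stochastic` and `upper_block_mass` are all assumptions made up for this illustration; they do not come from the question or from Perfect and Mirsky.

```python
from fractions import Fraction


def matvec(M, x):
    """Return the matrix-vector product M x as a list."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def is_doubly_stochastic(M):
    """Check non-negativity and that every row and column sums to 1."""
    n = len(M)
    nonneg = all(entry >= 0 for row in M for entry in row)
    rows_ok = all(sum(row) == 1 for row in M)
    cols_ok = all(sum(M[r][c] for r in range(n)) == 1 for c in range(n))
    return nonneg and rows_ok and cols_ok


def upper_block_mass(M, i):
    """Sum of the entries in rows 1..i-1 and columns i..n (1-based i)."""
    n = len(M)
    return sum(M[r][c] for r in range(i - 1) for c in range(i - 1, n))


half, third = Fraction(1, 2), Fraction(1, 3)

# Illustrative doubly stochastic matrix in block triangular form,
# with diagonal blocks of sizes 2 and 3 (so the second block starts at i = 3).
B = [
    [half, half, 0, 0, 0],
    [half, half, 0, 0, 0],
    [0, 0, third, third, third],
    [0, 0, third, third, third],
    [0, 0, third, third, third],
]
i = 3
n = len(B)

# e_I has ones at positions 1..i-1, e_J has ones at positions i..n.
e_I = [1 if pos < i else 0 for pos in range(1, n + 1)]
e_J = [1 if pos >= i else 0 for pos in range(1, n + 1)]

fixes_e_I = matvec(B, e_I) == e_I   # observation (b)
fixes_e_J = matvec(B, e_J) == e_J   # observation (c)
mass_above = upper_block_mass(B, i)  # entries above the second block

# Contrast: an upper triangular matrix that is only row stochastic.
# Its entry above the second diagonal block is nonzero, and it fails
# the column-sum condition, so the result does not apply to it.
C = [
    [half, half],
    [0, 1],
]

print(is_doubly_stochastic(B), fixes_e_I, fixes_e_J, mass_above)
print(is_doubly_stochastic(C), upper_block_mass(C, 2))
```

The matrix `C` is there to show which hypothesis does the work: row sums alone do not force the upper block to vanish, whereas the column-sum condition $e^T B = e^T$ used in (b) does.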

Remark. I think what makes the above proof look a bit counterintuitive is the handling of all the indices. The essence of the proof (and, one might argue, also of the result quoted from Perfect and Mirsky) is that if, under the given assumptions, $A$ leaves an ideal $I$ invariant, then it also leaves the orthogonal ideal of $I$ invariant (where "orthogonal" is meant in the lattice sense). By the way, this abstract version of the result generalizes very nicely to the infinite-dimensional setting.