How does one obtain the Jordan normal form of a matrix $A$ by studying $XI-A$?

The method used in the lecture notes is based $\def\KX{K[X]}$on the structure theorem for finitely generated $\KX$-modules (more generally, for finitely generated modules over a PID) rather than on purely linear algebraic considerations. What the given computation shows is that the $\KX$-module defined by having $X$ act on $K^4$ by the matrix$~A$ is isomorphic to $\KX/\langle1\rangle\oplus\KX/\langle1\rangle\oplus\KX/\langle X-1\rangle\oplus\KX/\langle(X-2)(X-1)^2\rangle$ (the first two summands are trivial and can be omitted). The final factor can be further decomposed as $\KX/\langle(X-2)(X-1)^2\rangle\cong\KX/\langle X-2\rangle\oplus\KX/\langle(X-1)^2\rangle$ by a theorem for which I am still wondering whether it has a name in English (in French it is the lemme des noyaux, literally the "kernel lemma"; it is related to the Chinese remainder theorem, but not identical to it, since it is about modules rather than rings).
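As a minimal numerical sketch of this splitting (my own illustration, not from the notes): on the cyclic module $\KX/\langle(X-2)(X-1)^2\rangle$, $X$ acts as the companion matrix $C$ of $p(X)=(X-2)(X-1)^2=X^3-4X^2+5X-2$. The Bézout relation $(X-1)^2 - X(X-2) = 1$ shows the two factors are coprime, and it makes $P=(C-I)^2$ the projection onto $\ker(C-2I)$ along $\ker((C-I)^2)$, which is exactly the asserted direct-sum decomposition.

```python
# Check the kernel-lemma splitting on the companion matrix of
# p(X) = (X-2)(X-1)^2 = X^3 - 4X^2 + 5X - 2.

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scal(c, a):
    return [[c * x for x in row] for row in a]

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
Z3 = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

# Companion matrix of X^3 - 4X^2 + 5X - 2, acting on the basis 1, X, X^2.
C = [[0, 0, 2], [1, 0, -5], [0, 1, 4]]

C2 = mat_mul(C, C)
C3 = mat_mul(C2, C)
# p(C) = C^3 - 4C^2 + 5C - 2I = 0: the polynomial p annihilates the module.
pC = mat_sub(mat_add(C3, scal(5, C)), mat_add(scal(4, C2), scal(2, I3)))

# P = (C-I)^2 is idempotent, vanishes on ker((C-I)^2) and is the identity
# on ker(C-2I), so K^3 = ker(C-2I) (+) ker((C-I)^2).
M = mat_sub(C, I3)
P = mat_mul(M, M)
```

Here `trace(P) = 1` confirms that the projection image, i.e. $\ker(C-2I)$, is one-dimensional.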

So our $\KX$-module is isomorphic to $\KX/\langle X-1\rangle\oplus\KX/\langle(X-1)^2\rangle\oplus\KX/\langle X-2\rangle$, where I have moved to the front the two factors that contain eigenvectors for the eigenvalue$~1$. The outer summands have dimension$~1$, and $X$ acts on them as the scalar $1$ respectively $2$; on the middle factor, of dimension$~2$, the action of $X$ has eigenvalue$~1$ but is not diagonalisable: it is a Jordan block of size$~2$. So this is how one gets two Jordan blocks of sizes $1,2$ for $\lambda=1$ and a single Jordan block of size$~1$ for $\lambda=2$.
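One can see the Jordan block concretely (a sketch of my own): in the quotient $\KX/\langle(X-1)^2\rangle$, take the basis $(1,\,X-1)$; since $(X-1)^2=0$ there, multiplying $a + b(X-1)$ by $X = 1+(X-1)$ gives $a + (a+b)(X-1)$.

```python
# Multiplication by X on K[X]/<(X-1)^2>, elements stored as pairs (a, b)
# meaning a + b*(X-1); the relation (X-1)^2 = 0 truncates the product.

def mult_by_X(v):
    a, b = v
    return (a, a + b)

# Columns of the matrix of X in the basis (1, X-1):
col1 = mult_by_X((1, 0))   # image of 1      -> (1, 1)
col2 = mult_by_X((0, 1))   # image of X-1    -> (0, 1)
matrix_of_X = [[col1[0], col2[0]],
               [col1[1], col2[1]]]
# matrix_of_X == [[1, 0], [1, 1]]: a size-2 Jordan block for eigenvalue 1
# (written with the off-diagonal 1 below the diagonal).
```

Note that $(0,1)$, i.e. the class of $X-1$, is fixed: it is the unique eigenvector (up to scale), which is why the block is not diagonalisable.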

I'll add a word about the initial computation leading to the diagonal form. The algorithm applied is that of computing the Smith normal form of the matrix $XI-A$ over $\KX$ (which is a PID), using row and column operations with scalars in $\KX$ to arrive at a diagonal form in which each successive diagonal entry divides the next. I am not sure whether your lecture notes explain why one should start with $XI-A$, and why the result describes a cyclic decomposition of the $\KX$-module defined by$~A$, so here it is. Each intermediate matrix describes a presentation of a $\KX$-module as the quotient of the free module $\KX^n$, where $n$ is the number of rows, by the submodule $N$ generated by the columns of the matrix. During the algorithm, column operations correspond to choosing different generators of $N$, while row operations correspond to choosing a different basis of $\KX^n$ than the standard one and expressing the coordinates (of the generators of$~N$) in the new basis. At the end of the transformation one arrives at a situation in which every generator of$~N$ is a (polynomial) multiple of a basis element of $\KX^n$, whence the diagonal form, and this describes the module as a direct sum of cyclic submodules.
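The row/column operations are the same over any Euclidean domain, so as a minimal sketch of the algorithm (my own illustration, not from the notes) here is a version over the PID $\mathbb Z$, where the arithmetic is less verbose; over $\KX$ one replaces integer division by polynomial division and absolute value by degree.

```python
# Smith normal form over Z via elementary row/column operations.
import math

def snf_diagonal(m):
    """Return the diagonal of the Smith normal form of an integer matrix."""
    a = [row[:] for row in m]
    rows, cols = len(a), len(a[0])
    for t in range(min(rows, cols)):
        # Choose a nonzero entry of smallest absolute value as the pivot.
        piv = None
        for i in range(t, rows):
            for j in range(t, cols):
                if a[i][j] and (piv is None
                                or abs(a[i][j]) < abs(a[piv[0]][piv[1]])):
                    piv = (i, j)
        if piv is None:
            break
        a[t], a[piv[0]] = a[piv[0]], a[t]
        for row in a:
            row[t], row[piv[1]] = row[piv[1]], row[t]
        if a[t][t] < 0:
            a[t] = [-x for x in a[t]]
        # Clear row t and column t; a nonzero remainder becomes a strictly
        # smaller pivot, so this loop terminates.
        while True:
            changed = False
            for i in range(t + 1, rows):
                if a[i][t]:
                    q = a[i][t] // a[t][t]
                    a[i] = [x - q * y for x, y in zip(a[i], a[t])]
                    if a[i][t]:
                        a[t], a[i] = a[i], a[t]
                        changed = True
            for j in range(t + 1, cols):
                if a[t][j]:
                    q = a[t][j] // a[t][t]
                    for k in range(rows):
                        a[k][j] -= q * a[k][t]
                    if a[t][j]:
                        for k in range(rows):
                            a[k][t], a[k][j] = a[k][j], a[k][t]
                        changed = True
            if not changed:
                break
    d = [abs(a[i][i]) for i in range(min(rows, cols)) if a[i][i]]
    # Enforce the divisibility chain: diag(a, b) ~ diag(gcd(a,b), lcm(a,b)).
    for i in range(len(d)):
        for j in range(i + 1, len(d)):
            g = math.gcd(d[i], d[j])
            d[i], d[j] = g, d[i] * d[j] // g
    return d
```

For example, `snf_diagonal([[2, 0], [0, 3]])` returns `[1, 6]`: the final gcd/lcm pass is what produces the divisibility chain $d_1\mid d_2\mid\cdots$.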

So why does $XI-A$ describe the $\KX$-module$~M$ defined on $K^n$ by the action of$~A$? Because the standard basis $(e_1,\ldots,e_n)$ of $K^n$ is certainly a generating set for$~M$ (probably quite a redundant one), so we have a surjective morphism $f:\KX^n\to M$. Column$~j$ of $XI-A$ describes the element $Xe_j-\sum_i A_{i,j}e_i\in\KX^n$, which, given that $X$ acts on $M$ as $A$, lies in the kernel $N$ of $f$ by definition; it is not hard to see that these elements generate all of$~N$, so that they give a complete presentation of$~M$.
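One consistency check is easy to automate (my own sketch, using the Jordan matrix $J$ from the second answer below as a stand-in for $A$, since similar matrices have the same invariant factors): the product of the Smith normal form diagonal, $1\cdot1\cdot(X-1)\cdot(X-2)(X-1)^2=(X-1)^3(X-2)$, must equal the characteristic polynomial $\det(XI-A)$.

```python
# Compute det(XI - J) with exact polynomial arithmetic; polynomials are
# lists of coefficients in ascending order of degree.

def padd(p, q):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [x + y for x, y in zip(p, q)]

def psub(p, q):
    return padd(p, [-x for x in q])

def pmul(p, q):
    res = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            res[i + j] += x * y
    return res

def pdet(m):
    """Determinant of a matrix of polynomials, by cofactor expansion."""
    if len(m) == 1:
        return m[0][0]
    total = [0]
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        term = pmul(entry, pdet(minor))
        total = padd(total, term) if j % 2 == 0 else psub(total, term)
    return total

J = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 1, 0], [0, 0, 0, 2]]
# Entry (i, j) of XI - J is the polynomial X*delta_ij - J[i][j].
XI_minus_J = [[[-J[i][j], 1] if i == j else [-J[i][j]] for j in range(4)]
              for i in range(4)]
char_poly = pdet(XI_minus_J)
# char_poly == [2, -7, 9, -5, 1], i.e. X^4 - 5X^3 + 9X^2 - 7X + 2,
# which is (X-1)^3 (X-2), the product of the invariant factors.
```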

Whether this method is a practical way to find a Jordan normal form is questionable; in my experience, performing the Smith normal form algorithm over $\KX$ by hand is extremely tedious and error prone. However, apart from its theoretical importance, it is certainly an algorithm, whereas I think methods of finding the Jordan normal form by linear algebra alone tend to be hard to describe as a complete algorithm (in practice many shortcuts are possible, but a complete method covering all possibilities is long to describe). I've tried to describe a linear algebraic algorithm (as well as the one above) for the somewhat coarser Rational Canonical (or Frobenius Normal) Form in this answer.


$A$ has the eigenvalue $\lambda_1=2$ of multiplicity $1$ and $\lambda_2=1$ of multiplicity $3$. Clearly, the Jordan block corresponding to $2$ is a $1\times 1$ matrix and thus uninteresting for us.

Next, we study $\text{rank}(A-I)$. It is easy to check that the rank equals $2$, so we conclude that $A$ has $4-2=2$ linearly independent eigenvectors corresponding to the eigenvalue $1$. This determines the structure of the Jordan blocks: one is a $1\times 1$ matrix and the other is a $2\times 2$ one (the other possibilities are ruled out: three $1\times 1$ blocks would require $3$ eigenvectors, and one $3\times 3$ block would admit only one eigenvector). Thus we conclude that the Jordan form is indeed $$J=\begin{bmatrix}1 & 0 & 0 & 0\\0 & 1 & 0 & 0\\0 & 1 & 1 & 0\\0 & 0 & 0 & 2\end{bmatrix}.$$
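The rank argument extends to a complete recipe: the number of Jordan blocks of size $\ge k$ for the eigenvalue $1$ equals $\text{rank}((A-I)^{k-1})-\text{rank}((A-I)^{k})$. A minimal sketch checking this (my own illustration; since $J$ is similar to $A$ the ranks agree, so we may compute with $J$ itself):

```python
# Read off the Jordan block sizes for eigenvalue 1 from ranks of powers.
from fractions import Fraction

def rank(m):
    """Rank via Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in m]
    rows, cols = len(a), len(a[0])
    r = 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if a[i][c] != 0), None)
        if piv is None:
            continue
        a[r], a[piv] = a[piv], a[r]
        for i in range(r + 1, rows):
            f = a[i][c] / a[r][c]
            a[i] = [x - f * y for x, y in zip(a[i], a[r])]
        r += 1
    return r

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

J = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 1, 0], [0, 0, 0, 2]]
N = [[J[i][j] - (1 if i == j else 0) for j in range(4)] for i in range(4)]
N2 = mat_mul(N, N)
N3 = mat_mul(N2, N)
# Number of blocks of size >= 1, >= 2, >= 3 for the eigenvalue 1:
blocks = [4 - rank(N), rank(N) - rank(N2), rank(N2) - rank(N3)]
# blocks == [2, 1, 0]: two blocks in total, one of size >= 2, none of
# size >= 3, i.e. sizes 2 and 1, as found above.
```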

In the general case the algorithm is a little more involved: you study the eigenvalues, then the eigenvectors, and then the generalized eigenvectors.
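The last step can be sketched as follows (my own illustration on $J$): for the $2\times 2$ block one takes an eigenvector $u\in\ker(J-I)$ lying in the image of $J-I$, then solves $(J-I)v=u$ for a generalized eigenvector $v$; the chain $(v,u)$ spans the $2\times 2$ block.

```python
# Verify a generalized eigenvector chain for the eigenvalue 1 of J.

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

J = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 1, 0], [0, 0, 0, 2]]
N = [[J[i][j] - (1 if i == j else 0) for j in range(4)] for i in range(4)]

u = [0, 0, 1, 0]   # eigenvector: N u = 0, and u = N e_2 is in the image
v = [0, 1, 0, 0]   # generalized eigenvector: N v = u
```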