Cayley-Hamilton revisited

Yes, this follows from known facts on matrix polynomials. There is a full characterization of spectral divisors of matrix polynomials in Gohberg, Lancaster, Rodman, Matrix Polynomials. They treat monic polynomials (i.e., $A_k=I$), but this is not a restriction: unless $g(\lambda)\equiv 0$, one can enforce it by a Möbius transform.
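In case it is useful, here is one way to carry out that reduction (my sketch, not the book's construction): assuming $g\not\equiv 0$, pick a scalar $\sigma$ with $g(\sigma)=\det\left(\sum_{i=0}^k A_i\sigma^i\right)\neq 0$ (possible over an infinite field). Substituting $\lambda=\sigma+1/\mu$ and clearing denominators gives
$$\mu^k\sum_{i=0}^k A_i\left(\sigma+\tfrac1\mu\right)^i=\sum_{i=0}^k A_i\,\mu^{k-i}(\sigma\mu+1)^i,$$
a matrix polynomial of degree $k$ in $\mu$ whose leading coefficient is the invertible matrix $\sum_{i=0}^k A_i\sigma^i$; multiplying on the left by its inverse yields a monic polynomial, and divisibility statements can be transported back through the substitution.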

They introduce so-called standard pairs of a matrix polynomial $P(\lambda)$, which are in some sense a generalization of the companion matrix. Then they prove (Thm 3.12) that if a polynomial $Q(\lambda)$ is a right divisor of $P(\lambda)$, then the standard pair of $Q$ is a restriction of that of $P$; translated to companion matrices, this means that the companion matrix of $Q$ can be obtained as the restriction of that of $P$ to an invariant subspace.
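For concreteness (with the usual conventions): in the monic case $A_k=I$, the companion matrix of $P(\lambda)=I\lambda^k+A_{k-1}\lambda^{k-1}+\dots+A_0$ is the $kn\times kn$ block matrix
$$C_P=\begin{pmatrix}0 & I & & \\ & 0 & \ddots & \\ & & \ddots & I\\ -A_0 & -A_1 & \cdots & -A_{k-1}\end{pmatrix},$$
and "restriction" means restricting $C_P$ to a $C_P$-invariant subspace and writing the restricted map in a suitable basis of that subspace.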

We apply that to $\lambda I - B$ (for which the "companion matrix" is $B$ itself) and $P(\lambda)=\sum_{i=0}^k A_i \lambda^i$. In particular, this means that the Jordan structure of $B$ is a substructure of that of the companion matrix of $P(\lambda)$; hence the algebraic multiplicities of the eigenvalues of $B$ are less than or equal to those of $P(\lambda)$, that is, the characteristic polynomial of $B$ is a divisor of your $g(\lambda)$.
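(For the last step, recall that for a monic $P$ the companion matrix $C_P$ satisfies $\det(\lambda I_{kn}-C_P)=\det P(\lambda)$; so, after the monic normalization, the characteristic polynomial of the companion matrix is exactly $g(\lambda)$, and containment of Jordan structures translates into divisibility of characteristic polynomials.)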

Not sure how much of this is understandable -- I must admit the book does not have a reputation in the community for being easy to read. A more self-contained approach to these topics is in: I. Gohberg, M.A. Kaashoek and P. Lancaster, General theory of regular matrix polynomials and band Toeplitz operators.


Am I missing something, or is Ilya Bogdanov's elimination-of-$A_0$ trick more or less a proof in itself?

Assume that $f\left(B\right) = 0_n$. Then, $0_n = f\left(B\right) = A_kB^k + A_{k-1}B^{k-1} + \cdots + A_0 = \sum\limits_{i=0}^k A_iB^i$. But

$\lambda^k A_k + \lambda^{k-1}A_{k-1} + \cdots + A_0 = \sum\limits_{i=0}^k \lambda^i A_i = \sum\limits_{i=0}^k \lambda^i A_i - \sum\limits_{i=0}^k A_iB^i$ (since $0_n = \sum\limits_{i=0}^k A_iB^i$)

$= \sum\limits_{i=0}^k A_i \left(\lambda^i-B^i\right)$.

This polynomial is divisible on the right by $\lambda I_n - B$ (because $\lambda^i I_n - B^i$ is divisible on the right by $\lambda I_n - B$ for every $i$). Hence,

$\det\left(\lambda^k A_k + \lambda^{k-1}A_{k-1} + \cdots + A_0\right)$ is divisible by $\det\left(\lambda I_n - B\right)$.
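Spelled out: $\lambda^i I_n - B^i = \left(\sum\limits_{j=0}^{i-1} \lambda^{j} B^{i-1-j}\right)\left(\lambda I_n - B\right)$ for every $i \geq 1$, so $\sum\limits_{i=0}^k A_i\left(\lambda^i I_n - B^i\right) = Q\left(\lambda\right)\left(\lambda I_n - B\right)$ with $Q\left(\lambda\right) = \sum\limits_{i=1}^k A_i \sum\limits_{j=0}^{i-1} \lambda^{j} B^{i-1-j}$; taking determinants of both sides and using the computation above for the left-hand side gives $g\left(\lambda\right) = \det Q\left(\lambda\right) \cdot \det\left(\lambda I_n - B\right)$.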

In other words, $g\left(\lambda\right)$ is divisible by $\det\left(\lambda I_n - B\right)$ (since $\det\left(\lambda^k A_k + \lambda^{k-1}A_{k-1} + \cdots + A_0\right) = g\left(\lambda\right)$). Since $B$ is a root of the polynomial $\det\left(\lambda I_n - B\right)$ (by the usual Cayley-Hamilton theorem), this yields that $B$ is a root of $g\left(\lambda\right)$, so that $g\left(B\right) = 0$, and we are done.
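For anyone who wants to sanity-check this numerically, here is a small SymPy sketch (assuming SymPy is available; the construction of $A_0$ from the other coefficients, which forces $f(B)=0$ by design, and all names are just for illustration):

```python
# Sanity check of the claim: pick B and A_1, ..., A_k with integer entries,
# then *define* A_0 := -(A_k B^k + ... + A_1 B), which forces f(B) = 0.
# The claim predicts that charpoly(B) divides g, and hence g(B) = 0.
import sympy as sp

lam = sp.symbols('lambda')
n, k = 3, 2

B = sp.randMatrix(n, n, min=-3, max=3, seed=1)
A = {i: sp.randMatrix(n, n, min=-3, max=3, seed=10 + i) for i in range(1, k + 1)}
A[0] = -sum((A[i] * B**i for i in range(1, k + 1)), sp.zeros(n, n))

# f(B) = sum_i A_i B^i is the zero matrix by construction.
assert sum((A[i] * B**i for i in range(k + 1)), sp.zeros(n, n)) == sp.zeros(n, n)

# g(lambda) = det(sum_i lambda^i A_i), a scalar polynomial of degree <= k*n.
g = sp.expand(sum((lam**i * A[i] for i in range(k + 1)), sp.zeros(n, n)).det())

# The characteristic polynomial of B should divide g ...
char_B = B.charpoly(lam).as_expr()
quotient, remainder = sp.div(g, char_B, lam)
print("remainder of g modulo charpoly(B):", sp.expand(remainder))   # expect 0

# ... and therefore g(B) = 0 (Horner evaluation of g at the matrix B).
gB = sp.zeros(n, n)
for c in sp.Poly(g, lam).all_coeffs():
    gB = gB * B + c * sp.eye(n)
print("g(B) == 0:", gB == sp.zeros(n, n))
```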

I agree with Yazdegerd III that the characteristic-$0$ assumption shouldn't be there. Even if my proof did use it, Ilya's observation that the result is a polynomial identity in the entries of $A_k$, $A_{k-1}$, $\ldots$, $A_1$ and $B$ should make it clear that it holds over any commutative ring.


To convince people that this is true, let me add a quick proof that works in the special case in which $B$ is diagonalizable with distinct eigenvalues.

By a change of basis, we can assume that $B$ is in fact diagonal and equal to $\operatorname{diag}(\lambda_1,\lambda_2,\dots,\lambda_n)$. We have $0=\sum A_i B^i e_j=\sum A_i\lambda_j^i e_j$ for the $j$-th vector $e_j$ of the canonical basis. Hence for each $j$ the matrix $\sum A_i \lambda_j^i$ is singular, so the $\lambda_j$ are all roots of $g(\lambda)=\det\left(\sum A_i\lambda^i\right)$. In particular, since $B$ is diagonal, $g(B)=\operatorname{diag}(g(\lambda_1),\dots,g(\lambda_n))=0$.

More care is necessary if $B$ has multiple eigenvalues and Jordan blocks, of course.