Expressing $-\operatorname{adj}(A)$ as a polynomial in $A$?

Here is a direct proof along the lines of the standard proof of the Cayley–Hamilton theorem. [This works universally, i.e. over the commutative ring $R=\mathbb{Z}[a_{ij}]$ generated by the entries of a generic matrix $A$.]

The following lemma combining Abel's summation and Bezout's polynomial remainder theorem is immediate.

Lemma. Let $A(\lambda)$ and $B(\lambda)$ be matrix polynomials over a (possibly noncommutative) ring $S.$ Then $A(\lambda)B(\lambda)-A(0)B(0)=\lambda q(\lambda)$ for a polynomial $q(\lambda)\in S[\lambda]$ that can be expressed as

$$q(\lambda)=A(\lambda)\frac{B(\lambda)-B(0)}{\lambda}+\frac{A(\lambda)-A(0)}{\lambda}B(0)=A(\lambda)b(\lambda)+a(\lambda)B(0) \qquad (*)$$

with $a(\lambda),b(\lambda)\in S[\lambda].$


Now let $A(\lambda)=A-\lambda I_n$ and $B(\lambda)=\operatorname{adj} A(\lambda)$ [viewed as elements of $S[\lambda]$ with $S=M_n(R)$]. Then

$$A(\lambda)B(\lambda)=\det A(\lambda)\cdot I_n=p_A(\lambda)\,I_n, \qquad p_A(\lambda)=p_0+p_1\lambda+\ldots+p_n\lambda^n,$$ where $p_A(\lambda)$ is the characteristic polynomial of $A$. Identifying a scalar $p\in R$ with the scalar matrix $p\,I_n$, we have $$A(0)B(0)=p_0 \text{ and } q(\lambda)=p_1+p_2\lambda+\ldots+p_n\lambda^{n-1}.$$

Applying $(*),$ we get

$$q(\lambda)=(A-\lambda I)b(\lambda)-\operatorname{adj} A \qquad (**) $$

for some matrix polynomial $b(\lambda)$ commuting with $A$ [indeed $b(\lambda)=\frac{B(\lambda)-B(0)}{\lambda}$, and $B(\lambda)=\operatorname{adj} A(\lambda)$ commutes with $A(\lambda)=A-\lambda I$ and hence with $A$; likewise $B(0)=\operatorname{adj} A$]. Specializing $\lambda$ to $A$ in $(**)$ (legitimate since the coefficients of $b(\lambda)$ commute with $A$), we conclude that

$$q(A)=-\operatorname{adj} A\qquad \square$$
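For the skeptical reader, here is a quick numeric sanity check (my own illustration, not part of the proof): for a concrete $3\times 3$ integer matrix we compute the coefficients of $p_A(\lambda)=\det(A-\lambda I)$ directly and confirm that $q(A)=-\operatorname{adj} A$.

```python
# Sanity check of q(A) = -adj(A) for a concrete 3x3 integer matrix.
# Coefficients refer to p_A(l) = det(A - l*I) = p0 + p1*l + p2*l^2 + p3*l^3.

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)

def adj3(M):
    # adjugate = transpose of the cofactor matrix
    (a, b, c), (d, e, f), (g, h, i) = M
    return [[e*i - f*h, c*h - b*i, b*f - c*e],
            [f*g - d*i, a*i - c*g, c*d - a*f],
            [d*h - e*g, b*g - a*h, a*e - b*d]]

A = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 4]]

# For n = 3: det(A - l*I) = det(A) - m2*l + tr(A)*l^2 - l^3,
# where m2 is the sum of the principal 2x2 minors.
tr = A[0][0] + A[1][1] + A[2][2]
m2 = (A[0][0]*A[1][1] - A[0][1]*A[1][0]
    + A[0][0]*A[2][2] - A[0][2]*A[2][0]
    + A[1][1]*A[2][2] - A[1][2]*A[2][1])
p1, p2, p3 = -m2, tr, -1

# q(A) = p1*I + p2*A + p3*A^2 should equal -adj(A)
I3 = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
A2 = matmul(A, A)
qA = [[p1*I3[i][j] + p2*A[i][j] + p3*A2[i][j] for j in range(3)]
      for i in range(3)]
negadj = [[-x for x in row] for row in adj3(A)]
print(qA == negadj)  # True
```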


HINT $\;$ Work "generically", i.e. let the entries $\;\rm a_{i,j}$ of $\rm A\;$ be indeterminates and work in the matrix ring $\rm M = M_n(R)\;$ over $\;\rm R = {\mathbb Z}[a_{i,j}\:]. \;$ We wish to prove $\rm B = C$ from $\rm d\: B = d\: C$ for $\rm d = det\: A \in R, \;\; B,C \in M.$ But this is equivalent to $\rm d\: b_{i,j} = d\: c_{i,j}$ in the domain $\rm R = {\mathbb Z}[a_{i,j}\:]$ where $\;\rm d = det\: A \ne 0$, so $\rm d$ is cancelable, yielding $\;\rm b_{i,j} = c_{i,j}\;$ hence $\rm B = C$. This identity remains true over every commutative ring $\rm S$ since, by the universality of polynomial rings, there exists an evaluation homomorphism that evaluates $\;\rm a_{i,j}\;$ at any $\;\rm s_{i,j}\in S$.

Notice that the crucial insight is that $\;\rm b_{i,j}\:, \; c_{i,j}\:,\; d\;$ have polynomial form in $\;\rm a_{i,j}\:$, i.e. they are elements of the polynomial ring $\;\rm R = {\mathbb Z}[a_{i,j}\:] = {\mathbb Z}[a_{1,1},\cdots,a_{n,n}\:]$ which, being a domain, enjoys cancelation of elements $\ne 0$. Working generically allows us to cancel $\rm d$ and deduce the identity before any evaluation where $\rm d\mapsto 0.$

Such proofs by way of universal polynomial identities emphasize the power of the abstraction of a formal polynomial (vs. polynomial function). Alas, many algebra textbooks fail to explicitly emphasize this universal viewpoint. As a result, many students cannot easily resist the obvious topological temptations and instead derive hairier proofs employing density arguments (e.g. see elsewhere in this thread).

Analogously, the same generic method of proof works for many other polynomial identities, e.g.

$\rm\quad\; det(I-AB) = det(I-BA)\;\:$ by taking $\;\rm det\;$ of $\;\;\rm (I-AB)\;A = A\;(I-BA)\;$ then canceling $\;\rm det \:A$

$\rm\quad\quad det(adj \:A) = (det \:A)^{n-1}\quad$ by taking $\;\rm det\;$ of $\;\rm\quad A\;(adj\: A) = (det\: A) \;I\quad\;\;$ then canceling $\;\rm det \:A$
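Both identities are easy to spot-check numerically; here is a small self-contained illustration (mine, and of course a spot-check, not a proof) on concrete $2\times 2$ integer matrices.

```python
# Spot-check of the two identities above on concrete 2x2 integer matrices.

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k]*Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sub(X, Y):
    return [[X[i][j] - Y[i][j] for j in range(2)] for i in range(2)]

def det2(M):
    return M[0][0]*M[1][1] - M[0][1]*M[1][0]

def adj2(M):
    return [[M[1][1], -M[0][1]], [-M[1][0], M[0][0]]]

I = [[1, 0], [0, 1]]
A = [[1, 2], [3, 5]]
B = [[7, 1], [2, 4]]

# det(I - AB) = det(I - BA)
print(det2(sub(I, matmul(A, B))) == det2(sub(I, matmul(B, A))))  # True

# det(adj A) = (det A)^(n-1), here with n = 2
print(det2(adj2(A)) == det2(A) ** (2 - 1))  # True
```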

Now, for our pièce de résistance of topology, we derive the polynomial derivative purely formally.

For $\rm f(x) \in R[x]$ define $\rm D f(x) = f_0(x,x)$ where $\rm f_0(x,y) = \frac{f(x)-f(y)}{x-y}.$ Note that the existence and uniqueness of this derivative follow from the Factor Theorem, i.e. $\;\rm x-y \; | \; f(x)-f(y)\;$ in $\;\rm R[x,y],\;$ and from the cancelation law $\;\rm (x-y) g = (x-y) h \implies g = h$ for $\rm g,h \in R[x,y].$ It's clear this agrees on polynomials with the analytic definition of the derivative since it is linear and takes the same value on the basis monomials $\rm x^n$. Resisting limits again, we get the product rule for derivatives from the trivial difference product rule

$$ \rm f(x)g(x) - f(y)g(y)\; = \;(f(x)-f(y)) g(x) + f(y) (g(x)-g(y))$$

$\quad\quad\quad\quad\rm\quad\quad\quad \Longrightarrow \quad\quad\quad\quad\quad\; D(fg)\quad = \quad (Df) \; g \; + \; f \; (Dg) $

by canceling $\rm x-y$ in the first equation and then evaluating at $\rm y = x$, i.e. we specialize the difference "quotient" obtained from the product rule for differences. Here the formal cancelation of the factor $\;\rm x-y\;$ before evaluation at $\;\rm y = x\;$ is precisely analogous to the formal cancelation of $\;\rm det \:A\;$ in all of the examples given above.
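For the curious, this purely formal derivative can even be carried out by machine. The sketch below (my own illustration; the helper names are ad hoc) computes $\rm f_0(x,y)=(f(x)-f(y))/(x-y)$ by exact synthetic division, with no limits anywhere, recovers $\rm Df$ by evaluating at $\rm y=x$, and then spot-checks the product rule.

```python
# Formal derivative with no limits: represent f by its coefficient list
# (f[k] = coefficient of x^k), divide f(x) - f(y) exactly by (x - y),
# then evaluate the quotient at y = x.

def yadd(p, q):
    # add two polynomials in y given as coefficient lists
    m = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(m)]

def diff_quotient(f):
    # Return (f(x) - f(y)) / (x - y) as a list q where q[i] is the
    # coefficient of x^i, itself a coefficient list in y.
    n = len(f) - 1
    a = [[f[k]] for k in range(n + 1)]             # f(x) - f(y) as poly in x
    a[0] = [0] + [-f[k] for k in range(1, n + 1)]  # constant term: f[0] - f(y)
    b = [None] * n
    b[n - 1] = a[n]                                # synthetic division by (x - y)
    for k in range(n - 2, -1, -1):
        b[k] = yadd(a[k + 1], [0] + b[k + 1])      # b_k = a_{k+1} + y*b_{k+1}
    rem = yadd(a[0], [0] + b[0])
    assert all(c == 0 for c in rem)                # Factor Theorem: remainder 0
    return b

def D(f):
    # Df(x) = f0(x, x): substitute y = x and collect powers of x
    if len(f) <= 1:
        return [0]
    out = [0] * (len(f) - 1)
    for i, ycoeffs in enumerate(diff_quotient(f)):
        for j, c in enumerate(ycoeffs):
            out[i + j] += c
    return out

def polymul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

# Product rule check: D(fg) = (Df)g + f(Dg), purely formally.
f = [1, -3, 0, 2]   # 2x^3 - 3x + 1
g = [4, 1, 5]       # 5x^2 + x + 4
lhs = D(polymul(f, g))
rhs = yadd(polymul(D(f), g), polymul(f, D(g)))
print(lhs == rhs)  # True
```

The `assert` inside `diff_quotient` is exactly the Factor Theorem in action: the division by $\rm x-y$ is exact, so no limit is ever needed.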


I guess it is worth giving a fuller answer, and then Victor can tell me more precisely where I am missing some subtlety. As I said, the definition I know of the adjugate is that it is a matrix whose entries are polynomials in the entries $a_{ij}$ of $A$ and which satisfies $A \operatorname{adj}(A) = (\det A)\, I$ identically, e.g. over $\mathbb{Z}[a_{ij}]$. Assuming Cayley–Hamilton, we know that $p_0 I + p_1 A + ... + p_n A^n = 0$ identically and that $p_0 = \det A$, where $p_k \in \mathbb{Z}[a_{ij}]$ as well.

Specializing now to $a_{ij} \in \mathbb{C}$ and supposing that $A$ is invertible, we conclude that

$$A \text{ adj}(A) = - p_1 A - p_2 A^2 - ... - p_n A^n$$

implies

$$\text{adj}(A) = - p_1 I - p_2 A - ... - p_n A^{n-1},$$

as you say.

Lemma: The invertible $n \times n$ matrices are dense in the $n \times n$ matrices with the operator norm topology.

Proof. Let $A$ be a non-invertible $n \times n$ matrix, so $\det A = 0$. The polynomial $\det(A - xI)$ has leading term $(-1)^n x^n$, hence is not identically zero and has only finitely many roots; so for arbitrarily small $x \neq 0$ the matrix $A - xI$ is invertible, and such matrices lie in every neighborhood of $A$. $\square$

But everything in sight is continuous in the operator norm topology, so the identity extends from the invertible matrices to all matrices over $\mathbb{C}$; and since it is a polynomial identity in the entries that holds on all of $\mathbb{C}^{n^2}$, it holds identically.
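As a concrete illustration of what the density argument buys (my own sanity check, not part of the argument): for $n = 2$ the claimed identity reads $\operatorname{adj}(A) = -p_1 I - p_2 A = \operatorname{tr}(A)\, I - A$, and it indeed survives at a singular matrix, where the "divide by $\det A$" derivation is unavailable.

```python
# For n = 2 the identity reads adj(A) = tr(A)*I - A, since
# det(A - x*I) = det(A) - tr(A)*x + x^2 gives p1 = -tr(A), p2 = 1.
# Check it on a singular matrix, where one cannot divide by det A.

A = [[1, 2], [2, 4]]                                   # det A = 0
tr = A[0][0] + A[1][1]
adjA = [[A[1][1], -A[0][1]], [-A[1][0], A[0][0]]]      # cofactor formula
trI_minus_A = [[tr - A[0][0], -A[0][1]],
               [-A[1][0], tr - A[1][1]]]
print(adjA == trI_minus_A)  # True
```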

(I should mention that this is not even my preferred method of proving matrix identities. Whenever possible, I try to prove them combinatorially by interpreting $A$ as the adjacency matrix of some graph. For example - confession time! - this is how I think about Cayley-Hamilton. This is far from the cleanest or the shortest way to do things, but my combinatorial intuition is better than my algebraic intuition and I think it's good to have as many different proofs of the basics as possible.)