Should the formula for the inverse of a 2x2 matrix be obvious?

Think about $\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$ as $tI - A$, where $t=a+d$ is the trace of $A$. Since $A$ satisfies its own characteristic equation (Cayley–Hamilton), we have $A^2 - t A + \Delta \cdot I = 0$, where $\Delta = ad-bc$ is the determinant. Thus $\Delta \cdot I = t A - A^2 = A(tI - A)$. Now multiply both sides by $\Delta^{-1}A^{-1}$ to get $A^{-1} = \Delta^{-1}(tI-A)$, QED.
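A quick numeric sketch of this identity (the sample entries are mine, chosen only so that $\Delta \neq 0$):

```python
# Check numerically that A^{-1} = (1/Delta) * (t*I - A) for an invertible
# 2x2 matrix, where t = trace(A) and Delta = det(A).
a, b, c, d = 2.0, 3.0, 1.0, 4.0
t = a + d                # trace
delta = a * d - b * c    # determinant (must be nonzero)

# Candidate inverse from the formula: (1/Delta) * (t*I - A).
# Note t - a = d and t - d = a, so this is (1/Delta) * [[d, -b], [-c, a]].
inv = [[(t - a) / delta, -b / delta],
       [-c / delta, (t - d) / delta]]

# Multiply A by the candidate and confirm we get the identity matrix.
A = [[a, b], [c, d]]
prod = [[sum(A[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]
print(prod)  # approximately [[1, 0], [0, 1]]
```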


EDIT (8/14/2020): A couple of people have suggested that this answer should come with a warning -- this is a pretty fancy approach to an elementary question, motivated by the fact that I know the OP's interests. Some of the other answers below are probably better if you just want to invert some matrices :). I've also fixed a couple of minor typos.


My favorite way to remember this is to think of $SL_2(\mathbb{R})$ as a circle bundle over the upper half-plane, where $SL_2(\mathbb{R})$ acts on the upper half-plane via fractional linear transformations; the bundle projection sends an element of $SL_2(\mathbb{R})$ to the image of $i$ under the corresponding fractional linear transformation. The fiber over a point is the corresponding coset of the stabilizer of $i$.
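A small numeric sketch of this bundle map (the helper name `act` is mine, not standard notation):

```python
import math

def act(g, z):
    """Fractional linear action of g = [[a, b], [c, d]] on a complex number z."""
    (a, b), (c, d) = g
    return (a * z + b) / (c * z + d)

i = complex(0, 1)
g = [[1.0, 2.0], [0.0, 1.0]]       # an upper triangular element: translation by 2
print(act(g, i))                    # (2+1j): the bundle map sends g to 2 + i

# The stabilizer of i consists of the rotation matrices, so every element
# of the coset g*K projects to the same point of the upper half-plane.
th = 0.7
k = [[math.cos(th), -math.sin(th)], [math.sin(th), math.cos(th)]]
print(act(k, i))                    # approximately i itself
```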

This naturally gives the Iwasawa decomposition of $SL_2(\mathbb{R})$ as $$SL_2(\mathbb{R})=NAK$$ where

$$K=\left\{\begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix} , ~0\leq\theta<2\pi \right\}$$

$$A=\left\{\begin{pmatrix} r & 0\\ 0 &1/r\end{pmatrix},~ r>0\right\}$$

$$N=\left\{\begin{pmatrix} 1 & x \\ 0 & 1\end{pmatrix},~ x\in \mathbb{R}\right\}$$

Here $K$ is the stabilizer of $i$ in the upper half-plane picture; viewed as acting on the plane via the usual action of $SL_2(\mathbb{R})$ on $\mathbb{R}^2$, it is just rotation by $\theta$ (and likewise if we view the upper half-plane as the unit disk, sending $i$ to $0$ via a fractional linear transformation). $A$ is scaling by $r^2$ in the upper half-plane picture, and stretching in the $\mathbb{R}^2$ picture. $N$ is translation by $x$ in the upper half-plane picture, and a shear in the $\mathbb{R}^2$ picture.

In each case, the inverse is geometrically obvious: for $K$, replace $\theta$ with $-\theta$; for $A$ replace $r$ with $1/r$, and for $N$, replace $x$ with $-x$. Since $$SL_2(\mathbb{R})=NAK$$ this lets us invert every $2\times 2$ matrix by "pure thought", at least if you remember the Iwasawa decomposition (which is easy from the geometric picture, I think). Of course this easily extends to $GL_2$; if $A$ has determinant $d$, then $A^{-1}$ had better have determinant $d^{-1}$.
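Here is a sketch of this recipe in code (the sample matrix and the way I recover the factors from $g \cdot i = x + iy$ are my own choices, not part of the answer): compute the Iwasawa factors of $g$, then invert each factor by the geometric rules $\theta \to -\theta$, $r \to 1/r$, $x \to -x$.

```python
import math

def mul(p, q):
    """2x2 matrix product."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

g = [[2.0, 1.0], [3.0, 2.0]]                  # det = 1, so g lies in SL_2(R)
w = (g[0][0] * 1j + g[0][1]) / (g[1][0] * 1j + g[1][1])  # g . i = x + iy
x, y = w.real, w.imag
r = math.sqrt(y)

n = [[1.0, x], [0.0, 1.0]]                    # translation by x
a = [[r, 0.0], [0.0, 1.0 / r]]                # scaling, r > 0
n_inv = [[1.0, -x], [0.0, 1.0]]               # x -> -x
a_inv = [[1.0 / r, 0.0], [0.0, r]]            # r -> 1/r
k = mul(mul(a_inv, n_inv), g)                 # k = (n a)^{-1} g is a rotation
k_inv = [[k[0][0], k[1][0]], [k[0][1], k[1][1]]]  # theta -> -theta (transpose)

g_inv = mul(mul(k_inv, a_inv), n_inv)         # g^{-1} = k^{-1} a^{-1} n^{-1}
print(mul(g, g_inv))                          # approximately the identity
```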

If you'd like to derive the formula you've written down by "pure thought" it suffices to look at any one of these cases if you remember the general form of the inverse; or you can simply put them all together to give a rigorous derivation.


Recall that the adjugate $\text{adj}(A)$ of a square matrix is a matrix that satisfies $$A \cdot \text{adj}(A) = \text{adj}(A) \cdot A = \det(A) \cdot I.$$
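A concrete sketch of this identity (the sample matrix is mine): build $\text{adj}(A)$ from cofactors for a $3\times 3$ matrix and check the defining equation directly.

```python
# adj(A) is the transpose of the cofactor matrix; here for the 3x3 case.
def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def minor(A, i, j):
    """Delete row i and column j."""
    return [[A[r][c] for c in range(3) if c != j] for r in range(3) if r != i]

def adjugate(A):
    # Entry (j, i) is the (i, j) cofactor -- note the transpose.
    return [[(-1) ** (i + j) * det2(minor(A, i, j)) for i in range(3)]
            for j in range(3)]

def mul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

A = [[1, 2, 3], [0, 1, 4], [5, 6, 0]]   # det(A) = 1
adj = adjugate(A)
print(mul(A, adj))  # det(A) * I = the identity, since det(A) = 1
```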

Like the determinant, the adjugate is multiplicative, up to reversing the order of the factors: $\text{adj}(AB) = \text{adj}(B)\,\text{adj}(A)$ (for the determinant the reversal is invisible, since scalars commute). Categorically, the reason the determinant is multiplicative is that it comes from a functor (the exterior power), so one might expect that the adjugate also comes from a functor, and indeed it does (the same functor!).
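For $2\times 2$ matrices $\text{adj}(M) = \text{tr}(M)\,I - M$, which makes this easy to test numerically (the sample matrices below are mine); note that the order of the factors matters:

```python
def mul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adj(m):
    """Adjugate of a 2x2 matrix: swap the diagonal, negate the off-diagonal."""
    return [[m[1][1], -m[0][1]], [-m[1][0], m[0][0]]]

P = [[1, 1], [0, 1]]
Q = [[1, 0], [1, 1]]
print(adj(mul(P, Q)) == mul(adj(Q), adj(P)))  # True
print(adj(mul(P, Q)) == mul(adj(P), adj(Q)))  # False: the order is reversed
```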

More precisely, let $T : V \to V$ be a linear transformation on a finite-dimensional vector space with basis $e_1, ... e_n$. Then the adjugate of the matrix of $T$ with respect to the basis $e_i$ is the matrix of $\Lambda^{n-1}(T) : \Lambda^{n-1}(V) \to \Lambda^{n-1}(V)$ with respect to an appropriate "dual basis" $$(-1)^{i-1} \bigwedge_{j \neq i} e_j$$ of $\Lambda^{n-1}(V)$ (it becomes an actual dual basis if you identify $\Lambda^n(V)$ with the underlying field $k$ by sending $e_1 \wedge ... \wedge e_n$ to $1$). The exterior product $V \times \Lambda^{n-1}(V) \to \Lambda^n(V)$ can then be identified with the dual pairing $V \times V^{\ast} \to k$, and the action of the exterior product on endomorphisms of $V$ and $\Lambda^{n-1}(V)$ can be identified with the composition of endomorphisms of $V$ (remembering that $\text{End}(V)$ is canonically isomorphic to $\text{End}(V^{\ast})$). This categorifies the above statement.

When $n = 2$, the dual basis is $e_2, - e_1$, and $\Lambda^1$ is the identity functor, so the formula follows. The geometric intuition comes from thinking about the exterior product in terms of oriented areas of parallelograms in $\mathbb{R}^2$.