Why is the 'change-of-basis matrix' called such?

The situation here is closely related to the following one: say you have some real function $f(x)$ and you want to shift its graph to the right by a positive constant $a$. Then the correct thing to do to the function is to shift $x$ over to the left; that is, the new function is $f(x - a)$. In essence you have shifted the graph to the right by shifting the coordinate axes to the left.
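For instance, with $f(x) = x^2$ and $a = 2$, the shifted function is $f(x-2) = (x-2)^2$: the same parabola, but with its vertex moved from $x = 0$ to $x = 2$.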

In this situation, if you have a vector $v$ expressed in some basis $e_1, \dots, e_n$, and you want to express it in a new basis $Pe_1, \dots, Pe_n$ (this is why $P$ is called the change of basis matrix), then you multiply the coordinate vector of $v$ by $P^{-1}$. You should carefully work through some numerical examples to convince yourself that this is correct. Consider, for example, the simple case where $P$ is multiplication by a scalar.
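If it helps, here is a minimal numerical sketch of that scalar case (the names and numbers are just illustrative, nothing canonical): doubling the basis vectors halves the components.

```python
import numpy as np

# Old basis: the standard basis e1, e2 of R^2.
# New basis: P e1, P e2 with P = 2I, i.e. every basis vector doubled in length.
P = 2 * np.eye(2)

v = np.array([4.0, 6.0])       # components of v in the old basis: v = 4 e1 + 6 e2
v_new = np.linalg.inv(P) @ v   # components of the *same* vector in the new basis

print(v_new)                   # [2. 3.]  -- longer basis vectors, smaller components
```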

The lesson here is that one must carefully distinguish between a vector and the components used to express it in a particular basis. Basis vectors transform covariantly (by $P$), but components transform contravariantly (by $P^{-1}$).


Everybody studying change of basis should work out some simple examples like the following. Consider this basis of $\mathbb{R}^2$:

$$ v_1 = (1,1) \qquad \text{and} \qquad v_2 = (1,-1) \ . $$

Or, since we are going to stress the distinction between bases and coordinates, we could write it this way

$$ v_1 = (1,1)_e \qquad \text{and} \qquad v_2 = (1,-1)_e \ , $$

since these are coordinates in the standard basis

$$ e_1 = (1,0) \qquad \text{and} \qquad e_2 = (0,1) \ . $$

The change of basis matrix from $v$ to $e$ is

$$ P = \begin{pmatrix} 1 & 1 \\\ 1 & -1 \end{pmatrix} \ . $$

Now, take the vector

$$ u = 2v_1 - 3v_2 \ . $$

Its coordinates in the $v$ basis are:

$$ u = (2,-3)_v \ . $$

If you want to obtain its coordinates in the $e$ (standard) basis, you can do it by hand:

$$ u = 2v_1 - 3v_2 = 2(1,1)_e -3(1,-1)_e = (2-3, 2+3)_e = (-1, 5)_e \ . $$

Now, you realise that these are exactly the same operations that you do when performing this matrix multiplication:

$$ P \begin{pmatrix} 2 \\\ -3 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\\ 1 & -1 \end{pmatrix} \begin{pmatrix} 2 \\\ -3 \end{pmatrix} = \begin{pmatrix} 2 - 3 \\\ 2 + 3 \end{pmatrix} = \begin{pmatrix} -1 \\\ 5 \end{pmatrix} \ . $$
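Just as a sanity check, the same computation in numpy (a small sketch reproducing the numbers above):

```python
import numpy as np

# Change of basis matrix from v to e: the columns are v1 and v2 written in the e basis.
P = np.array([[1,  1],
              [1, -1]])

u_v = np.array([2, -3])        # coordinates of u in the v basis
u_e = P @ u_v                  # coordinates of u in the e basis

print(u_e)                     # [-1  5]
print(np.linalg.inv(P) @ u_e)  # [ 2. -3.]  -- going back needs the inverse of P
```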

Exercise. Maybe now you could redo the proof of the change of basis theorem yourself: take two arbitrary bases $v$ and $e$ of any finite-dimensional vector space, related by

$$ v_i = a^1_i e_1 + \cdots + a^n_i e_n \ , \qquad i = 1, \dots , n \ . $$

Write down the change of basis matrix from $v$ to $e$ (that is, put the coordinates of the $v$ vectors as columns, like in the previous example):

$$ P = \begin{pmatrix} a^1_1 & \dots & a^1_n \\\ \vdots & \ddots & \vdots \\\ a^n_1 & \dots & a^n_n \end{pmatrix} \ , $$

take any vector

$$ u = b^1v_1 + \cdots + b^nv_n \ , $$

and write down its coordinates in the $v$ basis. Finally, find out its coordinates in the $e$ basis (by hand and with the help of the matrix $P$).
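If you want to check your proof numerically, one possible sketch (a random basis of $\mathbb{R}^n$, written in the standard basis; the variable names are arbitrary) is to verify that the "by hand" expansion $b^1 v_1 + \cdots + b^n v_n$ agrees with $P$ applied to the $v$-coordinates:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Columns of P hold the coordinates of v_1, ..., v_n in the e (standard) basis;
# a random matrix is almost surely invertible, so its columns form a basis.
P = rng.standard_normal((n, n))

b = rng.standard_normal(n)                 # coordinates b^1, ..., b^n of u in the v basis
u = sum(b[i] * P[:, i] for i in range(n))  # u = b^1 v_1 + ... + b^n v_n, expanded by hand

print(np.allclose(u, P @ b))               # True: the expansion agrees with P times b
```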


One major reason is practical. The matrix that converts coordinates with respect to the new basis into coordinates with respect to the old basis is easy to come by: you just put the new basis vectors (written in the old coordinates) as the columns of the matrix.

Then to find the matrix going the other way around, you have to compute the inverse of this matrix.

Thus, it makes sense to call the first one $P$, and the second one $P^{-1}$.
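To make the practical point concrete, a small sketch (basis and coordinates chosen arbitrarily): building $P$ is just stacking the new basis vectors as columns, while going the other way means inverting, or better, solving a linear system.

```python
import numpy as np

# New basis vectors, written in the old (standard) coordinates, go in as columns.
new_basis = [np.array([1.0, 1.0]), np.array([1.0, -1.0])]
P = np.column_stack(new_basis)

c_new = np.array([2.0, -3.0])        # coordinates of some vector in the new basis
c_old = P @ c_new                    # new -> old: just multiply by P

c_back = np.linalg.solve(P, c_old)   # old -> new: solve P x = c_old (same as applying P^{-1})
print(c_old, c_back)                 # [-1.  5.] [ 2. -3.]
```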