Is a linear transformation onto or one-to-one?

"One-to-one" and "onto" are properties of functions in general, not just linear transformations.

Definition. Let $f\colon X\to Y$ be a function.

  • $f$ is one-to-one if and only if for every $y\in Y$ there is at most one $x\in X$ such that $f(x)=y$; equivalently, if and only if $f(x_1)=f(x_2)$ implies $x_1=x_2$.
  • $f$ is onto (or onto $Y$, if the codomain is not clear from context) if and only if for every $y\in Y$ there is at least one $x\in X$ such that $f(x)=y$.

These definitions apply to linear transformations as well, in particular to linear transformations $T\colon \mathbb{R}^n\to\mathbb{R}^m$, and by extension to matrices, since an $m\times n$ matrix $A$ can be identified with the linear transformation $L_A\colon\mathbb{R}^n\to\mathbb{R}^m$ given by $L_A(\mathbf{x}) = A\mathbf{x}$.

So, the definitions are for any functions. But when our sets $X$ and $Y$ have more structure to them, and the functions are not arbitrary but special kinds of functions, we can often obtain other ways of characterizing a function as one-to-one or onto that are easier to check, more useful, more conceptual, or have interesting applications. This is indeed the case when we have such a rich structure as linear transformations and vector spaces.

One-to-one is probably the easiest; this is because whether a function is one-to-one depends only on its domain, and not on its codomain. By contrast, whether a function is onto depends on both the domain and the codomain (so, for instance, $f(x)=x^2$ is onto if we think of it as a function $f\colon\mathbb{R}\to[0,\infty)$, but not if we think of it as a function $f\colon\mathbb{R}\to\mathbb{R}$, or $f\colon[2,\infty)\to[0,\infty)$).

Theorem. Let $T\colon\mathbb{R}^n\to\mathbb{R}^m$ be a linear transformation. The following are equivalent:

  1. $T$ is one-to-one.
  2. $T(\mathbf{x})=\mathbf{0}$ has only the trivial solution $\mathbf{x}=\mathbf{0}$.
  3. If $A$ is the standard matrix of $T$, then the columns of $A$ are linearly independent.

Proof. The equivalence of (1) and (2) is basic in linear algebra, so let's deal with that:

(1)$\Rightarrow$(2): Suppose $T$ is one-to-one. Since $T$ is linear, $T(\mathbf{0})=\mathbf{0}$; so if $T(\mathbf{x})=\mathbf{0}=T(\mathbf{0})$, then $\mathbf{x}=\mathbf{0}$. This proves (2).

(2)$\Rightarrow$(1): Suppose $T(\mathbf{x}_1)=T(\mathbf{x}_2)$. Then $$\mathbf{0} = T(\mathbf{x}_1) - T(\mathbf{x}_2) = T(\mathbf{x}_1-\mathbf{x}_2),$$ since $T$ is linear; because we are assuming (2), $T(\mathbf{x}_1-\mathbf{x}_2)=\mathbf{0}$ implies that $\mathbf{x}_1-\mathbf{x}_2 = \mathbf{0}$, so $\mathbf{x}_1=\mathbf{x}_2$, which proves that $T$ is indeed one-to-one.

The key to the connection with (3) (and eventually to your confusion) is that multiplying a matrix by a vector can be seen as an operation on columns. If $$A=\left(\begin{array}{cccc} a_{11} & a_{12} & \ldots & a_{1n}\\ a_{21} & a_{22} & \ldots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{m1} & a_{m2} & \ldots & a_{mn} \end{array}\right),$$ then let $A_1, A_2,\ldots,A_n$ denote the columns of $A$: $$A_1 = \left(\begin{array}{c}a_{11}\\a_{21}\\ \vdots\\a_{m1}\end{array}\right),\quad A_2 = \left(\begin{array}{c}a_{12}\\a_{22}\\ \vdots\\ a_{m2}\end{array}\right),\quad\ldots,\quad A_n = \left(\begin{array}{c}a_{1n}\\a_{2n}\\ \vdots\\ a_{mn}\end{array}\right).$$ Then we have the following: $$A\left(\begin{array}{c}x_1\\x_2\\ \vdots\\x_n\end{array}\right) = x_1 A_1 + x_2 A_2 + \cdots + x_nA_n.$$ That is, multiplying $A$ by $\mathbf{x}$ gives a linear combination of the columns of $A$, with the entries of $\mathbf{x}$ as the coefficients. This gives the direct connection we need between conditions (1) and (2), and condition (3).
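As a small concrete check of this column view (the matrix below is only an illustrative choice, not one taken from the question): $$\left(\begin{array}{cc}1 & 2\\3 & 4\\5 & 6\end{array}\right)\left(\begin{array}{c}x_1\\x_2\end{array}\right) = \left(\begin{array}{c}x_1+2x_2\\3x_1+4x_2\\5x_1+6x_2\end{array}\right) = x_1\left(\begin{array}{c}1\\3\\5\end{array}\right) + x_2\left(\begin{array}{c}2\\4\\6\end{array}\right).$$ The right-hand side is exactly $x_1$ times the first column plus $x_2$ times the second column.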

(2)$\Rightarrow$(3): To show that the columns of $A$ are linearly independent, we need to show that if $\alpha_1A_1 + \cdots + \alpha_nA_n = \mathbf{0}$, then $\alpha_1=\cdots=\alpha_n=0$. So suppose $\alpha_1A_1+\cdots+\alpha_nA_n = \mathbf{0}$. Then $$T(\mathbf{\alpha}) = A\mathbf{\alpha} = \alpha_1A_1+\cdots+\alpha_nA_n = \mathbf{0},\qquad\text{where }\mathbf{\alpha}=\left(\begin{array}{c}\alpha_1\\ \alpha_2\\ \vdots\\ \alpha_n\end{array}\right).$$ Because we are assuming (2), from $T(\mathbf{\alpha}) = \mathbf{0}$ we can conclude that $\mathbf{\alpha}=\mathbf{0}$; therefore, $\alpha_1=\cdots=\alpha_n=0$. This proves that $A_1,\ldots,A_n$ are linearly independent.

(3)$\Rightarrow$(2): Suppose the columns of $A$ are linearly independent, and $$\mathbf{0} = T(\mathbf{x}) = A\mathbf{x}\quad\text{where }\mathbf{x}=\left(\begin{array}{c}a_1\\a_2\\ \vdots\\ a_n\end{array}\right).$$ This means that $a_1A_1 + \cdots + a_nA_n = \mathbf{0}$; since the columns of $A$ are assumed to be linearly independent, we conclude that $a_1=\cdots=a_n=0$, so $\mathbf{x}=\mathbf{0}$, proving (2). QED
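To see the theorem at work, here are two small examples (the matrices are illustrative choices of mine). Take $$A = \left(\begin{array}{cc}1 & 0\\0 & 1\\1 & 1\end{array}\right),\qquad A\left(\begin{array}{c}x_1\\x_2\end{array}\right) = \left(\begin{array}{c}x_1\\x_2\\x_1+x_2\end{array}\right).$$ If $A\mathbf{x}=\mathbf{0}$, the first two entries force $x_1=x_2=0$, so only the trivial solution exists; the columns are linearly independent and $\mathbf{x}\mapsto A\mathbf{x}$ is one-to-one. By contrast, for $$B = \left(\begin{array}{cc}1 & 2\\2 & 4\end{array}\right)\qquad\text{we have}\qquad B\left(\begin{array}{c}2\\-1\end{array}\right) = 2\left(\begin{array}{c}1\\2\end{array}\right) - \left(\begin{array}{c}2\\4\end{array}\right) = \mathbf{0},$$ a nontrivial solution of $B\mathbf{x}=\mathbf{0}$. So the columns of $B$ are linearly dependent, and $\mathbf{x}\mapsto B\mathbf{x}$ is not one-to-one.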

What about onto? There are two things here. One is a theorem similar to the one above; the other is the Rank-Nullity Theorem.

Theorem. Let $T\colon\mathbb{R}^n\to\mathbb{R}^m$ be a linear transformation. The following are equivalent:

  1. $T$ is onto.
  2. The equation $T(\mathbf{x})=\mathbf{b}$ has solutions for every $\mathbf{b}\in\mathbb{R}^m$.
  3. If $A$ is the standard matrix of $T$, then the columns of $A$ span $\mathbb{R}^m$. That is: every $\mathbf{b}\in\mathbb{R}^m$ is a linear combination of the columns of $A$.

Proof. (1)$\Leftrightarrow$(2) is essentially the definition, only cast in terms of equations for the sake of similarity to the previous theorem.

(2)$\Rightarrow$(3) Let $\mathbf{b}\in\mathbb{R}^m$. Then by (2) there exists an $\mathbf{a}\in\mathbb{R}^n$ such that $T(\mathbf{a})=\mathbf{b}$. We have: $$\mathbf{b} = T(\mathbf{a}) = A\mathbf{a} = A\left(\begin{array}{c}a_1\\a_2\\ \vdots\\a_n\end{array}\right) = a_1A_1 + a_2A_2 + \cdots + a_nA_n.$$ That is, we can express $\mathbf{b}$ as a linear combination of the columns of $A$. Since $\mathbf{b}$ is arbitrary, every vector in $\mathbb{R}^m$ can be expressed as a linear combination of the columns of $A$, so the columns of $A$ span $\mathbb{R}^m$; this proves (3).

(3)$\Rightarrow$(2) Suppose the columns of $A$ span $\mathbb{R}^m$ and let $\mathbf{b}\in\mathbb{R}^m$. We want to show that $T(\mathbf{x})=\mathbf{b}$ has at least one solution.

Since the columns of $A$ span $\mathbb{R}^m$, there exist scalars $\alpha_1,\ldots,\alpha_n$ such that $$\mathbf{b} = \alpha_1 A_1 + \cdots + \alpha_n A_n = A\left(\begin{array}{c}\alpha_1\\ \alpha_2\\ \vdots\\ \alpha_n\end{array}\right) = T(\mathbf{\alpha}).$$ So $\mathbf{\alpha}$, where $$\mathbf{\alpha} = \left(\begin{array}{c}\alpha_1\\ \alpha_2\\ \vdots\\ \alpha_n\end{array}\right),$$ is a solution to $T(\mathbf{x})=\mathbf{b}$. This establishes (2). QED
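Again, two small illustrative examples (the matrices are my own choices). Take $$A = \left(\begin{array}{ccc}1 & 0 & 1\\0 & 1 & 1\end{array}\right).$$ Given any $\mathbf{b}=\left(\begin{array}{c}b_1\\b_2\end{array}\right)\in\mathbb{R}^2$, we have $$\mathbf{b} = b_1\left(\begin{array}{c}1\\0\end{array}\right) + b_2\left(\begin{array}{c}0\\1\end{array}\right) + 0\left(\begin{array}{c}1\\1\end{array}\right) = A\left(\begin{array}{c}b_1\\b_2\\0\end{array}\right),$$ so the columns of $A$ span $\mathbb{R}^2$ and $\mathbf{x}\mapsto A\mathbf{x}$ is onto. On the other hand, for $$B = \left(\begin{array}{cc}1 & 2\\2 & 4\end{array}\right)$$ every linear combination of the columns is a multiple of $\left(\begin{array}{c}1\\2\end{array}\right)$. The vector $\left(\begin{array}{c}1\\0\end{array}\right)$ is not such a multiple, so $B\mathbf{x}=\left(\begin{array}{c}1\\0\end{array}\right)$ has no solution and $\mathbf{x}\mapsto B\mathbf{x}$ is not onto $\mathbb{R}^2$.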

So: "one-to-one"-ness is related to linear independence; "onto"-ness is related to spanning properties. Note that linear independence is an intrinsic property (it depends only on the set of vectors), whereas spanning is an extrinsic property (it depends also on the space we are considering; it is contextual). This matches the fact that whether a function is one-to-one or not depends only on the domain, but whether it is onto depends on both the domain and the codomain of the function.

But there is a deep connection between the two. Remember the following:

Definition. Let $A$ be an $m\times n$ matrix. The nullity of $A$, $\mathrm{nullity}(A)$, is the dimension of the kernel of $A$, that is, of the subspace of $\mathbb{R}^n$ given by $$\mathrm{ker}(A) = \Bigl\{ \mathbf{x}\in\mathbb{R}^n\Bigm| A\mathbf{x}=\mathbf{0}\Bigr\}.$$ The rank of $A$, $\mathrm{rank}(A)$, is the dimension of the image of $A$; that is, of the subspace of $\mathbb{R}^m$ given by \begin{align*} \mathrm{Im}(A) &= \Bigl\{ \mathbf{b}\in\mathbb{R}^m\Bigm| A\mathbf{x}=\mathbf{b}\text{ has at least one solution}\Bigr\}\\ &= \Bigl\{ A\mathbf{x}\Bigm|\mathbf{x}\in\mathbb{R}^n\Bigr\}. \end{align*}

The deep connection between them is given by the Rank-Nullity Theorem:

Rank-Nullity Theorem. Let $A$ be an $m\times n$ matrix. Then $$\mathrm{rank}(A) + \mathrm{nullity}(A) = n.$$
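A quick sanity check of the theorem on a small matrix (an illustrative choice of mine): let $$A = \left(\begin{array}{ccc}1 & 0 & 1\\0 & 1 & 1\end{array}\right),\qquad m=2,\ n=3.$$ The equation $A\mathbf{x}=\mathbf{0}$ reads $x_1+x_3=0$ and $x_2+x_3=0$, so $$\mathrm{ker}(A) = \left\{ t\left(\begin{array}{c}-1\\-1\\1\end{array}\right)\Bigm| t\in\mathbb{R}\right\},\qquad \mathrm{nullity}(A)=1.$$ The first two columns of $A$ already span $\mathbb{R}^2$, so $\mathrm{Im}(A)=\mathbb{R}^2$ and $\mathrm{rank}(A)=2$. Indeed, $\mathrm{rank}(A)+\mathrm{nullity}(A) = 2+1 = 3 = n$.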

Now we get two more equivalences for one-to-one and onto:

Theorem. Let $T\colon\mathbb{R}^n\to\mathbb{R}^m$ be a linear transformation. The following are equivalent:

  1. $T$ is one-to-one.
  2. The equation $T(\mathbf{x})=\mathbf{0}$ has only the trivial solution $\mathbf{x}=\mathbf{0}$.
  3. If $A$ is the standard matrix of $T$, then the columns of $A$ are linearly independent.
  4. $\mathrm{ker}(A) = \{\mathbf{0}\}$.
  5. $\mathrm{nullity}(A) = 0$.
  6. $\mathrm{rank}(A) = n$.

Proof. The equivalence of (4) and (5) follows because only the trivial subspace has dimension $0$; the equivalence of (4) and (2) follows by definition of the kernel. The equivalence of (5) and (6) follows from the Rank-Nullity Theorem, since $n = \mathrm{nullity}(A)+\mathrm{rank}(A)$, so $\mathrm{nullity}(A) = 0$ if and only if $\mathrm{rank}(A) = n$. Since we already know (1), (2), and (3) are equivalent, the result follows. QED

Theorem. Let $T\colon\mathbb{R}^n\to\mathbb{R}^m$ be a linear transformation. The following are equivalent:

  1. $T$ is onto.
  2. The equation $T(\mathbf{x})=\mathbf{b}$ has solutions for every $\mathbf{b}\in\mathbb{R}^m$.
  3. If $A$ is the standard matrix of $T$, then the columns of $A$ span $\mathbb{R}^m$. That is: every $\mathbf{b}\in\mathbb{R}^m$ is a linear combination of the columns of $A$.
  4. $\mathrm{Im}(A) = \mathbb{R}^m$.
  5. $\mathrm{rank}(A) = m$.
  6. $\mathrm{nullity}(A) = n-m$.

Proof. We already know that (1), (2), and (3) are equivalent. The equivalence of (4) and (2) follows by the definition of the image. The equivalence of (4) and (5) follows because the only subspace of $\mathbb{R}^m$ that has dimension $m$ is the whole space. Finally, the equivalence of (5) and (6) follows from the Rank-Nullity Theorem: since $n = \mathrm{rank}(A)+\mathrm{nullity}(A)$, we have $\mathrm{nullity}(A) = n - \mathrm{rank}(A)$. So the rank equals $m$ if and only if the nullity equals $n-m$. QED

So now you have a whole bunch of ways of checking if a matrix is one-to-one, and of checking if a matrix is onto. None of them is "better" than the others: for some matrices, one will be easier to check, for other matrices, it may be a different one which is easy to check. Also, the rank of a matrix is closely related to its row-echelon form, so that might help as well.
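If you want to check the rank criteria mechanically, here is a minimal sketch in Python using only the standard library. It row-reduces with exact fractions and counts pivots; the helper names `rank`, `is_one_to_one`, and `is_onto` are hypothetical names chosen for this illustration, and the code assumes the matrix is given as a non-empty list of equal-length rows.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix given as a list of equal-length rows.

    Uses Gaussian elimination over exact fractions, so there are no
    floating-point tolerance issues. The rank is the number of pivots.
    """
    m = [[Fraction(v) for v in row] for row in rows]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far (also the next pivot row)
    for c in range(n_cols):
        # Look for a nonzero entry in column c, at or below row r.
        pivot = next((i for i in range(r, n_rows) if m[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        # Clear the entries below the pivot.
        for i in range(r + 1, n_rows):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == n_rows:
            break  # every row already has a pivot
    return r


def is_one_to_one(rows):
    """x -> Ax is one-to-one iff rank(A) equals the number of columns n."""
    return rank(rows) == len(rows[0])


def is_onto(rows):
    """x -> Ax is onto R^m iff rank(A) equals the number of rows m."""
    return rank(rows) == len(rows)
```

For instance, `is_onto([[1, 0, 1], [0, 1, 1]])` and `is_one_to_one([[1, 0], [0, 1], [1, 1]])` both hold, while `[[1, 2], [2, 4]]` has rank $1$ and is neither one-to-one nor onto.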

Note a few things: generally, "onto" and "one-to-one" are independent of one another. You can have a matrix be onto but not one-to-one; or be one-to-one but not onto; or be both; or be neither. The Rank-Nullity Theorem does place some restrictions: if $A$ is $m\times n$ and $m\lt n$, then the matrix cannot be one-to-one (because $0\leq\mathrm{rank}(A)\leq m\lt n$, so if $\mathrm{rank}(A)+\mathrm{nullity}(A) = n$, we must have $\mathrm{nullity}(A)\gt 0$); dually, if $m\gt n$ then $A$ cannot be onto (because $\mathrm{rank}(A)\leq n\lt m$). In particular, the only matrices that can be both one-to-one and onto are square matrices. On the other hand, you can have an $m\times n$ matrix with $m\lt n$ that is onto, or one that is not onto. And you can have $m\times n$ matrices with $m\gt n$ that are one-to-one, and matrices that are not one-to-one.
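Small examples of all four possibilities (the matrices are illustrative choices of mine): $$\left(\begin{array}{cc}1 & 0\end{array}\right),\qquad \left(\begin{array}{c}1\\0\end{array}\right),\qquad \left(\begin{array}{cc}1 & 0\\0 & 1\end{array}\right),\qquad \left(\begin{array}{cc}1 & 2\\2 & 4\end{array}\right).$$ The first is $1\times 2$ with rank $1=m$ and nullity $1$: onto but not one-to-one. The second is $2\times 1$ with rank $1=n\lt m$: one-to-one but not onto. The third is the $2\times 2$ identity, with rank $2=m=n$: both. The fourth is $2\times 2$ with rank $1$: neither.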


Let $T(x)=Ax$ be a linear transformation.

$T$ is one-to-one if and only if the columns of $A$ are linearly independent.

$T$ is onto if and only if every row of $A$ has a pivot (that is, an echelon form of $A$ has a pivot in every row).
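For instance (an illustrative matrix of my choosing), subtracting twice the first row from the second gives $$A = \left(\begin{array}{ccc}1 & 2 & 3\\2 & 4 & 7\end{array}\right)\ \longrightarrow\ \left(\begin{array}{ccc}1 & 2 & 3\\0 & 0 & 1\end{array}\right).$$ The echelon form has pivots in columns $1$ and $3$, one in each row, so $T$ is onto $\mathbb{R}^2$. Column $2$ has no pivot, which reflects the fact that the second column of $A$ is twice the first; the columns are linearly dependent, so $T$ is not one-to-one.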