Solving a system of linear equations in a non-square matrix

Whether or not your matrix is square is not what determines the solution space. It is the rank of the matrix compared to the number of columns that determines that (see the rank-nullity theorem). In general, a linear system of equations can have zero, one, or infinitely many solutions, depending on the relationship between its rank and its nullity, and on whether the right-hand side lies in the column space.
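
For a concrete way to check which case you are in, you can compare rank(A) with the rank of the augmented matrix [A | b] and with the number of unknowns. Here is a minimal NumPy sketch of that check (the example systems and the helper name classify_system are just made up for illustration):

import numpy as np

def classify_system(A, b):
    # Compare rank(A), rank([A | b]) and the number of unknowns.
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    rank_A = np.linalg.matrix_rank(A)
    rank_Ab = np.linalg.matrix_rank(np.hstack([A, b]))
    if rank_A < rank_Ab:
        return "no solution"                 # b is outside the column space of A
    if rank_A == A.shape[1]:
        return "unique solution"             # nullity is zero
    return "infinitely many solutions"       # nullity = columns - rank > 0

print(classify_system([[1, 1]], [2]))             # infinitely many solutions
print(classify_system([[1], [1]], [1, 2]))        # no solution
print(classify_system([[1, 0], [0, 1]], [3, 4]))  # unique solution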

To answer your question, however, you can use Gaussian elimination to find the rank of the matrix and, if this indicates that solutions exist, find a particular solution x0 and the nullspace Null(A) of the matrix. Then you can describe all solutions as x = x0 + xn, where xn is any element of Null(A). For example, if the matrix has full column rank, its nullspace is trivial (it contains only the zero vector) and the linear system has at most one solution; if the rank also equals the number of rows, that solution exists and is unique. If the nullspace has dimension one, the solution set is a line passing through x0, and any point on that line satisfies the linear equations.
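
Here is a minimal NumPy/SciPy sketch of that recipe, with np.linalg.lstsq standing in for Gaussian elimination to get a particular solution and scipy.linalg.null_space providing a basis for Null(A); the example system is made up:

import numpy as np
from scipy.linalg import null_space

# Underdetermined but consistent system: 2 equations, 3 unknowns.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
b = np.array([6.0, 15.0])          # = A @ [1, 1, 1], so a solution exists

# A particular solution x0 (lstsq returns an exact one here, since b lies
# in the column space of A).
x0, *_ = np.linalg.lstsq(A, b, rcond=None)

# An orthonormal basis N for Null(A): every solution is x = x0 + N @ c.
N = null_space(A)

print(np.allclose(A @ x0, b))      # True: x0 solves the system
print(np.allclose(A @ N, 0))       # True: columns of N lie in the nullspace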


Ok, first off: a non-square system of equations can have an exact solution:

[ 1 0 0 ][x] = [1]
[ 0 0 1 ][y]   [1]
         [z] 

clearly has a solution (actually, it has a 1-dimensional family of solutions: x = z = 1, with y free). Even if the system is overdetermined instead of underdetermined, it may still have a solution:

[ 1 0 ][x] = [1]
[ 0 1 ][y]   [1]
[ 1 1 ]      [2]

(x = y = 1). You may want to start by looking at least-squares solution methods, which find the exact solution if one exists, and "the best" approximate solution (in the least-squares sense) if one does not.
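
If you want to try this numerically, np.linalg.lstsq recovers the exact solutions of both examples above (for the underdetermined one it picks the minimum-norm member of the solution family); a small sketch:

import numpy as np

# Underdetermined example from above: lstsq returns the minimum-norm
# exact solution, i.e. one member of the x = z = 1 family (with y = 0).
A1 = np.array([[1.0, 0.0, 0.0],
               [0.0, 0.0, 1.0]])
b1 = np.array([1.0, 1.0])
x1, *_ = np.linalg.lstsq(A1, b1, rcond=None)
print(x1)                          # [1. 0. 1.]

# Overdetermined example from above: consistent, so the least-squares
# solution is also the exact solution.
A2 = np.array([[1.0, 0.0],
               [0.0, 1.0],
               [1.0, 1.0]])
b2 = np.array([1.0, 1.0, 2.0])
x2, *_ = np.linalg.lstsq(A2, b2, rcond=None)
print(x2)                          # [1. 1.]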


Take Ax = b, with A having m rows and n columns. We are not guaranteed to have one and only one solution; in many cases that is because we have more equations than unknowns (m bigger than n). This often comes from repeated measurements, which we actually want because we are cautious about the influence of noise.

If we observe that we cannot find a solution, that actually means there is no way to reach b while travelling within the column space spanned by A (since Ax is just a linear combination of the columns of A, with the entries of x as weights).

We can, however, ask for the point in the space spanned by A that is nearest to b. How can we find such a point? Walking on a plane, the closest you can get to a point outside it is to walk until you are directly below it. Geometrically speaking, that is the spot where your line of sight to the point is perpendicular to the plane.

Now that is something we can formulate mathematically. A perpendicular error vector reminds us of orthogonal projections, and that is what we are going to use. The simplest case, projecting onto a single column vector a, tells us to multiply by a.T; with the whole matrix we multiply by A.T, which forces the error b - Ax to be perpendicular to every column of A (A.T (b - Ax) = 0).

For our equation, let us apply that transformation to both sides: A.T A x = A.T b (the normal equations). The last step is to solve for x by inverting A.T A, which is possible when A has full column rank:

x = (A.T A)^-1 * A.T b
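
Here is a numerical sketch of this formula in NumPy, on a made-up overdetermined example; it solves the normal equations rather than explicitly forming the inverse of A.T A, which is the better-conditioned choice:

import numpy as np

# Overdetermined example with a little noise, so no exact solution exists.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 1.0, 2.1])

# Normal equations: A.T A x = A.T b.
x = np.linalg.solve(A.T @ A, A.T @ b)

# The error b - A x is perpendicular to every column of A ...
print(np.allclose(A.T @ (b - A @ x), 0))        # True

# ... and the result matches the library least-squares routine.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x, x_lstsq))                  # True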