To invert a matrix, the condition number should be less than what?

The problem of fitting a linear function that minimises the residuals is $$\min\limits_\beta \|X\beta-y\|_2^2,$$ which amounts to solving the linear system $X\beta=\mathcal{P}_X(y)$, where $\mathcal{P}_X(y)$ is the orthogonal projection of $y$ onto the space spanned by the columns of $X$. Equivalently, $\beta$ satisfies the normal equations $X^TX\beta=X^Ty$.
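As a short sketch of why the projection form and the system in $X^TX$ agree, set the gradient of the objective to zero:

$$\nabla_\beta\|X\beta-y\|_2^2 = 2X^T(X\beta-y)=0 \iff X^TX\beta = X^Ty.$$

The condition $X^T(X\beta-y)=0$ says that the residual $X\beta-y$ is orthogonal to every column of $X$, which is exactly the statement $X\beta=\mathcal{P}_X(y)$.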

The columns of $X$ are linearly dependent when, for example, two variables are perfectly correlated; in that case $X^TX$ is singular, i.e. $\kappa(X^TX)=\infty$. Usually this does not happen: the correlation is not perfect, but it can still be significant. That corresponds to a condition number that is large but finite. See also the comment by Mario Carneiro.
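To make the link between correlation and conditioning concrete, here is a minimal Python sketch. It rests on an assumption that is not in the question: there are just two predictors, centred and scaled to unit norm, so that $X^TX=\begin{pmatrix}1&\rho\\\rho&1\end{pmatrix}$ with $\rho$ their correlation. The eigenvalues are then $1\pm\rho$, and the 2-norm condition number is $(1+|\rho|)/(1-|\rho|)$. The function name is made up for illustration.

```python
import math

def kappa_two_predictors(rho):
    """2-norm condition number of [[1, rho], [rho, 1]].

    Assumption (not from the post): two centred, unit-norm columns with
    correlation rho, so X^T X is exactly this 2x2 matrix.
    Its eigenvalues are 1 + rho and 1 - rho.
    """
    r = abs(rho)
    if r >= 1.0:
        return math.inf  # perfectly correlated: X^T X is singular
    return (1.0 + r) / (1.0 - r)

for rho in (0.0, 0.5, 0.9, 0.99, 0.999999, 1.0):
    print(f"rho = {rho:<9} kappa(X^T X) = {kappa_two_predictors(rho):.3g}")
```

The condition number grows without bound as $|\rho|\to1$, but it stays finite for any imperfect correlation.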

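In terms of MATLAB computation, the relevant quantity is the double-precision machine epsilon, $\epsilon\approx2.22\times10^{-16}$. This is the gap between 1 and the next larger floating point number, not the smallest representable value. Your comment that a condition number of $10^k$ loses $k$ digits of precision indicates that the condition number should be less than $1/{\epsilon}$. MATLAB's mldivide function will warn you when the matrix is too close to singular for this to hold.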
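The rule of thumb can be checked with a few lines of arithmetic. The sketch below uses Python rather than MATLAB purely for illustration; `sys.float_info.epsilon` plays the role of the machine epsilon, and `digits_lost` is a hypothetical helper that simply applies the "condition number $10^k$ costs $k$ digits" heuristic.

```python
import math
import sys

eps = sys.float_info.epsilon   # 2**-52, about 2.22e-16 for IEEE double precision

def digits_lost(kappa):
    """Rule of thumb from the question: kappa = 10**k costs about k digits."""
    return math.log10(kappa)

total_digits = -math.log10(eps)   # roughly 15.65 decimal digits available
for kappa in (1e3, 1e8, 1e12, 1.0 / eps):
    remaining = max(0.0, total_digits - digits_lost(kappa))
    print(f"kappa = {kappa:.2e}: ~{digits_lost(kappa):.1f} digits lost, "
          f"~{remaining:.1f} left")
```

At $\kappa=1/\epsilon$ the heuristic leaves no correct digits, which is why $1/\epsilon$ is a natural upper limit rather than a safe target.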

To solve this problem, you have proposed using the normal equations. A more numerically stable approach is a QR factorisation of $X$, which avoids forming $X^TX$ (whose condition number is $\kappa(X)^2$); this is the approach taken by mldivide for rectangular systems.
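A small illustration of why forming $X^TX$ is risky, written in plain Python. The matrix and the value of `delta` are hypothetical, chosen so that `delta**2` falls below the machine epsilon. Here $X$ has full column rank and $\kappa(X)\approx\sqrt{2}/\delta\approx1.4\times10^{8}$, well within double precision, yet the computed $X^TX$ is exactly singular. The sketch only shows the loss of information in the normal equations; it does not implement QR.

```python
delta = 1e-8   # hypothetical value chosen so that delta**2 is below machine epsilon

# X has full column rank for any delta != 0.
X = [[1.0, 1.0],
     [delta, 0.0],
     [0.0, delta]]

def gram(X):
    """Form X^T X explicitly, as the normal equations require."""
    n = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(n)]
            for i in range(n)]

G = gram(X)
det_G = G[0][0] * G[1][1] - G[0][1] * G[1][0]

print(G)       # the delta**2 terms are absorbed into 1.0 by rounding
print(det_G)   # zero: the computed X^T X is exactly singular
```

A method that works on $X$ directly, such as QR, never squares the condition number and so does not hit this problem at this value of `delta`.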


Mathematically, if the condition number is less than $\infty$, the matrix is invertible. Numerically, roundoff errors occur, and a high condition number means that the matrix is almost non-invertible; for a computer, this can be just as bad. But there is no hard bound: the higher the condition number, the greater the error in the calculation. For a very high condition number, an intermediate value may round to 0 and then be divided by, causing an error. This is the same thing that would happen if you tried to invert a truly non-invertible matrix, which is why I say that a high condition number may as well be infinite in some cases.

To find out whether the matrix is really too ill-conditioned, invert it and check that $AA^{-1}=I$ to an acceptable precision. There is simply no hard cap on the condition number, only heuristics, which is why your references differ.
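A minimal sketch of that check in plain Python, restricted to $2\times2$ matrices so the inverse can be written in closed form. The helper names and both test matrices are made up for illustration, and what counts as "acceptable precision" is left to the application.

```python
def inverse_2x2(A):
    """Closed-form inverse of a 2x2 matrix (raises ZeroDivisionError if det == 0)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def identity_residual(A):
    """Largest entry of |A @ inv(A) - I|."""
    Ainv = inverse_2x2(A)
    worst = 0.0
    for i in range(2):
        for j in range(2):
            entry = sum(A[i][k] * Ainv[k][j] for k in range(2))
            target = 1.0 if i == j else 0.0
            worst = max(worst, abs(entry - target))
    return worst

well = [[2.0, 1.0], [1.0, 3.0]]           # well-conditioned
ill = [[1.0, 1.0], [1.0, 1.0 + 1e-10]]    # nearly dependent columns

print(identity_residual(well))
print(identity_residual(ill))
```

Comparing the two printed residuals against a tolerance suited to your problem gives a direct answer for the matrix at hand, which a fixed threshold on the condition number cannot.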