Why is uncomputability of the spectral decomposition not a problem?

The singular value decomposition, when applied to a real symmetric matrix $A = \sum_i \lambda_i(A) u_i(A) u_i(A)^T$, computes a stable mathematical object (spectral measure $\mu_A = \sum_i \delta_{\lambda_i(A)} u_i(A) u_i(A)^T$, which is a projection-valued measure) using a partially unstable coordinate system (the eigenvalues $\lambda_i(A)$ and eigenvectors $u_i(A)$; the eigenvalues are stable, but the eigenvectors are not). The numerical instability of the latter reflects the coordinate singularities of this coordinate system, but does not contradict the stability of the former. But in numerical computations we have to use the latter rather than the former, because standard computer languages have built-in data representations for numbers and vectors, but usually do not have built-in data representations for projection-valued measures.

An analogy is with floating-point arithmetic. The operation of multiplication of two floating point numbers (expressed in binary $x = \sum_i a_i(x) 2^{-i}$ or decimal $x = \sum_i b_i(x) 10^{-i}$) is a stable (i.e., continuous) operation on the abstract real numbers ${\bf R}$, but when viewed in a binary or decimal representation system becomes "uncomputable". For instance, the square of $1.414213\dots$ could be either $1.99999\dots$ or $2.0000\dots$, depending on exactly what is going on in the $\dots$; hence questions such as "what is the first digit of the square of $1.414213\dots$" are uncomputable. But this is an artefact of the numeral representation system used and is not an indicator of any lack of stability or computability for any actual computational problem that involves the abstract real numbers (rather than an artificial problem that is sensitive to the choice of numeral representation used). In contrast, floating point division when the denominator is near zero is a true singularity; regardless of what numeral system one uses, this operation is genuinely discontinuous (in a dramatic fashion) on the abstract reals and generates actual instabilities that cannot be explained away as mere coordinate singularity artefacts.
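
To make the digit phenomenon concrete, here is a small Python sketch using exact rationals (the particular truncations of $\sqrt{2}$ below are my own choice, not from the text above): no finite number of digits of the input determines the first digit of the square.

```python
from fractions import Fraction

# Two exact rationals that agree with sqrt(2) (and with each other) in their
# first 17 digits, one lying just below sqrt(2) and one just above it.
x_low  = Fraction("1.41421356237309504")
x_high = Fraction("1.41421356237309505")

print(x_low**2 < 2 < x_high**2)   # True: the squares straddle 2, so one square
                                  # starts 1.999... and the other 2.000...
```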

Returning to matrices: whereas the individual eigenvectors $u_i(A)$ of a real symmetric matrix $A$ are not uniquely defined (there is a choice of sign for $u_i(A)$, even when there are no repeated eigenvalues) or continuously dependent on $A$, the spectral measure $\mu_A := \sum_i \delta_{\lambda_i(A)} u_i(A) u_i(A)^T$ is unambiguous; it is the unique projection-valued measure for which one has the functional calculus $$ f(A) = \int_{\bf R} f(E)\ d\mu_A(E)$$ for any polynomial $f$ (or indeed for any continuous function $f \colon {\bf R} \to {\bf R}$). The spectral measure $\mu_A$ depends continuously on $A$ in the vague topology; indeed one has the inequality $$ \| f(A) - f(B) \|_F \leq \|f\|_\text{Lip} \|A-B\|_F$$ for any real symmetric $A,B$ and any Lipschitz $f$, where $\|\cdot\|_F$ denotes the Frobenius norm (also known as the Hilbert-Schmidt norm or 2-Schatten norm). This allows for the possibility of stable computation of this measure, and indeed standard algorithms such as tridiagonalisation methods using (for instance) the QR factorisation and Householder reflections do allow one to compute this measure in a numerically stable fashion (e.g., small roundoff errors only lead to small variations in any test $\int_{\bf R} f(E)\ d\mu_A(E)$ of the spectral measure $\mu_A$ against a given test function $f$), although actually demonstrating this stability rigorously for a given numerical SVD algorithm does require a non-trivial amount of effort.
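
A quick numerical check of the Lipschitz inequality above, with the functional calculus computed through `numpy.linalg.eigh` (the choice $f = \tanh$, which has $\|f\|_\text{Lip} = 1$, and the random test matrices are just for illustration):

```python
import numpy as np

def apply_func(f, A):
    """Functional calculus f(A) for a real symmetric A via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.T   # V diag(f(w)) V^T

rng = np.random.default_rng(1)
for _ in range(5):
    X = rng.standard_normal((20, 20)); A = (X + X.T) / 2
    E = rng.standard_normal((20, 20)); B = A + 1e-6 * (E + E.T) / 2
    lhs = np.linalg.norm(apply_func(np.tanh, A) - apply_func(np.tanh, B))  # Frobenius norm
    rhs = 1.0 * np.linalg.norm(A - B)   # ||tanh||_Lip = 1
    print(lhs <= rhs + 1e-12)           # the inequality holds (up to roundoff)
```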

The practical upshot of this is that if one uses a numerically stable SVD algorithm to compute a quantity that can be expressed as a numerically stable function of the spectral measure (e.g., the inverse $A^{-1}$, assuming that the spectrum is bounded away from zero), then the computation will be stable, despite the fact that the representation of this spectral measure in eigenvalue/eigenvector form may contain coordinate instabilities. In examples involving eigenvalue collision such as the one you provided in your post, the eigenvectors can change dramatically (while the eigenvalues remain stable), but when the time comes to apply the SVD to compute a stable quantity such as the inverse $A^{-1}$, these dramatic changes "miraculously" cancel each other out and the algorithm becomes numerically stable again. (This is analogous to how a stable floating point arithmetic computation (avoiding division by very small denominators) applied to an input $x = 1.99999\dots$ and an input $x' = 2.00000\dots$ will lead to outcomes that are very close to each other (as abstract real numbers), even though all the digits in the representations of $x$ and $x'$ are completely different; the changes in digits "cancel each other out" at the end of the day.)
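
Here is a small NumPy illustration of this cancellation (a pair of nearly coincident $2 \times 2$ symmetric matrices of my own choosing, not the example from the post): the eigenvectors of the two matrices differ by a $45^\circ$ rotation, yet the inverses computed through the eigendecomposition agree to roughly the same accuracy as the matrices themselves.

```python
import numpy as np

d = 1e-10
A = np.diag([2.0 + d, 2.0 - d])      # eigenvectors: the coordinate axes
B = np.array([[2.0, d], [d, 2.0]])   # eigenvalues also 2 +/- d, but the
                                     # eigenvectors sit at 45 degrees
print(np.linalg.norm(A - B))         # the matrices are ~2e-10 apart

wA, VA = np.linalg.eigh(A)
wB, VB = np.linalg.eigh(B)
print(np.abs(VA - VB).max())         # eigenvectors differ by O(1)

# Inverse computed through the eigendecomposition: the O(1) changes cancel.
invA = (VA / wA) @ VA.T
invB = (VB / wB) @ VB.T
print(np.linalg.norm(invA - invB))   # ~1e-10, comparable to ||A - B||
```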

[The situation is a bit more interesting when applying the SVD to a non-symmetric matrix $A = \sum_i \sigma_i(A) u_i(A) v_i(A)^T$. Now one gets two spectral measures, $\mu_{(A^* A)^{1/2}} = \sum_i \delta_{\sigma_i(A)} v_i(A) v_i(A)^T$ and $\mu_{(AA^*)^{1/2}} = \sum_i \delta_{\sigma_i(A)} u_i(A) u_i(A)^T$ which are numerically stable, but these don't capture the full strength of the SVD (for instance, they are not sufficient for computing $A^{-1}$). The non-projection-valued spectral measure $\mu_A = \sum_i \delta_{\sigma_i(A)} u_i(A) v_i(A)^T$ does capture the full SVD in this case, but is only stable using the vague topology on the open half-line $(0,+\infty)$, that is to say $\int_0^\infty f(E)\ d\mu_A(E)$ varies continuously with $A$ as long as $f$ is a test function compactly supported in $(0,+\infty)$, but is unstable if tested by functions that do not vanish at the origin. This is ultimately due to a genuine singularity in the polar decomposition of a non-selfadjoint matrix when the matrix becomes singular, which in one dimension is simply the familiar singularity in the polar decomposition of a complex number near the origin.]
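
The polar-decomposition singularity can be seen directly with `scipy.linalg.polar` (the diagonal $2 \times 2$ example below is my own): as the matrix crosses a singular one, the orthogonal factor jumps by a reflection while the positive part $(A^*A)^{1/2}$ does not move at all.

```python
import numpy as np
from scipy.linalg import polar

eps = 1e-12
A_plus  = np.diag([ eps, 1.0])
A_minus = np.diag([-eps, 1.0])            # only 2e-12 away from A_plus

U_plus,  P_plus  = polar(A_plus)          # A = U P with P positive semidefinite
U_minus, P_minus = polar(A_minus)

print(U_plus)                             # [[ 1, 0], [0, 1]]
print(U_minus)                            # [[-1, 0], [0, 1]]: the orthogonal factor
                                          # jumps by a reflection
print(np.linalg.norm(P_plus - P_minus))   # 0.0: the positive part is unchanged
```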


The SVD falls under the family of phenomena where discontinuity implies non-computability. (Intuitively, this is because infinite precision is required at the point of discontinuity.)

In this particular case we speak of the (dis)continuity of a multivalued function which takes a matrix to any of its decompositions, or better, the non-existence of a realizer for the statement "For every matrix $M$ there exist suitable $U$, $\Sigma$, $V$ yielding an SVD of $M$." I believe this statement has no continuous realizer in function realizability, and hence no computable one either.

Some other examples of this phenomenon are:

  • The sign function $\mathrm{sgn} : \mathbb{R} \to \{-1,0,1\}$ is discontinuous, therefore non-computable. In fact, every computable map $\mathbb{R} \to \{0,1\}$ is constant.
  • The rank of a matrix is non-computable (see the sketch after this list).
  • Gaussian elimination (as taught in school) cannot be performed because testing for zero is non-computable.
  • The number of distinct zeroes of a polynomial is non-computable.
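
For instance, here is how the rank item manifests in floating point (a NumPy sketch relying on `matrix_rank`'s default tolerance): two matrices within $10^{-15}$ of each other are assigned different ranks, because any finite-precision rank computation must pick a cutoff below which singular values count as zero.

```python
import numpy as np

A = np.diag([1.0, 0.0])            # rank 1
B = np.diag([1.0, 1e-15])          # differs from A by 1e-15, but has rank 2

# With NumPy's default tolerance these two nearby matrices get different ranks.
print(np.linalg.matrix_rank(A))    # 1
print(np.linalg.matrix_rank(B))    # 2
```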

So are these, along with your observation, a problem? There are several answers, depending on the context.

In floating-point numerics, all computations are done with a fixed finite precision and numerical errors are simply unavoidable. In this setting the non-computability manifests itself as numerical instability. In your case, we might simply compute the wrong decomposition.

In some situations we can restrict to computation in a subring of $\mathbb{R}$ in which the problem disappears. For example, many of the above problems are non-existent when we restrict to $\mathbb{Q}$ or the algebraic numbers.
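
As a sketch of this, here is an exact rank computation over $\mathbb{Q}$ using Python's `Fraction` type (a toy routine of my own, not a library function); the zero tests in the elimination are legitimate because equality of rationals is decidable.

```python
from fractions import Fraction

def rank_exact(rows):
    """Rank of a matrix with rational entries (ints or Fraction strings),
    via Gaussian elimination with exact zero tests: no tolerances needed over Q."""
    M = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(M), len(M[0])
    rank, col = 0, 0
    while rank < n_rows and col < n_cols:
        # Find a pivot: exact comparison with zero is decidable for rationals.
        pivot = next((r for r in range(rank, n_rows) if M[r][col] != 0), None)
        if pivot is None:
            col += 1
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        for r in range(rank + 1, n_rows):
            factor = M[r][col] / M[rank][col]
            M[r] = [a - factor * b for a, b in zip(M[r], M[rank])]
        rank += 1
        col += 1
    return rank

print(rank_exact([[1, 2], [2, 4]]))          # 1, exactly
print(rank_exact([["1/3", "2/3"], [1, 2]]))  # 1: Fraction("1/3") keeps it exact
```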

In exact real arithmetic there are no numerical errors, as precision always adapts automatically to achieve the desired result. In this setting non-computability really is non-computability. The algorithm will diverge at points of discontinuity. In your case, it will just run forever trying to determine in which of the two cases it is.
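
A toy model of this divergence (my own sketch, not an actual exact-real-arithmetic library): represent a real number as a function returning a rational approximation within $2^{-n}$; a sign test keeps refining the precision and never terminates when the input is exactly zero.

```python
from fractions import Fraction

# Toy representation: a real x is a function n -> rational within 2**-n of x.
def sign(x):
    n = 0
    while True:
        a, err = x(n), Fraction(1, 2**n)
        if a > err:    # x is provably positive
            return 1
        if a < -err:   # x is provably negative
            return -1
        n += 1         # not enough precision yet; for x == 0 this never ends

exactly_zero  = lambda n: Fraction(0)          # a representative of 0
tiny_positive = lambda n: Fraction(1, 10**40)  # a representative of 10**-40

print(sign(tiny_positive))   # returns 1, after ~133 refinement steps
# print(sign(exactly_zero))  # would loop forever
```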

There are models of real-number computation that pretend we can perform exact zero-testing, notably the Blum-Shub-Smale model. They are often used in computational geometry to side-step questions about non-computability. There are various theorems guaranteeing that a small perturbation of the input can get us out of trouble, at the price of possibly computing the wrong result.


This is primarily an issue of backwards vs. forwards stability. Good SVD algorithms are backwards stable in the sense that the computed singular values and singular vectors are the true singular values and singular vectors of a slightly perturbed problem. You may see this by noting that while $P$ may change drastically as you change $\epsilon$, the product $PDP^T$ changes negligibly.

The SVD is not forwards stable when the singular values have small spectral gap, as your example demonstrates and other answers here discuss in more detail.
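
A small NumPy demonstration of this distinction (the pair of matrices is my own, chosen to have nearly coincident singular values): the computed singular vectors of two nearby matrices differ by $O(1)$, yet each computed factorisation reproduces its own input matrix to roughly machine precision.

```python
import numpy as np

d = 1e-10
A = np.diag([1.0 + d, 1.0 - d])
B = np.array([[1.0, d], [d, 1.0]])    # within ~2e-10 of A

UA, sA, VtA = np.linalg.svd(A)
UB, sB, VtB = np.linalg.svd(B)

print(np.abs(sA - sB).max())          # singular values: stable, ~1e-16
print(np.abs(UA - UB).max())          # singular vectors: O(1) change (forward instability)

# Backward stability: each computed factorisation reproduces its own input.
print(np.linalg.norm(UA @ np.diag(sA) @ VtA - A))   # ~machine epsilon
print(np.linalg.norm(UB @ np.diag(sB) @ VtB - B))   # ~machine epsilon
```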

For more on backwards and forwards stability, see, e.g., this post and the links therein: https://math.stackexchange.com/a/78907/3060

SciPy uses LAPACK; some details on the stability of the algorithm are provided here: https://www.netlib.org/lapack/lug/node97.html
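
For reference, a minimal SciPy call showing where the choice of LAPACK routine enters: the `lapack_driver` keyword of `scipy.linalg.svd` selects between the divide-and-conquer driver `'gesdd'` (the default) and the QR-based `'gesvd'`.

```python
import numpy as np
from scipy.linalg import svd

A = np.random.default_rng(0).standard_normal((4, 3))
# 'gesdd' (default) is faster; 'gesvd' is the classical, sometimes more robust driver.
U, s, Vt = svd(A, lapack_driver='gesvd')
```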