Analytical formula for numerical derivative of the matrix pseudo-inverse?

The answer has been known since at least 1973: a formula for the derivative of the pseudo-inverse of a matrix $A(x)$ of constant rank can be found in

G. H. Golub and V. Pereyra, "The Differentiation of Pseudo-Inverses and Nonlinear Least Squares Problems Whose Variables Separate," SIAM Journal on Numerical Analysis, Vol. 10, No. 2 (Apr. 1973), pp. 413-432.

References 29 and 30 in the above paper (both by P. A. Wedin) contain an earlier formula that can also be used to obtain the same result.

The case of non-constant rank is simple: the pseudo-inverse is not continuous at a point where the rank changes, so it has no derivative there (see Corollary 3.5 in G. W. Stewart, "On the Perturbation of Pseudo-Inverses, Projections and Linear Least Squares Problems," SIAM Review, Vol. 19, No. 4 (Oct. 1977), pp. 634-662).
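To make the discontinuity concrete, here is a minimal NumPy sketch. The family $A(x) = \mathrm{diag}(1, x)$ and the helper name `pinv_of_diag` are my own illustrative choices, not taken from the cited papers. The rank drops from 2 to 1 at $x = 0$; the lower-right entry of $A^+(x)$ is $1/x$ for $x \neq 0$ but $0$ at $x = 0$, so it cannot be continuous there.

```python
import numpy as np

def pinv_of_diag(x):
    """Pseudo-inverse of diag(1, x); the rank drops from 2 to 1 at x = 0."""
    return np.linalg.pinv(np.array([[1.0, 0.0], [0.0, x]]))

# The (1, 1) entry grows like 1/x as x -> 0, then jumps to 0 at x = 0 itself.
for x in (1e-2, 1e-4, 1e-6, 0.0):
    print(x, pinv_of_diag(x)[1, 1])
```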

Here is the formula for a matrix of constant rank (equation (4.12) in the Golub and Pereyra paper):

$$ \frac{\mathrm d}{\mathrm d x} A^+(x) = -A^+ \left( \frac{\mathrm d}{\mathrm d x} A \right) A^+ + A^+ (A^+)^T \left( \frac{\mathrm d}{\mathrm d x} A^T \right) (I - A A^+) + (I - A^+ A) \left( \frac{\mathrm d}{\mathrm d x} A^T \right) (A^+)^T A^+ $$

(for a real matrix; $I$ denotes the identity matrix of the appropriate size).

For complex matrices, the above formula works if Hermitian conjugates are used instead of transposes. I don't have a reference for this (anyone?), but it held in all the numerical tests I ran (with matrices of various shapes and ranks).
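For anyone who wants to repeat this kind of numerical test, here is a minimal NumPy sketch. It is not the code from my original tests: the function names, the rank-2 test family `demo_A`, and the explicit `rcond=1e-10` cutoff are assumptions made for illustration. It implements the formula with conjugate transposes (which reduce to plain transposes for real input) and compares it with a central finite difference.

```python
import numpy as np

def pinv_derivative(A, dA, rcond=1e-10):
    """Derivative of pinv(A) in the direction dA, assuming locally constant rank.

    Conjugate transposes are used, so real and complex inputs share one code path.
    """
    Ap = np.linalg.pinv(A, rcond=rcond)
    ApH = Ap.conj().T
    dAH = dA.conj().T
    I_m = np.eye(A.shape[0])
    I_n = np.eye(A.shape[1])
    return (-Ap @ dA @ Ap
            + Ap @ ApH @ dAH @ (I_m - A @ Ap)
            + (I_n - Ap @ A) @ dAH @ ApH @ Ap)

# Hypothetical test family: A(x) = B(x) C(x) is 4x3 with rank 2 for every real x,
# because B(x) has full column rank and C(x) has full row rank.
def demo_A(x):
    B = np.array([[1.0, x], [0.0, 1.0], [x, 2.0], [1.0, 1.0]])
    C = np.array([[1.0, 0.0, x], [0.0, 1.0, 1.0]])
    return B @ C

def demo_dA(x):
    B = np.array([[1.0, x], [0.0, 1.0], [x, 2.0], [1.0, 1.0]])
    C = np.array([[1.0, 0.0, x], [0.0, 1.0, 1.0]])
    dB = np.array([[0.0, 1.0], [0.0, 0.0], [1.0, 0.0], [0.0, 0.0]])
    dC = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 0.0]])
    return dB @ C + B @ dC  # product rule

def central_difference(x, h=1e-6, rcond=1e-10):
    """Finite-difference approximation of d/dx pinv(demo_A(x))."""
    plus = np.linalg.pinv(demo_A(x + h), rcond=rcond)
    minus = np.linalg.pinv(demo_A(x - h), rcond=rcond)
    return (plus - minus) / (2 * h)

x0 = 0.7
err = np.linalg.norm(pinv_derivative(demo_A(x0), demo_dA(x0)) - central_difference(x0))
print("Frobenius-norm error vs finite differences:", err)
```

The explicit `rcond` is there so that the numerically tiny third singular value of the rank-2 matrix is reliably treated as zero; with a different cutoff the finite-difference side can become unreliable.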


This is not a complete answer.

According to the Wikipedia page you linked, the pseudoinverse $A^+$ is not a continuous function of $A$: it jumps wherever the rank of $A$ changes, and it becomes very large when $A$ is ill-conditioned. Therefore, you can't expect $A^+(x)$ to always have a derivative in terms of the matrix derivative of $A(x)$.

I suppose it may be reasonable to ask for a formula that works when the trajectory of $A(x)$ is restricted to constant rank strata, but I don't know such a formula. If it exists, it should blow up as you approach a place where the rank jumps down.
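The expected blow-up is easy to see numerically. Below is a minimal NumPy sketch under my own assumptions: the family $A(x) = \begin{pmatrix} 1 & 1 \\ 1 & 1+x \end{pmatrix}$ and the helper names are hypothetical, and the derivative is estimated by a central finite difference rather than by any closed formula. This $A(x)$ has rank 2 for $x \neq 0$ and rank 1 at $x = 0$. For $x \neq 0$ the pseudo-inverse is the ordinary inverse $\frac{1}{x}\begin{pmatrix} 1+x & -1 \\ -1 & 1 \end{pmatrix}$, so the Frobenius norm of its derivative is $2/x^2$.

```python
import numpy as np

def A_of(x):
    """Hypothetical family: rank 2 for x != 0, rank 1 at x = 0."""
    return np.array([[1.0, 1.0], [1.0, 1.0 + x]])

def fd_pinv_derivative_norm(x, rel_step=1e-4):
    """Frobenius norm of a central-difference estimate of d/dx pinv(A_of(x))."""
    h = rel_step * abs(x)  # step scaled with x so we never cross x = 0
    diff = np.linalg.pinv(A_of(x + h)) - np.linalg.pinv(A_of(x - h))
    return np.linalg.norm(diff / (2 * h))

# The norm grows like 2 / x**2 as x approaches the rank drop at 0.
for x in (1e-1, 1e-2, 1e-3):
    print(x, fd_pinv_derivative_norm(x), 2 / x**2)
```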

This is a shot in the dark, but you could try working out a limit of Tikhonov regularizations, taking the derivative of $(A^\ast(x)A(x) + \epsilon I)^{-1}A^\ast(x)$ as $\epsilon \to 0^+$.
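As a rough numerical probe of this idea (not a proof), here is a NumPy sketch. Everything named in it is an assumption for illustration: the helper functions, the rank-2 test family `A_of`, and the `rcond=1e-10` cutoff. It uses the regularized inverse $R_\epsilon = (A^\ast A + \epsilon I)^{-1} A^\ast$ with $\epsilon > 0$, differentiates it by the product rule, $R_\epsilon' = (A^\ast A + \epsilon I)^{-1}\left( A'^\ast - (A'^\ast A + A^\ast A') R_\epsilon \right)$, and compares both with the pseudo-inverse and a finite-difference derivative of the pseudo-inverse.

```python
import numpy as np

def tikhonov_inverse(A, eps):
    """(A^H A + eps I)^{-1} A^H for eps > 0."""
    AH = A.conj().T
    return np.linalg.solve(AH @ A + eps * np.eye(A.shape[1]), AH)

def tikhonov_inverse_derivative(A, dA, eps):
    """Product-rule derivative of tikhonov_inverse in the direction dA."""
    AH, dAH = A.conj().T, dA.conj().T
    M = AH @ A + eps * np.eye(A.shape[1])
    R = np.linalg.solve(M, AH)
    dM = dAH @ A + AH @ dA
    return np.linalg.solve(M, dAH - dM @ R)

# Hypothetical constant-rank family: 4x3 matrices of rank 2 for every real x.
C = np.array([[1.0, 0.0, 0.7], [0.0, 1.0, 1.0]])

def A_of(x):
    B = np.array([[1.0, x], [0.0, 1.0], [x, 2.0], [1.0, 1.0]])
    return B @ C

def dA_of(x):
    dB = np.array([[0.0, 1.0], [0.0, 0.0], [1.0, 0.0], [0.0, 0.0]])
    return dB @ C

def pinv(M):
    # Explicit cutoff so the numerically tiny singular value is treated as zero.
    return np.linalg.pinv(M, rcond=1e-10)

x0, h = 0.7, 1e-6
fd = (pinv(A_of(x0 + h)) - pinv(A_of(x0 - h))) / (2 * h)

def gaps(eps):
    """Distance of R_eps and its derivative from pinv and its FD derivative."""
    value_gap = np.linalg.norm(tikhonov_inverse(A_of(x0), eps) - pinv(A_of(x0)))
    deriv_gap = np.linalg.norm(tikhonov_inverse_derivative(A_of(x0), dA_of(x0), eps) - fd)
    return value_gap, deriv_gap

for eps in (1e-2, 1e-4, 1e-6):
    print(eps, *gaps(eps))
```

On this constant-rank example both gaps shrink as $\epsilon$ decreases. That is only evidence for one test family, and very small $\epsilon$ makes the linear solves ill-conditioned, so I would not push $\epsilon$ much below what is shown here.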