Square root of positive definite nonsymmetric matrix

I believe $I+N$ must indeed be "positive definite".

Apply induction on the dimension. We are given a nilpotent matrix $N'$, which by Schur decomposition we can assume is strictly upper triangular (an orthogonal change of basis changes neither the hypothesis nor the conclusion). As a block matrix, $N'=\begin{pmatrix}N&x\\0&0\end{pmatrix}$ with $N$ a square strictly upper triangular matrix and $x$ a column vector. By assumption $(I+N')^2+(I+N')^{2t}$ is SPD (SPD = symmetric positive definite; the superscript $2t$ means square, then transpose). We wish to show that $(I+N')+(I+N')^t$ is SPD.

Let $U=I+N$ and note $$\begin{pmatrix}U&x\\0&1\end{pmatrix}^2+\begin{pmatrix}U&x\\0&1\end{pmatrix}^{2t}=\begin{pmatrix}U^2+U^{2t}&(I+U)x\\x^t(I+U)^t&2\end{pmatrix}.$$
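
For a quick numerical sanity check of this block identity (my own, not part of the argument; the size and seed are arbitrary):

```python
import numpy as np

# Sanity check (mine, not part of the argument) of the block identity above,
# for a random strictly upper triangular N and column vector x.

rng = np.random.default_rng(0)
n = 4
N = np.triu(rng.standard_normal((n, n)), k=1)
U = np.eye(n) + N
x = rng.standard_normal((n, 1))

M = np.block([[U, x], [np.zeros((1, n)), np.ones((1, 1))]])
lhs = M @ M + (M @ M).T
rhs = np.block([
    [U @ U + (U @ U).T, (np.eye(n) + U) @ x],
    [x.T @ (np.eye(n) + U).T, 2 * np.ones((1, 1))],
])
assert np.allclose(lhs, rhs)
```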

By the Schur complement characterization of positive definiteness (a symmetric block matrix $\begin{pmatrix}A&b\\b^t&c\end{pmatrix}$ with $A$ SPD is SPD if and only if $c-b^tA^{-1}b>0$), we have by assumption:

  • $U^2+U^{2t}$ is SPD
  • $2-x^t(I+U)^t(U^2+U^{2t})^{-1}(I+U)x>0$

And we need to show:

  • $U+U^t$ is SPD
  • $2-x^t(U+U^t)^{-1}x>0.$

$U+U^t$ is SPD by induction. The latter inequality would follow from $$x^t(I+U)^t(U^2+U^{2t})^{-1}(I+U)x\geq x^t(U+U^t)^{-1}x.$$ Letting $y=(I+U)x,$ this is $$y^t(U^2+U^{2t})^{-1}y\geq y^t(I+U^t)^{-1}(U+U^t)^{-1}(I+U)^{-1}y.$$

This would follow from the fact that inversion reverses the ordering of SPD matrices (if $B$ and $A-B$ are SPD then so is $B^{-1}-A^{-1}$), together with symmetric positive definiteness of $$(I+U)(U+U^t)(I+U^t)-(U^2+U^{2t})=U+U^t+2UU^t+U(U+U^t)U^t.$$ Since $U+U^t$ is SPD, so is $U(U+U^t)U^t,$ and $UU^t$ is SPD because $U=I+N$ is invertible. So we are done.
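
For what it is worth, the displayed identity and the resulting positive definiteness are easy to spot-check numerically; here is a small sketch of mine (seeds and sizes arbitrary), which tests positive definiteness only when $U+U^t$ happens to be SPD, as the induction hypothesis grants:

```python
import numpy as np

# Spot check (mine, seeds and sizes arbitrary) of the identity
#   (I+U)(U+U^t)(I+U^t) - (U^2 + U^{2t}) = U + U^t + 2 U U^t + U (U+U^t) U^t
# for random unipotent U = I + N, and of its positive definiteness in the
# cases where U + U^t happens to be SPD.

rng = np.random.default_rng(1)
for _ in range(100):
    n = int(rng.integers(2, 6))
    N = np.triu(rng.standard_normal((n, n)), k=1)   # strictly upper triangular
    U = np.eye(n) + N
    I = np.eye(n)

    diff = (I + U) @ (U + U.T) @ (I + U.T) - (U @ U + (U @ U).T)
    rhs = U + U.T + 2 * U @ U.T + U @ (U + U.T) @ U.T
    assert np.allclose(diff, rhs)

    if np.all(np.linalg.eigvalsh(U + U.T) > 0):      # induction hypothesis
        assert np.all(np.linalg.eigvalsh(diff) > 0)  # the difference is SPD
```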


See loup blanc's answer for a generalization of the above argument from unipotent matrices to all matrices with positive spectrum.


Here is a reference with a nice proof: Uniqueness of matrix square roots and an application by Charles R. Johnson, Kazuyoshi Okubo, and Robert Reams (Theorem 7). It uses the following theorem:

Theorem (Lyapunov). Let $A\in\mathbb C^{n\times n}$ (not necessarily Hermitian) and let $X\in\mathbb C^{n\times n}$ be Hermitian. If the eigenvalues of $A$ all have positive real part and $AX+XA^*$ is positive definite, then $X$ is positive definite.

In particular if $A$ has eigenvalues with positive real part and $A(A+A^*)+(A+A^*)A^*=A^2+(A^*)^2+2AA^*$ is positive definite, then $A+A^*$ is positive definite.
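
For a quick numerical illustration of the theorem (my own sketch, with real matrices for simplicity; the diagonal shift is just an arbitrary way to push the spectrum into the right half plane), one can solve $AX+XA^*=W$ with scipy.linalg.solve_continuous_lyapunov and check that the symmetric solution is positive definite:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Illustration (mine, not from the paper) of Lyapunov's theorem: if all
# eigenvalues of A have positive real part and A X + X A^* is positive
# definite, then the Hermitian solution X is positive definite.  Real
# matrices and the diagonal shift are my own arbitrary choices.

rng = np.random.default_rng(2)
n = 5
A = rng.standard_normal((n, n))
A = A + (1 + np.abs(np.linalg.eigvals(A).real).max()) * np.eye(n)
assert np.all(np.linalg.eigvals(A).real > 0)

C = rng.standard_normal((n, n))
W = C @ C.T + np.eye(n)                      # symmetric positive definite

X = solve_continuous_lyapunov(A, W)          # solves A X + X A^T = W
assert np.allclose(A @ X + X @ A.T, W)
assert np.allclose(X, X.T)                   # X is symmetric...
assert np.all(np.linalg.eigvalsh(X) > 0)     # ...and positive definite
```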

Proof 1 (sketch) following Horn and Johnson's Topics in Matrix Analysis:

Suppose not. We can take a kind of Jordan normal form, but making any above-diagonal $1$'s arbitrarily small; since the diagonal entries (the eigenvalues of $A$) all have positive real part, this makes $S^{-1}AS+(S^{-1}AS)^*$ positive definite for some non-singular $S.$ Setting $G=SS^*$ we find that $AG+GA^*=S\bigl(S^{-1}AS+(S^{-1}AS)^*\bigr)S^*$ is positive definite. For $0\leq \theta\leq 1$ define $X_{\theta}=\theta G+(1-\theta)X.$ Note:

  • $X_0=X$ is not positive definite
  • $X_1=G$ is positive definite
  • $AX_\theta+X_\theta A^*$ is a convex combination of positive definite matrices so must be positive definite.

All the matrices $X_\theta$ are Hermitian, hence have real eigenvalues, and by continuity of the smallest eigenvalue some $X_\theta$ must have $0$ as an eigenvalue: $X_\theta v=0$ for some non-zero $v.$ This implies $v^*(AX_\theta+X_\theta A^*)v=0,$ contradicting positive definiteness of $AX_\theta+X_\theta A^*.$
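
Here is a tiny concrete instance of the scaling trick (my own toy example, with a single $2\times2$ Jordan block and $\epsilon=0.1$):

```python
import numpy as np

# Toy instance (mine) of the scaling step: for the Jordan block
# J = [[1, 1], [0, 1]] and S = diag(1, eps), the conjugate S^{-1} J S has
# off-diagonal entry eps, so its Hermitian part is positive definite for
# small eps, and hence J G + G J^T is positive definite with G = S S^T.

eps = 0.1
J = np.array([[1.0, 1.0], [0.0, 1.0]])
S = np.diag([1.0, eps])

B = np.linalg.inv(S) @ J @ S                          # [[1, eps], [0, 1]]
assert np.all(np.linalg.eigvalsh(B + B.T) > 0)

G = S @ S.T
assert np.all(np.linalg.eigvalsh(J @ G + G @ J.T) > 0)
```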

Proof 2.

Consider the function defined by $f(W)=\int_0^\infty e^{-tA}We^{-tA^*}dt$ (the integral converges because the eigenvalues of $A$ have positive real part, so $e^{-tA}\to0$ exponentially). We can compute

\begin{align*} W&=-\frac d{d\tau}\Bigr|_{\tau=0}\int_{\tau}^\infty e^{-tA}We^{-tA^*}dt\\ &=-\frac{d}{d\tau}\Bigr|_{\tau=0}\int_0^\infty e^{-\tau A}e^{-tA}We^{-tA^*}e^{-\tau A^*}dt\\ &=Af(W)+f(W)A^*. \end{align*}

This means the map $X\mapsto AX+XA^*$ is a left inverse of $f.$ Since $f$ is an $\mathbb R$-linear map from the space of Hermitian matrices to itself, and has a left inverse, it must be an isomorphism. So the solution $X$ of $AX+XA^*=W$ is unique and equals $f(W).$ And if $W$ is positive definite, then so is the integrand $e^{-tA}We^{-tA^*}=(e^{-tA}W^{1/2})(e^{-tA}W^{1/2})^*,$ so $f(W)$ is positive definite by construction. This proves that if $AX+XA^*$ is positive definite then so is $X.$
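
A short numerical check of this construction (my own sketch: real matrices, an arbitrary diagonal shift to force the spectrum into the right half plane, and the integral truncated at $T=50$, which is harmless since the integrand decays exponentially):

```python
import numpy as np
from scipy.integrate import quad_vec
from scipy.linalg import expm

# Numerical check (mine) of Proof 2: approximate
#     f(W) = integral_0^infty exp(-tA) W exp(-tA^*) dt
# by quadrature on [0, 50], then verify A f(W) + f(W) A^* = W and that
# f(W) is positive definite when W is.

rng = np.random.default_rng(3)
n = 4
A = rng.standard_normal((n, n))
A = A + (1 + np.abs(np.linalg.eigvals(A).real).max()) * np.eye(n)
assert np.all(np.linalg.eigvals(A).real > 0)

C = rng.standard_normal((n, n))
W = C @ C.T + np.eye(n)                               # symmetric positive definite

F, _ = quad_vec(lambda t: expm(-t * A) @ W @ expm(-t * A.T), 0.0, 50.0)

assert np.allclose(A @ F + F @ A.T, W)                # A f(W) + f(W) A^T = W
assert np.all(np.linalg.eigvalsh((F + F.T) / 2) > 0)  # f(W) is positive definite
```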


A very partial answer.

Proposition 1. Let $A$ be a P.D. matrix. Then there is a unique $B$ s.t. $B^2=A$ and every eigenvalue $\lambda$ of $B$ satisfies $Re(\lambda)>0$; moreover, if $A$ admits a P.D. square root, then necessarily it is $B$.

Proof. The key point is: if $U\in M_n$ is P.D., then every eigenvalue $\mu$ of $U$ satisfies $Re(\mu)>0$ (beware, the converse is false!).

In particular, our $A$ has no eigenvalues in $(-\infty,0]$ and, therefore, admits a unique square root $B$ s.t. every eigenvalue $\lambda$ of $B$ satisfies $Re(\lambda)>0$, its principal square root (cf. Higham, Functions of Matrices). Thus $B$ is the only candidate that can be P.D.
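
If I am not mistaken, scipy.linalg.sqrtm computes exactly this principal square root, so Proposition 1 is easy to illustrate numerically (the diagonal shift below is just an arbitrary way of producing a P.D. matrix; size and seed are mine):

```python
import numpy as np
from scipy.linalg import sqrtm

# Illustration (mine) of Proposition 1: for a real P.D. matrix A (here
# obtained by an arbitrary diagonal shift, so that A + A^T is SPD),
# scipy.linalg.sqrtm returns the principal square root B, i.e. the
# unique square root whose spectrum lies in the right half plane.

rng = np.random.default_rng(4)
n = 5
A = rng.standard_normal((n, n))
A = A + (1 + np.abs(np.linalg.eigvalsh((A + A.T) / 2)).max()) * np.eye(n)
assert np.all(np.linalg.eigvalsh(A + A.T) > 0)   # A is P.D.

B = sqrtm(A)
assert np.allclose(B @ B, A)
assert np.all(np.linalg.eigvals(B).real > 0)     # eigenvalues of B: Re > 0
```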

Remark. That does not imply (despite Mathworld's article) that $A$ admits a P.D. square root.

EDIT 1. @Dap gave a very pretty proof. I had thought of attempting such an induction, but I was sure it would not work; which just shows that, in mathematics, you have to believe!

Moreover, using Dap's proof (mutatis mutandis), we can prove the following improvement:

Proposition 2. Let $A\in M_n(\mathbb{R})$ be a P.D. matrix that satisfies $\operatorname{spectrum}(A)\subset (0,+\infty)$. Then its principal square root (cf. Proposition 1) is P.D.

Proof. Note that $B$ (the principal square root of $A$) has $>0$ eigenvalues and that it is triangularizable over $\mathbb{R}$ via an orthogonal change of basis (real Schur decomposition), which preserves both the hypothesis and the conclusion.

Let $N'=\begin{pmatrix}N&x\\0&\alpha\end{pmatrix}$ (where $\alpha>0$) be the matrix $B$ after triangularization. We follow Dap's proof and induct on the dimension.

By the Schur complement characterization, we know that $$N^2+N^{2T}>0,\qquad 2\alpha^2-x^T(N+\alpha I)^T(N^2+N^{2T})^{-1}(N+\alpha I)x>0,$$ and we want to show that $$N+N^T>0,\qquad 2\alpha-x^T(N+N^T)^{-1}x>0.$$ The first condition holds by induction.

It suffices to show that

$\Delta=x^T(N+\alpha I)^T(N^2+N^{2T})^{-1}(N+\alpha I)x- \alpha\,x^T(N+N^T)^{-1}x$ is non-negative: then $\alpha\,x^T(N+N^T)^{-1}x\leq x^T(N+\alpha I)^T(N^2+N^{2T})^{-1}(N+\alpha I)x<2\alpha^2$, hence $x^T(N+N^T)^{-1}x<2\alpha$.

Setting $y=(N+\alpha I)x$ and arguing as in Dap's proof, $\Delta\geq0$ follows from $$(N+\alpha I)(N+N^T)(N+\alpha I)^T-\alpha(N^2+N^{2T})=N(N+N^T)N^T+2\alpha NN^T+\alpha^2(N+N^T),$$ which is symmetric positive definite because $N+N^T>0$. So we are done.
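
The displayed identity is purely algebraic (it holds for any square $N$ and any scalar $\alpha$), so it is easy to spot-check numerically; here is a small sketch of mine:

```python
import numpy as np

# Spot check (mine) of the identity
#   (N + aI)(N + N^T)(N + aI)^T - a(N^2 + (N^2)^T)
#     = N(N + N^T)N^T + 2a N N^T + a^2 (N + N^T),
# which is purely algebraic, so random N and a will do.

rng = np.random.default_rng(5)
for _ in range(100):
    n = int(rng.integers(2, 6))
    N = rng.standard_normal((n, n))
    a = float(rng.uniform(0.1, 3.0))
    I = np.eye(n)

    lhs = (N + a * I) @ (N + N.T) @ (N + a * I).T - a * (N @ N + (N @ N).T)
    rhs = N @ (N + N.T) @ N.T + 2 * a * N @ N.T + a ** 2 * (N + N.T)
    assert np.allclose(lhs, rhs)
```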

Remark 1. It remains to study the case when a P.D. matrix $A$ has non-real eigenvalues with positive real part.

Remark 2. Of course, Dap deserves the bounty.

EDIT 2. I just read the article ([1] by Johnson et al.) cited by @Dap. Ewan will be happy; the result given by Mathworld is true (which surprises me).

If $A\in M_n(\mathbb{C})$, let $F(A)=\{x^*Ax : x\in \mathbb{C}^n,\ \|x\|=1\}$ be its numerical range (field of values). Note that $A$ is P.D. iff $F(A)\subset \{z : Re(z)>0\}$ iff $A+A^*$ is H.P.D.
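
A small numerical sketch of the first equivalence (my own; size and seed arbitrary): since $Re(x^*Ax)=x^*\frac{A+A^*}{2}x$, the real part of $F(A)$ is bounded below by $\lambda_{\min}\bigl(\frac{A+A^*}{2}\bigr)$, with equality at the corresponding eigenvector.

```python
import numpy as np

# Sketch (mine): Re(x^* A x) = x^* H x with H = (A + A^*)/2, so over unit
# vectors it is bounded below by the smallest eigenvalue of H, with
# equality at the corresponding eigenvector.

rng = np.random.default_rng(6)
n = 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = (A + A.conj().T) / 2

lam, V = np.linalg.eigh(H)          # ascending eigenvalues of H
v = V[:, 0]
assert np.isclose((v.conj() @ A @ v).real, lam[0])

for _ in range(1000):
    x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    x /= np.linalg.norm(x)
    assert (x.conj() @ A @ x).real >= lam[0] - 1e-12
```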

[1] Theorem 7 (Kato, Masser, Neumann). If $A\in M_n(\mathbb{C})$ is s.t. $F(A)\cap (-\infty,0]=\emptyset$ (which is the case when $A$ is P.D.), then its principal square root is P.D.

[1] Corollary 8 (Johnson et al.). If $A\in M_n(\mathbb{R})$ is s.t. $F(A)\cap (-\infty,0]=\emptyset$ (which is the case when $A$ is P.D. in the sense that $x^TAx>0$ for every $x\in \mathbb{R}^n\setminus\{0\}$), then its principal square root (which is a real matrix) is P.D.
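
Finally, a numerical illustration of Corollary 8 (my own sketch; the diagonal shift is just an arbitrary way to make $A+A^T$ SPD): the principal square root of a random real P.D. matrix comes out real and again P.D.

```python
import numpy as np
from scipy.linalg import sqrtm

# Illustration (mine) of Corollary 8: for random real A with A + A^T SPD
# (forced by an arbitrary diagonal shift), the principal square root B
# is (numerically) real and again P.D., i.e. B + B^T is SPD.

rng = np.random.default_rng(7)
for _ in range(100):
    n = int(rng.integers(2, 7))
    A = rng.standard_normal((n, n))
    A = A + (1 + np.abs(np.linalg.eigvalsh((A + A.T) / 2)).max()) * np.eye(n)
    assert np.all(np.linalg.eigvalsh(A + A.T) > 0)   # A is P.D.

    B = sqrtm(A)
    assert np.allclose(B @ B, A)
    assert np.allclose(np.imag(B), 0)                # B is (numerically) real
    B = np.real(B)
    assert np.all(np.linalg.eigvalsh(B + B.T) > 0)   # B is P.D.
```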