Skew-symmetric, Time Dependent, Linear Ordinary Differential Equations

The iterative solution of the celebrated Magnus expansion for your equation $$\dot{\vec x}(t) = A(t) \vec x(t)$$ (equation (1) below) uses the Ansatz $$ \vec x(t) = e^{\Omega(t,t_0)} ~ \vec x(t_0), $$ so that $$ A(t)= \left(\frac{d}{dt}e^{\Omega}\right)e^{-\Omega} = \frac{e^{\mathrm{ad}_{\Omega}} - 1}{\mathrm{ad}_{\Omega}}\frac{d\Omega}{dt}~~, $$ where ${\mathrm{ad}_{\Omega}} B \equiv [\Omega, B] $; expanded, $$ A= \dot \Omega + [\Omega, \dot \Omega]/2! + [\Omega,[\Omega,\dot \Omega]]/3! ~ + ~ \cdots $$

From the antisymmetry of $A$, (2) below, it follows that $\Omega$ is antisymmetric as well: $$ A^T= e^{-\Omega^T}\frac{d}{dt}e^{\Omega^T} = -A = -\frac{d}{dt}e^{\Omega} ~e^{-\Omega} = e^{\Omega} \frac{d}{dt}e^{-\Omega} , $$ which is manifestly satisfied by $\Omega^T = -\Omega$.

You may reassure yourself that, indeed, the nested integrals of commutators involved in the solution for $\Omega$ in terms of $A$ yield antisymmetric results for each term of the iteration.
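
As a quick numerical sanity check of that claim, the following sketch (with a made-up $3 \times 3$ skew-symmetric generator $A(t)$ and crude Riemann sums, both purely illustrative) evaluates the first two Magnus terms $\Omega_1 = \int_0^T A(t_1)\,dt_1$ and $\Omega_2 = \tfrac{1}{2}\int_0^T\!dt_1\int_0^{t_1}\!dt_2\,[A(t_1), A(t_2)]$ and confirms each is antisymmetric to machine precision:

```python
import numpy as np

# Purely illustrative skew-symmetric generator (any A^T = -A will do).
def A(t):
    a, b, c = np.sin(t), np.cos(2 * t), t
    return np.array([[0.0,  a,   b ],
                     [ -a, 0.0,  c ],
                     [ -b,  -c, 0.0]])

def comm(P, Q):
    """Matrix commutator [P, Q]."""
    return P @ Q - Q @ P

# Crude midpoint Riemann sums for the first two Magnus terms on [0, T]:
#   Omega1 = int_0^T A(t1) dt1
#   Omega2 = (1/2) int_0^T dt1 int_0^{t1} dt2 [A(t1), A(t2)]
T, N = 1.0, 300
h = T / N
ts = np.linspace(0.0, T, N, endpoint=False) + h / 2
As = [A(t) for t in ts]

Omega1 = h * sum(As)
Omega2 = 0.5 * h * h * sum(comm(As[i], As[j])
                           for i in range(N) for j in range(i))

for name, W in (("Omega1", Omega1), ("Omega2", Omega2)):
    print(name, "antisymmetry defect:", np.abs(W + W.T).max())
```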


Let $X(t, t_0)$ be a fundamental solution matrix for the system

$\dot{\vec x}(t) = A(t) \vec x(t), \tag 1$

with

$A^T(t) = -A(t); \tag 2$

then $X(t, t_0)$ is a square matrix function of $t$ and we set

$n = \text{size} \; X(t, t_0) = \text{size} \; A(t); \tag 3$

the columns of $X(t, t_0)$ are $n \times 1$ matrices--"column vectors"--$\vec x(t)$ each of which satisfies (1); from this it easily follows that

$\dot X(t, t_0) = A(t) X(t, t_0), \tag 4$

since $A(t)$ acts on $X(t, t_0)$ column-by-column. We may transpose this equation and obtain

$\dot {X^T}(t, t_0) = X^T(t, t_0)A^T(t); \tag 5$

we next consider $X^T(t, t_0) X(t, t_0)$; we have

$\dfrac{d}{dt}(X^T(t, t_0) X(t, t_0)) = (X^T(t, t_0) X(t, t_0))'$ $= \dot {X^T}(t, t_0) X(t, t_0) + X^T(t, t_0) \dot X(t, t_0)$ $= X^T(t, t_0)A^T(t) X(t, t_0) + X^T(t, t_0)A(t)X(t, t_0) = X^T(t, t_0)(A^T(t) + A(t))X(t, t_0); \tag 6$

it is thus seen that in the event that (2) binds, so that

$A^T(t) + A(t) = 0, \tag 7$

then (6) implies

$\dfrac{d}{dt}(X^T(t, t_0) X(t, t_0)) = X^T(t, t_0)(A^T(t) + A(t))X(t, t_0)$ $= X^T(t, t_0)(0)X(t, t_0) = 0; \tag 8$

we infer from this that $X^T(t, t_0)X(t, t_0)$ is in fact a constant matrix on the interval $I$ over which the system is defined:

$X^T(t, t_0)X(t, t_0) = X^T(t_0, t_0)X(t_0, t_0), \; \forall t \in I; \tag 9$

now suppose

$X(t_0, t_0) = I, \tag{10}$

the columns of which are the $n$ standard basis vectors of size $n$:

$\vec x_1(t_0) = (1, 0, \ldots, 0)^T, \tag{11}$

$\vec x_2(t_0) = (0, 1, \ldots, 0)^T, \tag{12}$

$\vdots \tag{13}$

$\vec x_n(t_0) = (0, 0, \ldots, 1)^T; \tag{14}$

which may serve as initial conditions for $n$ linearly independent solutions of (1); then

$X^T(t, t_0)X(t, t_0) = I, \tag{15}$

that is, $X(t, t_0)$ is an orthogonal matrix for all $t \in I$.
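
A brief numerical illustration of (4), (10), and (15) (a sketch only; the skew generator and tolerances are arbitrary choices): integrate the matrix equation from the identity and measure the departure of $X^TX$ from $I$:

```python
import numpy as np
from scipy.integrate import solve_ivp

# The same illustrative skew-symmetric A(t) as in the earlier sketch.
def A(t):
    a, b, c = np.sin(t), np.cos(2 * t), t
    return np.array([[0.0,  a,   b ],
                     [ -a, 0.0,  c ],
                     [ -b,  -c, 0.0]])

n = 3

def rhs(t, y):
    # Equation (4): A(t) acts on X column by column.
    return (A(t) @ y.reshape(n, n)).ravel()

# Initial condition (10): X(t0, t0) = I, integrated out to t = 5.
sol = solve_ivp(rhs, (0.0, 5.0), np.eye(n).ravel(), rtol=1e-10, atol=1e-12)
X = sol.y[:, -1].reshape(n, n)

print("||X^T X - I|| =", np.linalg.norm(X.T @ X - np.eye(n)))  # equation (15)
```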

Conversely, given that (15) binds, we find upon differentiation with respect to $t$ that

$\dot {X^T}(t, t_0) X(t, t_0) + X^T(t, t_0) \dot X(t, t_0) = 0, \tag{16}$

so that in light of (4) and (5),

$X^T(t, t_0)A^T(t)X(t, t_0) + X^T(t, t_0)A(t)X(t, t_0) = 0, \tag{17}$

or

$X^T(t, t_0)(A^T(t) + A(t))X(t, t_0) = 0; \tag{18}$

since (15) binds, both $X^T(t, t_0)$ and $X(t, t_0)$ are non-singular; thus

$A^T(t) + A(t) = 0, \tag{19}$

that is, (2) also holds.
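
The converse direction may likewise be checked numerically: for an orthogonal-matrix path $X(t)$ (below, an invented composition of coordinate rotations), the coefficient matrix recovered from (4) as $A(t) = \dot X(t)X^T(t)$ comes out skew-symmetric, up to finite-difference error:

```python
import numpy as np

# An invented orthogonal path X(t): a z-rotation composed with an x-rotation,
# with arbitrary angle functions (illustrative only).
def Rz(th):
    c, s = np.cos(th), np.sin(th)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def Rx(ph):
    c, s = np.cos(ph), np.sin(ph)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def X(t):
    return Rz(np.sin(t)) @ Rx(0.5 * t)

# Recover A(t) = Xdot X^T via (4) and a central finite difference,
# then measure its symmetric part, which should vanish per (19).
t, h = 1.3, 1e-6
Xdot = (X(t + h) - X(t - h)) / (2 * h)
Arec = Xdot @ X(t).T
print("symmetric-part magnitude:", np.abs(Arec + Arec.T).max())
```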

There are a few special applications of these results which are worthy of mention; for example, if $\vec x(t)$ satisfies (1)-(2), then

$\dfrac{d}{dt}\langle \vec x(t), \vec x(t) \rangle = \langle \dot{\vec x}(t), \vec x(t) \rangle + \langle \vec x(t), \dot{\vec x}(t) \rangle = \langle A(t)\vec x(t), \vec x(t) \rangle + \langle \vec x(t), A(t) \vec x(t) \rangle$ $= \langle \vec x(t), A^T(t) \vec x(t) \rangle + \langle \vec x(t), A(t) \vec x(t) \rangle = \langle \vec x(t), A^T(t) \vec x(t) + A(t) \vec x(t) \rangle$ $= \langle \vec x(t), -A(t) \vec x(t) + A(t) \vec x(t) \rangle = \langle \vec x(t), 0 \rangle = 0, \tag{20}$

which shows that $\langle \vec x(t), \vec x(t) \rangle$ is constant.

In a like manner we may take things a step further and write

$\dfrac{d}{dt}\langle \vec x(t), \vec y(t) \rangle = \langle \dot{\vec x}(t), \vec y(t) \rangle + \langle \vec x(t), \dot{\vec y}(t) \rangle = \langle A(t)\vec x(t), \vec y(t) \rangle + \langle \vec x(t), A(t) \vec y(t) \rangle$ $= \langle \vec x(t), A^T(t) \vec y(t) \rangle + \langle \vec x(t), A(t) \vec y(t) \rangle = \langle \vec x(t), A^T(t) \vec y(t) + A(t) \vec y(t) \rangle$ $= \langle \vec x(t), -A(t) \vec y(t) + A(t) \vec y(t) \rangle = \langle \vec x(t), 0 \rangle = 0, \tag{21}$

which shows that inner products are preserved under the flow of (1)-(2). Obviously (20) is a special case of (21) with $\vec y(t) = \vec x(t)$.
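
A sketch of (20)-(21) in the same spirit, again with an illustrative skew-symmetric $A(t)$ and two arbitrary initial vectors:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative skew-symmetric A(t), as before.
def A(t):
    a, b, c = np.sin(t), np.cos(2 * t), t
    return np.array([[0.0, a, b], [-a, 0.0, c], [-b, -c, 0.0]])

x0 = np.array([1.0, 2.0, -1.0])   # arbitrary initial vectors
y0 = np.array([0.5, -1.0, 3.0])
tf = 4.0

x = solve_ivp(lambda t, v: A(t) @ v, (0.0, tf), x0, rtol=1e-11, atol=1e-12).y[:, -1]
y = solve_ivp(lambda t, v: A(t) @ v, (0.0, tf), y0, rtol=1e-11, atol=1e-12).y[:, -1]

print("<x, y> at t0:", x0 @ y0, "  at tf:", x @ y)                       # (21)
print("|x|   at t0:", np.linalg.norm(x0), "  at tf:", np.linalg.norm(x)) # (20)
```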

(20) and (21) also follow directly from (10) and (15); viz., we have

$\vec x(t) = X(t, t_0) \vec x(t_0), \tag{22}$

$\vec y(t) = X(t, t_0) \vec y(t_0); \tag{23}$

thus,

$\langle \vec x(t), \vec y(t) \rangle = \langle X(t, t_0)\vec x(t_0), X(t, t_0) \vec y(t_0) \rangle = \langle \vec x(t_0), X^T(t, t_0) X(t, t_0) \vec y(t_0) \rangle$ $= \langle \vec x(t_0), I\vec y(t_0) \rangle = \langle \vec x(t_0), \vec y(t_0) \rangle, \tag{24}$

and of course taking $\vec y(t) = \vec x(t)$ yields

$\langle \vec x(t), \vec x(t) \rangle = \langle \vec x(t_0), \vec x(t_0) \rangle; \tag{25}$

no great surprise there.
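
The propagator identity (22) is also easy to exercise numerically: integrate once for the fundamental matrix and once for a particular solution, and compare (all choices below are illustrative):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative skew-symmetric A(t), as before.
def A(t):
    a, b, c = np.sin(t), np.cos(2 * t), t
    return np.array([[0.0, a, b], [-a, 0.0, c], [-b, -c, 0.0]])

n, tf = 3, 3.0

# Fundamental matrix with X(t0, t0) = I, per (10) ...
Xf = solve_ivp(lambda t, y: (A(t) @ y.reshape(n, n)).ravel(),
               (0.0, tf), np.eye(n).ravel(),
               rtol=1e-11, atol=1e-12).y[:, -1].reshape(n, n)

# ... versus a directly integrated solution with initial vector x0.
x0 = np.array([1.0, -2.0, 0.5])
x = solve_ivp(lambda t, v: A(t) @ v, (0.0, tf), x0,
              rtol=1e-11, atol=1e-12).y[:, -1]

print("max |x(t) - X(t, t0) x(t0)| =", np.abs(x - Xf @ x0).max())  # (22)
```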

We also note that, in accord with (6), $(X^T(t, t_0)X(t, t_0))'$ depends not only on $X(t, t_0)$ but also on $A_\Sigma(t)$, the symmetric part of $A(t)$:

$A_\Sigma(t) = \dfrac{A(t) + A^T(t)}{2}; \tag{26}$

the symmetric part of $A(t)$ vanishes when (2) or (7) bind; that is, when $A(t)$ is skew-symmetric.
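
One may probe (6) and (26) numerically with a generic, non-skew $A(t)$ (invented for the purpose): the finite-difference derivative of the Gram matrix $X^TX$ should match $X^T(A^T + A)X$:

```python
import numpy as np
from scipy.integrate import solve_ivp

# A generic, NON-skew A(t), invented so that the symmetric part (26) is nonzero.
def A(t):
    return np.array([[np.sin(t), 1.0, 0.2 * t],
                     [-0.3, 0.0, np.cos(t)],
                     [0.1, t, 0.5]])

n = 3

def gram_and_X(t1):
    """Integrate (4) from X(0, 0) = I and return (X^T X, X) at t1."""
    y = solve_ivp(lambda t, y: (A(t) @ y.reshape(n, n)).ravel(),
                  (0.0, t1), np.eye(n).ravel(),
                  rtol=1e-12, atol=1e-13).y[:, -1]
    X = y.reshape(n, n)
    return X.T @ X, X

t1, h = 2.0, 1e-5
Gp, _ = gram_and_X(t1 + h)
Gm, _ = gram_and_X(t1 - h)
lhs = (Gp - Gm) / (2 * h)              # finite-difference d/dt (X^T X)
_, X = gram_and_X(t1)
rhs6 = X.T @ (A(t1).T + A(t1)) @ X     # right-hand side of (6)

print("max |lhs - rhs of (6)|:", np.abs(lhs - rhs6).max())
```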

Finally, in the event that (10) does not apply but the columns of $X(t_0, t_0)$ remain linearly independent, we still have (9) with $X^T(t_0, t_0)X(t_0, t_0)$ symmetric and positive definite, which may then be diagonalized by some orthogonal matrix $C$, yielding

$C^TX^T(t_0, t_0)X(t_0, t_0)C = \text{diag}(\mu_1, \mu_2, \ldots, \mu_n), \tag{27}$

whence

$X^T(t_0, t_0)X(t_0, t_0) = C\text{diag}(\mu_1, \mu_2, \ldots, \mu_n)C^T, \tag{28}$

with

$\mu_i > 0, \; 1 \le i \le n. \tag{29}$

We observe, writing $M = \text{diag}(\mu_1^{-1/2}, \mu_2^{-1/2}, \ldots, \mu_n^{-1/2})$ for brevity, that (27) further implies

$M C^T X^T(t_0, t_0)\, X(t_0, t_0)\, C M = M\, \text{diag}(\mu_1, \mu_2, \ldots, \mu_n)\, M = I, \tag{30}$

that is, $X(t_0, t_0)\, C M$ is orthogonal. Furthermore, we have the equation

$\dfrac{d}{dt}(M C^T X^T(t, t_0)\, X(t, t_0)\, C M) = M C^T \dot {X^T}(t, t_0)\, X(t, t_0)\, C M + M C^T X^T(t, t_0)\, \dot X(t, t_0)\, C M$ $= M C^T X^T(t, t_0) A^T(t)\, X(t, t_0)\, C M + M C^T X^T(t, t_0) A(t)\, X(t, t_0)\, C M$ $= M C^T X^T(t, t_0)(A^T(t) + A(t)) X(t, t_0)\, C M = 0; \tag{31}$

combining (30) and (31) shows that

$M C^T X^T(t, t_0)\, X(t, t_0)\, C M = I, \tag{32}$

that is, $X(t, t_0)\, C M$ is orthogonal for all $t$. These considerations indicate that the map

$X(t, t_0) \to X(t, t_0)\, C M \tag{33}$

transforms any fundamental solution matrix into an orthogonal one, having orthonormal rows and columns. Applied to $X(t_0, t_0)$, (33) reads

$X(t_0, t_0) \to X(t_0, t_0)\, C M, \tag{34}$

where $X(t_0, t_0)\, C M$ is orthogonal; this orthogonality means that

$X(t_0, t_0)\, C M\, (X(t_0, t_0)\, C M)^T = I, \tag{35}$

and

$(X(t_0, t_0)\, C M)^T\, X(t_0, t_0)\, C M = I. \tag{36}$

We note that (35) may also be derived directly from (28) as follows:

$X(t_0, t_0)\, C M\, (X(t_0, t_0)\, C M)^T = X(t_0, t_0)\, C M\, M C^T X^T(t_0, t_0)$ $= X(t_0, t_0)\, C\, \text{diag}(\mu_1^{-1}, \mu_2^{-1}, \ldots, \mu_n^{-1})\, C^T X^T(t_0, t_0); \tag{37}$

inverting (28) and recalling that $(C^T)^{-1} = C$ and $C^{-1} = C^T$ since $C$ is orthogonal,

$X^{-1}(t_0, t_0)(X^T(t_0, t_0))^{-1} = C\text{diag}(\mu_1^{-1}, \mu_2^{-1}, \ldots, \mu_n^{-1})C^T, \tag{38}$

and substituting this into (37) we see that

$X(t_0, t_0)\, C M\, (X(t_0, t_0)\, C M)^T = X(t_0, t_0) X^{-1}(t_0, t_0)(X^T(t_0, t_0))^{-1} X^T(t_0, t_0) = I. \tag{39}$

It is somewhat easier to see (36), for

$(X(t_0, t_0)\, C M)^T\, X(t_0, t_0)\, C M = M C^T X^T(t_0, t_0)\, X(t_0, t_0)\, C M, \tag{40}$

and via (27) this becomes

$(X(t_0, t_0)\, C M)^T\, X(t_0, t_0)\, C M = M\, \text{diag}(\mu_1, \mu_2, \ldots, \mu_n)\, M = I. \tag{41}$
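
The normalization (33) is pure linear algebra and can be sketched directly; here $X(t_0, t_0)$ is a random nonsingular matrix (an arbitrary stand-in), and `numpy.linalg.eigh` supplies the orthogonal $C$ and the eigenvalues $\mu_i$ of (27):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
X0 = rng.normal(size=(n, n))   # arbitrary stand-in for a nonsingular X(t0, t0)

# (27): the Gram matrix is symmetric positive definite, so eigh yields an
# orthogonal C and positive eigenvalues mu_i.
mu, C = np.linalg.eigh(X0.T @ X0)
M = np.diag(mu ** -0.5)        # M = diag(mu_i^{-1/2})

Q = X0 @ C @ M                 # the map (33)/(34)
print("||Q^T Q - I|| =", np.linalg.norm(Q.T @ Q - np.eye(n)))  # (36)
print("||Q Q^T - I|| =", np.linalg.norm(Q @ Q.T - np.eye(n)))  # (35)
```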

In closing, we pay respect to the geometrical interpretation of these results; in particular, (20)-(21) and so on show that systems such as (1)-(2) preserve inner products, and hence magnitudes of, and angles between, vectors in $\Bbb R^n$. This view of things finds application in other situations, for example when considering the Frenet-Serret frames of curves in three-dimensional Euclidean space, and in higher dimensions as well.
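
To make that connection concrete, recall (a standard fact, stated here for illustration) that the Frenet-Serret frame $(\vec T, \vec N, \vec B)$ of a unit-speed curve with curvature $\kappa(s)$ and torsion $\tau(s)$ obeys a system of precisely the form (1)-(2),

$$\frac{d}{ds}\begin{pmatrix} \vec T \\ \vec N \\ \vec B \end{pmatrix} = \begin{pmatrix} 0 & \kappa(s) & 0 \\ -\kappa(s) & 0 & \tau(s) \\ 0 & -\tau(s) & 0 \end{pmatrix} \begin{pmatrix} \vec T \\ \vec N \\ \vec B \end{pmatrix},$$

applied row-wise to the frame vectors, so that by (21) the frame remains orthonormal for every $s$.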

Nota Bene, Thursday 17 July 2020 2:59 PM PST: A few words to clarify certain aspects of the preceding discussion. We have claimed that (20), (21) follow directly from (10), (15) via (22) and (23). We expand upon these remarks by observing that (10) yields the identities

$\vec x(t_0) = X(t_0, t_0) \vec x(t_0), \tag{42}$

$\vec y(t_0) = X(t_0, t_0) \vec y(t_0), \tag{43}$

and right multiplying (4) by $\vec x(t_0)$, $\vec y(t_0)$ we obtain

$\dot X(t, t_0)\vec x(t_0) = A(t) X(t, t_0)\vec x(t_0), \tag{44}$

$\dot X(t, t_0)\vec y(t_0) = A(t) X(t, t_0)\vec y(t_0); \tag{45}$

since $\vec x(t)$ and $\vec y(t)$ each satisfy (1), and (44)-(45) show that $X(t, t_0)\vec x(t_0)$ and $X(t, t_0)\vec y(t_0)$ do so as well, the pairs $\vec x(t)$, $X(t, t_0)\vec x(t_0)$ and $\vec y(t)$, $X(t, t_0)\vec y(t_0)$ satisfy the same equation (1) with the same initial conditions (42)-(43); hence, by uniqueness of solutions of ordinary differential equations,

$\vec x(t) = X(t, t_0)\vec x(t_0), \tag{46}$

and

$\vec y(t) = X(t, t_0)\vec y(t_0). \tag{47}$

End of Note.