Frechet derivative of a composition of functions over matrices

It would be really messy, but you can use the "naive" (in the numerical sense) solution of the Lyapunov equation $A^TXA+Q=X$, which is $\textrm{vec}(X) = (I-A^T\otimes A^T)^{-1}\textrm{vec}(Q)$. The trace condition is then a row vector of $1$s and $0$s whose $1$s hit every diagonal element of $X$ inside $\textrm{vec}(X)$.

Hence the explicit (again, theoretical) expression for $g:D\to \mathbb{R}$ is

$$ g(A) = \begin{bmatrix}1&0&\cdots&0&\color{red}{0}&\color{red}{1}&\color{red}{0}&\cdots&\color{red}{0}&\color{blue}{0}&\color{blue}{0}&\color{blue}{1}&\cdots\end{bmatrix}(I-A^T\otimes A^T)^{-1}\textrm{vec}(Q) $$

where the colors mark the entries that multiply each group of rows (one group per column of $X$), resembling the log det problems.
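As a rough illustration of this "naive" formula, here is a minimal NumPy sketch. It assumes the discrete Lyapunov equation $A^TXA+Q=X$ with a column-major $\textrm{vec}$, so that the Kronecker factor is $A^T\otimes A^T$, and it assumes the spectral radius of $A$ is below $1$ so the inverse exists. The function name `trace_via_kronecker` is made up for this example.

```python
import numpy as np

def trace_via_kronecker(A, Q):
    """Return (g, X) with X solving A^T X A + Q = X and g = tr(X).

    Dense O(n^6) solve: fine for checking formulas, not for large n.
    """
    n = A.shape[0]
    # Column-major vec: vec(A^T X A) = (A^T kron A^T) vec(X)
    K = np.eye(n * n) - np.kron(A.T, A.T)
    x = np.linalg.solve(K, Q.reshape(-1, order="F"))
    X = x.reshape((n, n), order="F")
    # Row vector of 0s and 1s that picks the diagonal of X out of vec(X),
    # i.e. vec(I)^T
    selector = np.eye(n).reshape(-1, order="F")
    return selector @ x, X

A = np.array([[0.5, 0.1, 0.0],
              [-0.2, 0.3, 0.1],
              [0.0, 0.2, 0.4]])
Q = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])
g, X = trace_via_kronecker(A, Q)
```

The selector vector is just $\textrm{vec}(I)^T$, which is why `g` should agree with `np.trace(X)`.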


Assume a small variation $\Delta A$ of the elements of $A$. Then the new solution $X+\Delta X$ of the Lyapunov equation satisfies $$(A+\Delta A)^T(X+\Delta X)(A+\Delta A)+Q=X+\Delta X \qquad \qquad(1)$$ Subtracting the unperturbed equation $A^TXA+Q=X$ and ignoring second-order terms in the small variations, we obtain $$(\Delta A)^T X A+A^T X(\Delta A)=\Delta X-A^T(\Delta X)A\qquad \qquad (2)$$

Consider a variation $\Delta a_{ij}$ of the $(i,j)$-element of $A$, i.e. $\Delta A=\Delta a_{ij}\,e_ie_j^T$. This variation induces a variation $\Delta_{i,j} X$ of $X$ (a slight abuse of notation, used to distinguish the effects of the different element variations) that should satisfy $$\Delta a_{ij}(e_j e_i^T X A+A^T Xe_ie_j^T)=\Delta_{i,j} X-A^T(\Delta_{i,j} X)A\qquad \qquad(3)$$ where $e_i$ is the $i$-th column of the identity matrix.

Since $\Delta [tr(X)]=tr(\Delta X)$, the desired matrix $$S=\frac{\partial [tr(X)]}{\partial A}$$ will have elements given by

$$S_{ij}=\lim_{\Delta a_{ij}\rightarrow 0}\frac{tr(\Delta_{i,j}X)}{\Delta a_{ij}}$$

Applying the vec operator to (3), and using $vec(A^TYA)=(A^T\otimes A^T)vec(Y)$, we obtain $$vec(\Delta_{i,j}X)=(\mathbb{I}-A^T\otimes A^T)^{-1}vec(A^TXe_ie_j^T+e_je_i^TXA)\Delta a_{ij}$$ For the trace we have $$tr(\Delta_{i,j}X)=vec^T(\mathbb{I})vec(\Delta_{i,j}X)=vec^T(\mathbb{I})(\mathbb{I}-A^T\otimes A^T)^{-1}vec(A^TXe_ie_j^T+e_je_i^TXA)\Delta a_{ij}$$ and therefore

$$S_{ij}=vec^T(\mathbb{I})(\mathbb{I}-A^T\otimes A^T)^{-1}vec(A^TXe_ie_j^T+e_je_i^TXA)$$
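The formula for $S_{ij}$ can be checked numerically. The following is a minimal sketch, assuming a column-major $vec$ and a matrix $A$ with spectral radius below $1$. The helper names `vec`, `solve_lyapunov` and `trace_gradient` are chosen for this example only. Since the row vector $vec^T(\mathbb{I})(\mathbb{I}-A^T\otimes A^T)^{-1}$ does not depend on $(i,j)$, it is computed once with a transposed solve.

```python
import numpy as np

def vec(M):
    """Column-major vectorization."""
    return M.reshape(-1, order="F")

def solve_lyapunov(A, Q):
    """Solve A^T X A + Q = X through the dense Kronecker system."""
    n = A.shape[0]
    K = np.eye(n * n) - np.kron(A.T, A.T)
    return np.linalg.solve(K, vec(Q)).reshape((n, n), order="F")

def trace_gradient(A, Q):
    """S[i, j] = vec(I)^T (I - A^T kron A^T)^{-1} vec(A^T X E + E^T X A),
    with E = e_i e_j^T."""
    n = A.shape[0]
    X = solve_lyapunov(A, Q)
    K = np.eye(n * n) - np.kron(A.T, A.T)
    # w^T = vec(I)^T K^{-1}  is equivalent to  K^T w = vec(I)
    w = np.linalg.solve(K.T, vec(np.eye(n)))
    S = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            E = np.zeros((n, n))
            E[i, j] = 1.0  # E = e_i e_j^T, so E.T = e_j e_i^T
            S[i, j] = w @ vec(A.T @ X @ E + E.T @ X @ A)
    return S
```

A simple sanity check is to compare `trace_gradient(A, Q)` entry by entry against central finite differences of `np.trace(solve_lyapunov(A, Q))`.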


Define the variables $$\eqalign{ M &= (I\otimes I-A\otimes A) \in {\mathbb R}^{n^2\times n^2} \cr x &= {\rm vec}(X),\,\,q={\rm vec}(Q),\,\,\,y = {\rm vec}(I)\,\in {\mathbb R}^{n^2} \cr }$$ Then we can rearrange and vectorize the Lyapunov equation $$\eqalign{ Q &= X - A^TXA \cr q &= M^Tx \cr }$$ Since $q$ is constant, taking the differential yields the relationship between $dx$ and $dA$ $$\eqalign{ M^Tdx &= -dM^Tx \cr dx &= M^{-T}(dA\otimes A+A\otimes dA)^Tx \cr }$$ The function we are actually interested in is $$\phi={\rm tr}(X)=I:X$$ where the colon denotes the trace/Frobenius product, i.e. $\,\,A:B\equiv{\rm tr}(A^TB)$.

Take the differential of this function $$\eqalign{ d\phi &= I:dX = y:dx = y^T:dx^T \cr &= y^T:x^T(dA\otimes A+A\otimes dA)M^{-1} \cr &= xy^TM^{-T}:(dA\otimes A+A\otimes dA) \cr }$$ Now we need to decompose the left-hand factor of this product into a sum of Kronecker factors $$\eqalign{ xy^TM^{-T} &= \sum_{k=1}^r B_k\otimes C_k \cr B_k,C_k &\in {\mathbb R}^{n\times n} }$$ We also need to know the rule for a Kronecker-Frobenius mixed product $$(A\otimes B\otimes C):(X\otimes Y\otimes Z)=(A:X)\,(B:Y)\,(C:Z)$$ Substitute the Kronecker factorization into the differential to obtain our final result $$\eqalign{ d\phi &= \bigg(\sum_{k=1}^r B_k\otimes C_k\bigg):(dA\otimes A+A\otimes dA) \cr &= \bigg(\sum_{k=1}^r (A:B_k)C_k + (A:C_k)B_k\bigg):dA \cr\cr S &= \frac{\partial\,{\rm tr}(X)}{\partial A} \cr &= \sum_{k=1}^r (A:B_k)C_k + (A:C_k)B_k \cr &= \sum_{k=1}^r {\rm tr}(A^TB_k)C_k + {\rm tr}(A^TC_k)B_k \cr\cr }$$ For more information about the Kronecker product factorization, look for papers by Pitsianis and Van Loan. It turns out to be yet another (albeit clever) application of the SVD.
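To make the Kronecker factorization step concrete, here is a minimal NumPy sketch of one common way to do it: rearrange the $n^2\times n^2$ matrix so that each term $B_k\otimes C_k$ becomes a rank-one outer product, then take an SVD. It assumes a column-major ${\rm vec}$, the standard Kronecker product as implemented by `np.kron`, and a spectral radius of $A$ below $1$. The function names `kron_factors` and `trace_gradient_kron` and the truncation tolerance `tol` are choices made for this example.

```python
import numpy as np

def kron_factors(G, n, tol=1e-12):
    """Split an n^2 x n^2 matrix G into sum_k kron(B_k, C_k).

    Rearranged entry R[(i1, j1), (i2, j2)] = G[i1*n + i2, j1*n + j2],
    so kron(B, C) maps to the outer product of the row-major
    flattenings of B and C; the SVD of R then gives the factors.
    """
    R = G.reshape(n, n, n, n).transpose(0, 2, 1, 3).reshape(n * n, n * n)
    U, s, Vt = np.linalg.svd(R)
    r = int(np.sum(s > tol * s[0]))  # numerical Kronecker rank
    Bs = [np.sqrt(s[k]) * U[:, k].reshape(n, n) for k in range(r)]
    Cs = [np.sqrt(s[k]) * Vt[k].reshape(n, n) for k in range(r)]
    return Bs, Cs

def trace_gradient_kron(A, Q):
    """S = sum_k (A:B_k) C_k + (A:C_k) B_k for x y^T M^{-T} = sum_k B_k kron C_k."""
    n = A.shape[0]
    M = np.eye(n * n) - np.kron(A, A)
    x = np.linalg.solve(M.T, Q.reshape(-1, order="F"))  # q = M^T x
    y = np.eye(n).reshape(-1, order="F")                # y = vec(I)
    G = np.outer(x, np.linalg.solve(M, y))              # x y^T M^{-T}
    Bs, Cs = kron_factors(G, n)
    S = np.zeros((n, n))
    for B, C in zip(Bs, Cs):
        # A:B = tr(A^T B) = sum of the elementwise product
        S += np.sum(A * B) * C + np.sum(A * C) * B
    return S
```

Forming $M$ densely costs $O(n^6)$ work, so this is only meant for verifying the formula on small matrices, for instance against finite differences of ${\rm tr}(X)$.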