How to calculate the gradient of $x^T A x$?

\begin{align*}
dy &= d(x^{T}Ax) = d(Ax\cdot x) = d\left(\sum_{i=1}^{n}(Ax)_{i}x_{i}\right) \\
&= d\left(\sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j}x_{j}x_{i}\right) = \sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j}x_{i}\,dx_{j}+\sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j}x_{j}\,dx_{i} \\
&= \sum_{i=1}^{n}(A\,dx)_{i}x_{i}+\sum_{i=1}^{n}(Ax)_{i}\,dx_{i} = x^{T}A\,dx+(dx)^{T}Ax \\
&= (dx)^{T}A^{T}x+(dx)^{T}Ax = (dx)^{T}(A+A^{T})x.
\end{align*}

Step 2 follows from a direct computation. Let $u(x)=x^TAx$. Then $$ u(x+h)=(x+h)^TA(x+h)=x^TAx+h^TAx+x^TAh+h^TAh, $$ that is, $u(x+h)=u(x)+x^T(A+A^T)h+r_x(h)$ where $r_x(h)=h^TAh$ (this uses the fact that $h^TAx=x^TA^Th$, which holds because $m=h^TAx$ is a $1\times1$ matrix, hence $m^T=m$).

Since $|h^TAh|\leqslant\|A\|\,\|h\|^2$, one sees that $r_x(h)=o(\|h\|)$ when $h\to0$. This proves that the differential of $u$ at $x$ is the linear function $\nabla u(x):\mathbb R^n\to\mathbb R$, $h\mapsto x^T(A+A^T)h$, which can be identified with the unique vector $z$ such that $\nabla u(x)(h)=z^Th$ for every $h$ in $\mathbb R^n$, that is, $z=(A+A^T)x$.
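As a quick numerical sanity check of the expansion above, here is a minimal sketch in plain Python. The helper names `quad_form` and `gradient`, the random test data, and the step sizes are illustrative choices, not part of the argument. The sketch checks that the remainder $u(x+h)-u(x)-x^T(A+A^T)h$ agrees with $h^TAh$ and therefore shrinks like $\|h\|^2$.

```python
import random

def quad_form(A, x):
    """Return x^T A x for a square matrix A (list of rows) and a vector x."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def gradient(A, x):
    """Return z = (A + A^T) x, the candidate gradient of x^T A x."""
    n = len(x)
    return [sum((A[k][j] + A[j][k]) * x[j] for j in range(n)) for k in range(n)]

random.seed(0)  # fixed seed so the run is reproducible
n = 4
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]  # not symmetric in general
x = [random.uniform(-1, 1) for _ in range(n)]
d = [random.uniform(-1, 1) for _ in range(n)]  # fixed direction; h = t * d

z = gradient(A, x)
remainders = []
for t in (1e-1, 1e-2, 1e-3):
    h = [t * di for di in d]
    x_plus_h = [xi + hi for xi, hi in zip(x, h)]
    linear = sum(zi * hi for zi, hi in zip(z, h))          # z^T h = x^T (A + A^T) h
    r = quad_form(A, x_plus_h) - quad_form(A, x) - linear  # remainder r_x(h)
    remainders.append((t, r, quad_form(A, h)))
    print(t, r, quad_form(A, h))  # the last two columns should agree up to rounding
```

Dividing $t$ by 10 should divide the remainder by roughly 100, which is the quadratic decay that makes $r_x(h)=o(\|h\|)$.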

Here's a method which calculates the gradient of $x^TAx$ without using the exterior derivative. I know that this is not what you are after, but it is worth seeing the componentwise proof, if only to compare it with the exterior derivative method and see how much easier the latter is.

Let $A$ be $n\times n$, $A = [a_{ij}]$. If $x \in \mathbb{R}^n$, $x = (x_1, \dots, x_n)^T$, then $y = x^TAx = \displaystyle\sum_{i=1}^n\sum_{j=1}^na_{ij}x_ix_j$.

Then we have

\begin{align*} \dfrac{\partial y}{\partial x_k} &= \sum_{i\neq k}\dfrac{\partial}{\partial x_k}\left(\sum_{j=1}^na_{ij}x_ix_j\right) + \dfrac{\partial}{\partial x_k}\left(\sum_{j=1}^na_{kj}x_kx_j\right)\\ &=\sum_{i\neq k}\left(\dfrac{\partial}{\partial x_k}\left(\sum_{j\neq k}a_{ij}x_ix_j\right) + \dfrac{\partial}{\partial x_k}(a_{ik}x_ix_k)\right) + \sum_{j\neq k}\dfrac{\partial}{\partial x_k}(a_{kj}x_kx_j) + \dfrac{\partial}{\partial x_k}(a_{kk}x_k^2)\\ &= \sum_{i\neq k}a_{ik}x_i + \sum_{j\neq k}a_{kj}x_j + 2a_{kk}x_k\\ &= \sum_{i = 1}^na_{ik}x_i + \sum_{j=1}^na_{kj}x_j\\ &= (x^TA)_k + (Ax)_k \end{align*}

where $(x^TA)_k$ is the $k^{\text{th}}$ component of the row vector $x^TA$ and $(Ax)_k$ is the $k^{\text{th}}$ component of the column vector $Ax$. Taking the transpose of $Ax$ gives the row vector $x^TA^T$, which has the same $k^{\text{th}}$ component as $Ax$. Hence $\dfrac{\partial y}{\partial x_k} = (x^TA)_k + (x^TA^T)_k$ for every $k$, and therefore

$$\nabla y = x^TA + x^TA^T = x^T(A + A^T).$$

Here the gradient is written as a row vector; as a column vector it is $(A + A^T)x$.
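The componentwise formula can also be checked against finite differences. The following is a minimal sketch in plain Python; the $3\times3$ matrix, the point `x`, the step `eps`, and the helper names `quad_form` and `partial_fd` are all arbitrary illustrative choices. It compares a central-difference estimate of each $\partial y/\partial x_k$ with $(x^TA)_k + (Ax)_k$.

```python
def quad_form(A, x):
    """Return y = x^T A x for a square matrix A (list of rows) and a vector x."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def partial_fd(A, x, k, eps=1e-6):
    """Central-difference estimate of dy/dx_k at x."""
    x_plus, x_minus = list(x), list(x)
    x_plus[k] += eps
    x_minus[k] -= eps
    return (quad_form(A, x_plus) - quad_form(A, x_minus)) / (2 * eps)

# An arbitrary non-symmetric example, chosen only for illustration.
A = [[1.0, 2.0, 0.0],
     [-1.0, 3.0, 4.0],
     [5.0, 0.0, -2.0]]
x = [1.0, -2.0, 0.5]
n = len(x)

row_xTA = [sum(x[i] * A[i][k] for i in range(n)) for k in range(n)]  # (x^T A)_k
col_Ax = [sum(A[k][j] * x[j] for j in range(n)) for k in range(n)]   # (A x)_k
formula = [row_xTA[k] + col_Ax[k] for k in range(n)]                 # claimed dy/dx_k
numeric = [partial_fd(A, x, k) for k in range(n)]                    # finite-difference dy/dx_k

for k in range(n):
    print(k, formula[k], numeric[k])  # the two columns should agree up to rounding
```

Because $y$ is quadratic in each $x_k$, the central difference has no truncation error here, so any discrepancy between the two columns comes from floating-point rounding alone.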
