Vector derivative of $x^Tx$

Write $x$ as $(x_1, x_2, \ldots, x_n)$. Then $x^T x = \sum_i x_i^2$. So, for example, $$\frac{\partial}{\partial x_1} x^T x = \frac{\partial}{\partial x_1} \left( \sum_i x_i^2\right) = \frac{\partial}{\partial x_1} x_1^2 = 2x_1,$$ and similarly for each of the other components of $x$. From this it should be clear that $$\frac{d}{dx} x^T x = 2x^T.$$ (The transpose is there because the derivative at a point $x$ is a linear map $\mathbb{R}^n\rightarrow\mathbb{R}$, so expressed as a matrix it must have dimension $1\times n$; alternatively, as a linear map it must live in the dual space of $\mathbb{R}^n$, i.e. the space of linear maps $\mathbb{R}^n \rightarrow \mathbb{R}$.)
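As a quick sanity check of the componentwise computation above, here is a minimal sketch in plain Python that compares a central-difference approximation of each partial derivative with $2x_i$. The function names `u` and `numerical_gradient`, the step size `h`, and the sample point are illustrative choices, not part of the original argument.

```python
def u(x):
    """Return x^T x, i.e. the sum of the squared entries of x."""
    return sum(xi * xi for xi in x)


def numerical_gradient(f, x, h=1e-6):
    """Approximate each partial derivative of f at x by a central difference."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h   # perturb only the i-th entry upwards
        backward[i] -= h  # and downwards
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


x = [1.0, -2.0, 3.0]                # arbitrary sample point (an assumption)
approx = numerical_gradient(u, x)   # finite-difference estimate
exact = [2 * xi for xi in x]        # the claimed derivative, entry by entry

print(approx)
print(exact)
```

The two printed lists should agree up to floating-point rounding, since a central difference is exact for a quadratic apart from round-off.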

Your question perhaps betrays some confusion as to what the derivative is. Although for each $x$ the value of $x^T x$ is a single number, i.e. a scalar, the derivative expresses the amount by which $x^T x$ changes as the entries of $x$ change. This is nonzero in general (it vanishes only at $x = 0$), since the value of $x^T x$ depends on the entries of $x$.


Let $u:\mathbb R^n\to\mathbb R$, $x\mapsto u(x)=x^Tx$. There exists a linear map $\ell_x:\mathbb R^n\to\mathbb R$, called the gradient of $u$ at $x$, such that

$$u(x+z)=u(x)+\ell_x(z)+o(\|z\|)$$

when $z\to0$. To compute $\ell_x$, note that $$ u(x+z)=(x+z)^T(x+z)=x^Tx+z^Tx+x^Tz+z^Tz=u(x)+2x^Tz+o(\|z\|), $$ where we used that $z^Tx=x^Tz$ (both are the same scalar) and that $z^Tz=\|z\|^2=o(\|z\|)$; hence $$ \ell_x(z)=2x^Tz. $$ Every linear form $\ell$ on $\mathbb R^n$ has the form $\ell:z\mapsto w^Tz$ for some $w$ in $\mathbb R^n$, hence one often identifies $\ell$ with $w$ (technically, this is identifying the dual of $\mathbb R^n$ with $\mathbb R^n$). In the present case, one may identify the gradient $\ell_x$ of $u$ at $x$ (a linear map from $\mathbb R^n$ to $\mathbb R$) with the vector $2x$ (an element of $\mathbb R^n$), and indeed, one often reads the formula $$ (\operatorname{grad} u)(x)=2x. $$
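To see the $o(\|z\|)$ term concretely, the following sketch (plain Python, with hypothetical helper names `dot`, `u` and `remainder`, and an arbitrarily chosen point and direction) evaluates the remainder $u(x+z)-u(x)-2x^Tz$ for shrinking $z$ and divides it by $\|z\|$. By the expansion above the remainder is $z^Tz$, so the ratio should equal $\|z\|$ and tend to $0$.

```python
def dot(a, b):
    """Return the Euclidean inner product a^T b."""
    return sum(ai * bi for ai, bi in zip(a, b))


def u(x):
    """Return x^T x."""
    return dot(x, x)


def remainder(x, z):
    """Return u(x + z) - u(x) - 2 x^T z, which should equal z^T z."""
    shifted = [xi + zi for xi, zi in zip(x, z)]
    return u(shifted) - u(x) - 2 * dot(x, z)


x = [1.0, -2.0, 3.0]           # arbitrary base point (an assumption)
direction = [0.5, 1.0, -1.5]   # arbitrary direction along which z shrinks
steps = [1e-1, 1e-2, 1e-3]     # scale factors applied to the direction

ratios = []
for t in steps:
    z = [t * d for d in direction]
    norm_z = dot(z, z) ** 0.5
    ratios.append(remainder(x, z) / norm_z)  # remainder relative to ||z||

print(ratios)
```

Each ratio shrinks by roughly a factor of ten along with $\|z\|$, which is what "the remainder is $o(\|z\|)$" means in practice.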