Find the gradient and Hessian of $f(Ax+b)$ for a real-valued function $f$ and a matrix $A$

Background knowledge: if $F:\mathbb R^n \to \mathbb R^m$ is differentiable at $x$, then $F'(x)$ is an $m \times n$ matrix.


Let $g(x) = f(Ax + b)$. By the chain rule,

$$ g'(x) = f'(Ax + b)A. $$

If we use the convention that the gradient is a column vector, then

$$ \nabla g(x) = g'(x)^T = A^T \nabla f(Ax + b). $$

The Hessian of $g$ is the derivative of the function $x \mapsto \nabla g(x)$. By the chain rule,

$$ \nabla^2 g(x) = A^T \nabla^2 f(Ax + b) A. $$
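As a sanity check, the two formulas can be compared against central finite differences. The sketch below is only illustrative: the test function (log-sum-exp), the sizes $m=4$, $n=3$, the random data, and the helper name `fd_jacobian` are all assumptions made for the example, not part of the question.

```python
import numpy as np

def f(u):
    # Log-sum-exp: a smooth real-valued test function (illustrative choice).
    return np.log(np.sum(np.exp(u)))

def grad_f(u):
    # Gradient of log-sum-exp is the softmax vector p.
    e = np.exp(u)
    return e / np.sum(e)

def hess_f(u):
    # Hessian of log-sum-exp is diag(p) - p p^T.
    p = grad_f(u)
    return np.diag(p) - np.outer(p, p)

rng = np.random.default_rng(0)
m, n = 4, 3                      # A is m x n, so g maps R^n to R
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
x = rng.standard_normal(n)

def g(x):
    return f(A @ x + b)

def grad_g(x):
    # Formula from the answer: A^T grad f(Ax + b)
    return A.T @ grad_f(A @ x + b)

def hess_g(x):
    # Formula from the answer: A^T hess f(Ax + b) A
    return A.T @ hess_f(A @ x + b) @ A

def fd_jacobian(fun, x, h=1e-5):
    # Central-difference Jacobian of fun at x (hypothetical helper).
    # Column i holds the derivative with respect to x_i.
    cols = []
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        cols.append((np.atleast_1d(fun(x + e)) - np.atleast_1d(fun(x - e))) / (2 * h))
    return np.stack(cols, axis=1)

# g'(x) is a 1 x n row; its transpose is the gradient.
grad_err = np.max(np.abs(fd_jacobian(g, x).ravel() - grad_g(x)))
# The Hessian is the derivative of x -> grad g(x).
hess_err = np.max(np.abs(fd_jacobian(grad_g, x) - hess_g(x)))
print("max gradient error:", grad_err)
print("max Hessian error:", hess_err)
```

Both errors should be small (limited by the finite-difference step), which is consistent with the chain-rule formulas but of course not a proof.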


Hint

We have that

$$q(x_1,\dots, x_n)=f\left(\sum_{i=1}^n a_{1i}x_i+b_1,\dots,\sum_{i=1}^n a_{mi}x_i+b_m\right). $$

Thus we have

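$$\dfrac{\partial q}{\partial x_i}=a_{1i}\dfrac{\partial f}{\partial u_1}+\dots +a_{mi}\dfrac{\partial f}{\partial u_m},$$

where $u=Ax+b$ and the partial derivatives of $f$ are evaluated at $u$. That is, with the gradient written as a column vector,

$$\nabla q(x)=A^T\,\nabla f(u).$$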

Edit to get the Hessian

\begin{align} \dfrac{\partial^2 q}{\partial x_j\partial x_i} &=\sum_{k=1}^m a_{ki}\dfrac{\partial}{\partial x_j}\left( \dfrac{\partial f}{\partial u_k}\right) \\&= \sum_{k=1}^m a_{ki}\sum_{l=1}^ma_{lj}\dfrac{\partial^2 f}{\partial u_l\partial u_k} \\&= \sum_{k,l=1}^ma_{ki}a_{lj} \dfrac{\partial^2 f}{\partial u_l\partial u_k}. \end{align}

We have used that

$$\dfrac{\partial}{\partial x_j}\dfrac{\partial f}{\partial u_k}=\dfrac{\partial}{\partial u_1}\left(\dfrac{\partial f}{\partial u_k}\right)\dfrac{\partial u_1}{\partial x_j}+\dots +\dfrac{\partial}{\partial u_m}\left(\dfrac{\partial f}{\partial u_k}\right)\dfrac{\partial u_m}{\partial x_j}$$

Thus we have

$$\nabla^2 q (x)=A^T (\nabla^2 f(u)) A.$$
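To see that the double sum $\sum_{k,l} a_{ki}a_{lj}\,\partial^2 f/\partial u_l\partial u_k$ really is the $(i,j)$ entry of $A^T(\nabla^2 f(u))A$, one can compare the two numerically. This is a minimal sketch: the matrix `F` is just a random symmetric stand-in for $\nabla^2 f(u)$ (symmetry is assumed, as it holds when $f$ is twice continuously differentiable), and the sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 4, 3
A = rng.standard_normal((m, n))

# Random symmetric stand-in for the Hessian of f at u (assumption for the demo).
S = rng.standard_normal((m, m))
F = S + S.T

# Index form: H[i, j] = sum over k, l of A[k, i] * A[l, j] * F[l, k]
H_sum = np.einsum("ki,lj,lk->ij", A, A, F)

# Matrix form: A^T F A
H_mat = A.T @ F @ A

max_diff = np.max(np.abs(H_sum - H_mat))
print("max difference:", max_diff)
```

The difference should be at the level of floating-point rounding.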