Why can you mix Partial Derivatives with Ordinary Derivatives in the Chain Rule?

The confusion comes from using the same symbol both for a function's argument and for a function in its own right. When you write $f(p) = p^2 + 4$, you are thinking of $f$ as a function and $p$ as its argument, which could be any dummy variable. In fact, let us write $f(\xi) = \xi^2+4$, which is the same function with a different symbol standing in for the argument. At the same time, you are using the symbol $p$ for the function $p(x,y) = x^2 + 2y$.

With the functions $f(\xi)$ and $p(x,y)$ you have $u(x,y) = (f \circ p)(x,y)$; that is, $u$ is the composition of $f$ with $p$. Hence, using the chain rule and suppressing the variables $x$ and $y$, you have $$ \frac{\partial u}{\partial x} = f'(p)\, \frac{\partial p}{\partial x} = \frac{d f}{d \xi} (p) \, \frac{\partial p}{\partial x} $$ where $f' = \frac{d f}{d \xi}$, since we changed the argument symbol of $f$ to $\xi$. The derivative of $f$ is an ordinary one because $f$ has a single argument, while the derivative of $p$ is partial because $p$ has two. Notice that $\frac{d f}{d\xi}$ is still evaluated at $p$, as the chain rule requires.

If we wanted to show explicitly where the variables $x$ and $y$ appear, we would have $$ \frac{\partial u}{\partial x}(x,y) = f'(p(x,y))\, \frac{\partial p}{\partial x}(x,y) = \frac{d f}{d \xi} (p(x,y)) \, \frac{\partial p}{\partial x}(x,y). $$ Hopefully this helps.
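
As a quick sanity check with the concrete functions above, $f(\xi) = \xi^2 + 4$ and $p(x,y) = x^2 + 2y$, the chain rule can be worked out term by term. The only assumption here is that $u$ is exactly the composition $f \circ p$ of these two functions: $$ \frac{d f}{d \xi}(\xi) = 2\xi, \qquad \frac{\partial p}{\partial x}(x,y) = 2x, \qquad \frac{\partial p}{\partial y}(x,y) = 2, $$ so that $$ \frac{\partial u}{\partial x}(x,y) = 2\,p(x,y) \cdot 2x = 4x\,(x^2 + 2y), \qquad \frac{\partial u}{\partial y}(x,y) = 2\,p(x,y) \cdot 2 = 4\,(x^2 + 2y). $$ Expanding directly gives $u(x,y) = (x^2+2y)^2 + 4 = x^4 + 4x^2 y + 4y^2 + 4$, whose partial derivatives are $4x^3 + 8xy$ and $4x^2 + 8y$, in agreement with the chain rule.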


The given function $u$ depends on two variables, $x$ and $y$, so it makes sense to talk about the partial derivatives $\frac{\partial u}{\partial x}$ and $\frac{\partial u}{\partial y}$. If you introduce the intermediate 'variable' $p$, then $u$ becomes a function of the single variable $p$; but $p$ itself depends on the two variables $x$ and $y$, and therefore so does $u$.

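In other words, you have: $$\frac{\partial u(p(x,y))}{\partial x}=\frac{\partial u(p)}{\partial p}\frac{\partial p(x,y)}{\partial x}$$ and $$\frac{\partial u(p(x,y))}{\partial y}=\frac{\partial u(p)}{\partial p}\frac{\partial p(x,y)}{\partial y}.$$ Since $u$ is a function of the single variable $p$, the factor $\frac{\partial u(p)}{\partial p}$ is the same as the ordinary derivative $\frac{d u}{d p}$, evaluated at $p = p(x,y)$.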
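
For anyone who wants to see the two formulas hold numerically, here is a minimal sketch in plain Python. It assumes the concrete example $f(\xi) = \xi^2 + 4$ and $p(x,y) = x^2 + 2y$ from this thread. The function names (`chain_rule_partials`, `finite_difference_partials`) and the step size `h` are made up for illustration. It compares the chain-rule product (ordinary derivative of the outer function times partial derivative of the inner one) with a central finite-difference estimate of the partials of $u$.

```python
def f(xi):
    """Outer function of ONE variable: f(xi) = xi^2 + 4."""
    return xi ** 2 + 4


def df(xi):
    """Ordinary derivative df/dxi = 2*xi."""
    return 2 * xi


def p(x, y):
    """Inner function of TWO variables: p(x, y) = x^2 + 2y."""
    return x ** 2 + 2 * y


def u(x, y):
    """Composition u = f o p, a function of x and y."""
    return f(p(x, y))


def chain_rule_partials(x, y):
    """Partials of u via the chain rule: f'(p(x, y)) times the partials of p."""
    dp_dx = 2 * x   # partial derivative of p with respect to x
    dp_dy = 2.0     # partial derivative of p with respect to y
    outer = df(p(x, y))  # ordinary derivative of f, evaluated at p(x, y)
    return outer * dp_dx, outer * dp_dy


def finite_difference_partials(x, y, h=1e-6):
    """Central-difference estimates of the partials of u (h is an assumed step)."""
    du_dx = (u(x + h, y) - u(x - h, y)) / (2 * h)
    du_dy = (u(x, y + h) - u(x, y - h)) / (2 * h)
    return du_dx, du_dy


if __name__ == "__main__":
    point = (1.5, -0.5)
    print("chain rule:       ", chain_rule_partials(*point))
    print("finite difference:", finite_difference_partials(*point))
```

At the point $(1.5, -0.5)$ the chain rule gives $\frac{\partial u}{\partial x} = 7.5$ and $\frac{\partial u}{\partial y} = 5$, and the finite-difference estimates should agree with these up to rounding error.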