Intuition on Directional derivative

By the chain rule,

$$\frac{d f(x+ta,y+tb)}{dt}=a\frac{\partial f(x+ta,y+tb)}{\partial x}+b\frac{\partial f(x+ta,y+tb)}{\partial y}.$$

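Setting $t=0$ evaluates the partial derivatives at $(x,y)$ itself, so the rate of change of $f$ at $(x,y)$ in the direction $(a,b)$ is $a\frac{\partial f}{\partial x}+b\frac{\partial f}{\partial y}.$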
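As a rough numerical check of the chain-rule identity, here is a minimal sketch. The function `f`, the point `(x0, y0)` and the direction `(a, b)` are all hypothetical choices made only for illustration; the partial derivatives are worked out by hand for that particular `f`.

```python
import math

def f(x, y):
    # Hypothetical example function, chosen only for illustration.
    return x**2 * y + math.sin(y)

def partials(x, y):
    # Partial derivatives of the example f above, computed by hand.
    return 2 * x * y, x**2 + math.cos(y)

def rate_along_line(x, y, a, b, t=1e-6):
    # Central-difference estimate of d/dt f(x + t*a, y + t*b) at t = 0.
    return (f(x + t * a, y + t * b) - f(x - t * a, y - t * b)) / (2 * t)

x0, y0 = 1.0, 2.0   # assumed base point
a, b = 0.6, 0.8     # assumed direction

fx, fy = partials(x0, y0)
chain_rule_value = a * fx + b * fy
numeric_value = rate_along_line(x0, y0, a, b)

print(chain_rule_value, numeric_value)
```

The two printed numbers should agree to several decimal places, which is what the chain-rule formula at $t=0$ predicts.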


Given a function $f$ of one or more variables, if you pick an input $\mathbf x_0$ for the function $f$ (where we write $\mathbf x_0$ in bold face to indicate that it can be a vector of several variables), you can then define another function that is the change in $f$ as the input of $f$ changes away from $\mathbf x_0.$ That is, you can define a function that takes $\mathbf h,$ the amount by which we change the input, and produces an output value by the rule $$ \mathbf h \to f(\mathbf x_0 + \mathbf h) - f(\mathbf x_0).$$

The function $f$ is differentiable at $\mathbf x_0$ if you can use a linear function of $\mathbf h$ to approximate this "difference function" arbitrarily well for small enough $\mathbf h,$ that is, if the error of the approximation shrinks faster than the size of $\mathbf h$ as $\mathbf h$ goes to zero.

The derivative of $f$ is the "multiplier" of that linear function.
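To see the "linear approximation" idea in numbers, here is a small single-variable sketch. The function $f(x) = x^3$ and the point $x_0 = 2$ are assumptions picked for illustration; the multiplier is then $f'(x_0) = 3x_0^2 = 12.$ The code compares the actual change in $f$ with the linear prediction and divides the error by the size of the change.

```python
def f(x):
    # Hypothetical example function for illustration.
    return x**3

x0 = 2.0
multiplier = 3 * x0**2  # derivative of x**3 at x0, computed by hand

def relative_error(h):
    # Error of the linear approximation, relative to the size of the change h.
    actual_change = f(x0 + h) - f(x0)
    linear_change = multiplier * h
    return abs(actual_change - linear_change) / abs(h)

ratios = [relative_error(h) for h in (1e-1, 1e-2, 1e-3)]
for h, r in zip((1e-1, 1e-2, 1e-3), ratios):
    print(h, r)
```

The ratio keeps shrinking as `h` shrinks, which is the sense in which the linear function approximates the difference function better and better for small changes.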

If $f$ is a single-variable function, you can plot the graph $y = f(x)$ in two dimensions. If you can put a line tangent to that graph at the point $(x=x_0, y = f(x_0)),$ it gives you a linear approximation of how much $f(x)$ varies as $x$ varies around $x_0,$ and the derivative of $f$ at that point, $f'(x_0),$ is the slope of the line.

For a function of two variables you can plot the three-dimensional graph $z = f(x,y),$ and if you can put a plane tangent to that graph at the point $(x=x_0, y = y_0, z = f(x_0,y_0)),$ you again have a linear approximation of how much the value of $f$ changes as its input changes. But the "slope" of this plane cannot be fully described by a single number.

One way to describe the slope of the plane is by specifying the direction in which the plane is tilted and the slope $m$ of the plane in that direction. If you travel a distance $h$ in that direction the plane rises $mh.$ It falls an equal amount in the opposite direction, but if you travel perpendicular to that direction on the plane you don't rise or fall at all.

But since the plane rises or falls in a linear fashion depending on which direction you go and how far, you can measure its slope in the $x$ and $y$ directions and use those two numbers to find how much you rise or fall by traveling anywhere on the plane. The slope in the $x$ direction is $\frac{\partial f}{\partial x}$ and it tells you that if you travel from $(x_0,y_0)$ to $(x_0+h_x,y_0),$ the plane rises by $\frac{\partial f}{\partial x}h_x.$ To the extent that the plane is a good approximation of $f$ in the neighborhood of $(x_0,y_0),$ we can say $f(x_0,y_0) + \frac{\partial f}{\partial x}h_x$ is a good approximation of $f(x_0+h_x,y_0).$

Likewise, the slope in the $y$ direction is $\frac{\partial f}{\partial y}$; if you travel from $(x_0,y_0)$ to $(x_0,y_0+h_y),$ the plane rises by $\frac{\partial f}{\partial y}h_y$; and $f(x_0,y_0) + \frac{\partial f}{\partial y}h_y$ is an approximation of $f(x_0,y_0+h_y).$

What happens if you both travel $h_x$ in the $x$ direction and $h_y$ in the $y$ direction? Since the plane is the plot of a linear function, the change in height is just the sum of what you would get by going only in the $x$ direction and what you get by going only in the $y$ direction, that is, you reach the height $$f(x_0,y_0) + \frac{\partial f}{\partial x}h_x + \frac{\partial f}{\partial y}h_y,$$ which is an approximation of $f(x_0+h_x,y_0+h_y).$
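The additive rule for the tangent plane can be tried out numerically. The sketch below assumes a hypothetical function `f`, a base point `(x0, y0)` and small steps `hx`, `hy`, none of which come from the question; the slopes `fx` and `fy` are the partial derivatives of that `f`, computed by hand.

```python
import math

def f(x, y):
    # Hypothetical example function, chosen only for illustration.
    return x**2 * y + math.sin(y)

x0, y0 = 1.0, 2.0             # assumed base point
fx = 2 * x0 * y0              # slope in the x direction at (x0, y0)
fy = x0**2 + math.cos(y0)     # slope in the y direction at (x0, y0)

def plane_height(hx, hy):
    # Height of the tangent plane after moving hx in x and hy in y.
    return f(x0, y0) + fx * hx + fy * hy

hx, hy = 0.01, -0.02          # assumed small steps

rise_x_only = plane_height(hx, 0.0) - f(x0, y0)
rise_y_only = plane_height(0.0, hy) - f(x0, y0)
rise_both = plane_height(hx, hy) - f(x0, y0)

approx = plane_height(hx, hy)
actual = f(x0 + hx, y0 + hy)

print(rise_x_only + rise_y_only, rise_both)
print(approx, actual)
```

The first line of output shows that the rises simply add on the plane; the second shows that the plane's height stays close to the true value of $f$ for small steps.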

Your directional vector $\vec v = a \hat\imath + b \hat\jmath$ says you travel a distance $h_x = a$ in the $x$ direction and $h_y = b$ in the $y$ direction, so the formula above gives an increase equal to $$\frac{\partial f}{\partial x}a + \frac{\partial f}{\partial y}b.$$

But when we say that $\vec v$ is a directional vector, we usually have in mind a unit vector, that is, $a^2 + b^2 = 1.$ Your intuition that a "nudge" of $\partial x$ in the $x$ direction and a "nudge" of $\partial y$ in the $y$ direction add up to a "nudge" of $\sqrt{\partial x^2 +\partial y^2}$ is an accurate description of the magnitude of the combined "nudge," but it doesn't say anything about the direction of the "nudge." As the direction of the "nudge" gets closer to the $x$ direction, the $x$-direction slope $\frac{\partial f}{\partial x}$ becomes more important and the $y$-direction slope $\frac{\partial f}{\partial y}$ becomes less important to the change in height, and vice versa as the direction of the "nudge" gets closer to the $y$ direction. The linear function $a\frac{\partial f}{\partial x} + b\frac{\partial f}{\partial y}$ takes those relative influences into account.
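One way to watch those relative influences shift is to write the unit vector as $(a,b) = (\cos\theta, \sin\theta)$ and vary the angle. In the sketch below the two slopes `fx` and `fy` are made-up numbers standing in for $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ at some point; only the weighting by $a$ and $b$ is being illustrated.

```python
import math

fx, fy = 4.0, 0.5   # hypothetical slopes in the x and y directions

def directional_slope(theta):
    # Unit direction (a, b) = (cos(theta), sin(theta)), so a**2 + b**2 = 1.
    a, b = math.cos(theta), math.sin(theta)
    return a * fx + b * fy

slopes = {deg: directional_slope(math.radians(deg)) for deg in (0, 30, 60, 90)}
for deg, slope in slopes.items():
    print(deg, slope)
```

At 0 degrees the result is just the $x$-direction slope, at 90 degrees it is just the $y$-direction slope, and in between the two slopes are blended according to how much of the unit step lies along each axis.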