The meaning of $\lambda$ in Lagrange Multipliers

Suppose you want to maximize $z=f(x,y)$ subject to the constraint $g(x,y)=c$. You've used the method of Lagrange multipliers to find the maximum $M$ and along the way have computed the Lagrange multiplier $\lambda$. Then $\lambda={dM\over dc}$, i.e. $\lambda$ is the rate of change of the maximum value with respect to $c$.

Said another way, you can think of $\lambda$ as approximately the change in $M$ that results from a one-unit increase in $c$.


Elaborating further: Optimizing $f(x,y)$ subject to $g(x,y)=c$ via Lagrange multipliers leads to $\nabla f=\lambda \nabla g$. Let $L(x,y;\lambda):=f(x,y)+\lambda(c-g(x,y))$. Then the constrained optimization problem can be cast as $\nabla L=0$, where the gradient is taken with respect to $x$, $y$, and $\lambda$.

From this perspective, ${\partial L\over \partial c}=\lambda$, i.e. $\lambda$ is the rate of change of the quantity being optimized, $L$ (which equals $f$ wherever the constraint holds), with respect to the constraint value, $c$.


The optimal value of $f(x,y)$ subject to the constraint $g(x,y)=c$ depends on the value of $c$. If you vary $c$, the optimal value of $f$ and its associated $\lambda$ will vary too. For a given $c$, let $P(c)$ be the point at which $f$ is optimal, $f(P(c))$ the optimal value, and $\lambda(c)$ the associated multiplier. How does $f(P(c))$ change as $c$ changes? The chain rule gives us the answer. Assuming that all the functions are differentiable, $\frac{df(P(c))}{dc}=\nabla f(P(c)) \cdot P'(c)$. By the Lagrange multiplier method we have $\nabla f(P(c))=\lambda(c)\nabla g(P(c))$. Hence we get $\frac{df(P(c))}{dc}=\nabla f(P(c)) \cdot P'(c) =\lambda(c) \nabla g(P(c))\cdot P'(c)$. Again by the chain rule, and because $g(P(c))=c$ for every $c$, $\nabla g(P(c))\cdot P'(c)=\frac{dg(P(c))}{dc}=\frac{d(c)}{dc}=1$. And so $\frac{df(P(c))}{dc}=\lambda(c)$. That is, as John D. says, $\lambda(c)$ is the rate at which the optimal value of $f$ varies as $c$ varies.
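If it helps to see the identity numerically, here is a minimal sketch. The choices $f(x,y)=xy$ and $g(x,y)=x+y$ are my own illustration, not part of the question, and the helper name `max_on_constraint` is hypothetical. The code finds the constrained maximum by a simple ternary search along the constraint line, reads off $\lambda$ from $\nabla f=\lambda\nabla g$, and compares it with a finite-difference estimate of $dM/dc$.

```python
def f(x, y):
    # Illustrative objective (an assumption, not from the question).
    return x * y


def max_on_constraint(c, iters=200):
    """Maximise f on the line x + y = c (assumes c > 0).

    Substituting y = c - x turns the problem into a 1-D search for the
    maximum of x * (c - x), which is concave on [0, c], so ternary
    search converges to it.
    """
    lo, hi = 0.0, c
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if f(a, c - a) < f(b, c - b):
            lo = a
        else:
            hi = b
    x = 0.5 * (lo + hi)
    y = c - x
    return x, y, f(x, y)


c = 6.0
x, y, M = max_on_constraint(c)

# grad f = (y, x) and grad g = (1, 1), so grad f = lam * grad g gives lam = y.
lam = y

# Central finite difference for dM/dc.
h = 1e-4
dM_dc = (max_on_constraint(c + h)[2] - max_on_constraint(c - h)[2]) / (2 * h)

print(f"optimum at ({x:.4f}, {y:.4f}), M = {M:.4f}")
print(f"lambda = {lam:.4f}, finite-difference dM/dc = {dM_dc:.4f}")
```

For this particular $f$ and $g$ the answer can also be checked by hand: $x=y=c/2$, $M=c^2/4$, and $\lambda=c/2=dM/dc$.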


I think lambda represents the marginal utility of the constraint. Suppose GPA is a function g(m,e) of the hours spent studying maths and economics, and the constraint is the total time t, i.e. m+e=t. Now L=g(m,e)+λ(t-m-e). In this case, λ represents the marginal utility of t, i.e. the additional GPA points that could be gained by increasing t by one hour.
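To make the GPA example concrete, here is a rough sketch with a made-up GPA function; the form 0.8·sqrt(m) + 0.4·sqrt(e) and the 20-hour budget are assumptions chosen only so the numbers come out clean, and `gpa` and `best_split` are hypothetical names. It computes the best split of study time, the multiplier λ, and the actual GPA gained from one extra hour.

```python
import math


def gpa(m, e):
    # Hypothetical GPA model with diminishing returns to study hours.
    return 0.8 * math.sqrt(m) + 0.4 * math.sqrt(e)


def best_split(t):
    """Best (m, e) with m + e = t, plus the multiplier lam.

    Setting dL/dm = dL/de = 0 for L = gpa(m, e) + lam * (t - m - e) gives
    0.4 / sqrt(m) = 0.2 / sqrt(e) = lam, hence m = 4e, so m = 0.8 t, e = 0.2 t.
    """
    m, e = 0.8 * t, 0.2 * t
    lam = 0.4 / math.sqrt(m)
    return m, e, lam


t = 20.0
m, e, lam = best_split(t)

# GPA actually gained by re-optimising with one more hour available.
m1, e1, _ = best_split(t + 1.0)
gain = gpa(m1, e1) - gpa(m, e)

print(f"hours: maths = {m:.1f}, economics = {e:.1f}, GPA = {gpa(m, e):.3f}")
print(f"lambda = {lam:.4f}, gain from one extra hour = {gain:.4f}")
```

Under these assumptions λ comes out at 0.1 while the true gain from the extra hour is slightly smaller, which is what you would expect: λ is the derivative with respect to t, so it only approximates the effect of a full one-hour change.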