Constrained optimization and Lagrange multipliers

This tutorial presents an introduction to optimization problems that involve finding a maximum or a minimum value of an objective function f(x_1,x_2,\ldots, x_n) subject to a constraint of the form g(x_1,x_2,\ldots, x_n)=k.

Maximum and minimum

Finding optimum values of the function f(x_1,x_2,\ldots, x_n) without a constraint is a well-known problem treated in calculus courses. One normally uses the gradient to find the stationary points, and then checks all stationary points and boundary points to determine the optimum values.

Example

  • f(x,y) = 2x^2 + y^2
  • f_x(x,y) = 4x = 0
  • f_y(x,y) = 2y = 0

f(x,y) has one stationary point at (0,0).
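
As a quick check, here is a minimal SymPy sketch (the choice of SymPy is an assumption; the tutorial does not prescribe a tool) that recovers this stationary point by setting the gradient to zero:

 import sympy as sp

 x, y = sp.symbols('x y', real=True)
 f = 2*x**2 + y**2

 # Set each partial derivative to zero and solve for the stationary points.
 grad = [sp.diff(f, v) for v in (x, y)]
 print(sp.solve(grad, (x, y), dict=True))  # [{x: 0, y: 0}]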

The Hessian

A common method of determining whether or not a function has an extreme value at a stationary point is to evaluate the Hessian [3] of the function at that point, where the Hessian is defined as

H(f)= \begin{bmatrix}
\frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \dots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\
\frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & \dots & \frac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \dots & \frac{\partial^2 f}{\partial x_n^2} \\
\end{bmatrix}.

Second derivative test

The second derivative test determines the nature of a stationary point x according to the following rules [2]:

  • If H(f) is positive definite at the point x, then f has a local minimum at x
  • If H(f) is negative definite at the point x, then f has a local maximum at x
  • If H(f) has both negative and positive eigenvalues, then x is a saddle point
  • Otherwise the test is inconclusive

In the above example,

 H(f)=\begin{bmatrix}
4 & 0\\
0& 2
\end{bmatrix}.

Since H(f) is positive definite (its eigenvalues 4 and 2 are both positive), f(x,y) has a local minimum at (0,0).
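
The eigenvalue check can also be done mechanically; a small sketch, again assuming SymPy, that evaluates the Hessian of the example at (0,0):

 import sympy as sp

 x, y = sp.symbols('x y', real=True)
 f = 2*x**2 + y**2

 # Hessian of f, evaluated at the stationary point (0, 0).
 H = sp.hessian(f, (x, y)).subs({x: 0, y: 0})
 print(H)              # Matrix([[4, 0], [0, 2]])
 print(H.eigenvals())  # {4: 1, 2: 1} -- all positive, so a local minimum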

Constrained maximum and minimum

When finding the extreme values of f(x_1,x_2,\dots, x_n) subject to a constraint g(x_1,x_2,\dots, x_n)=k, the stationary points found above no longer apply. The new problem can be thought of as finding the extreme values of f(x_1,x_2,\dots, x_n) when the point (x_1,x_2,\dots, x_n) is restricted to lie on the surface g(x_1,x_2,\dots, x_n)=k. The value of f is maximized (or minimized) where a level surface of f touches the constraint surface, i.e., where the two surfaces have a common tangent. At such a point the gradient vectors of the two surfaces are parallel, hence

\nabla f(x_1,x_2,\dots, x_n)= \lambda \nabla g(x_1,x_2,\dots, x_n).\,

The number λ in the equation is known as the Lagrange multiplier.
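
For instance, with f(x,y) = 2x^2 + y^2 and g(x,y) = x + y (the example worked out below), this condition reads (4x, 2y) = \lambda (1, 1).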

Lagrange multiplier method

The Lagrange multiplier method solves the constrained optimization problem by transforming it into an unconstrained optimization problem of the form:

  • L(x_1,x_2,\ldots, x_n,\lambda)= f(x_1,x_2,\ldots, x_n)+\lambda (k-g(x_1,x_2,\ldots, x_n))

Finding the gradient and Hessian of L(x_1,x_2,\ldots, x_n,\lambda), as was done above, will then determine any optimum values.
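
As a sketch of this transformation (using the example worked out below for concreteness, and assuming SymPy as the symbolic tool), one can build L and solve \nabla L = 0 directly:

 import sympy as sp

 x, y, lam = sp.symbols('x y lambda', real=True)
 f = 2*x**2 + y**2   # objective
 g = x + y           # constraint function, with g = k
 k = 1

 # Lagrangian L = f + lambda*(k - g); set all partial derivatives to zero.
 L = f + lam*(k - g)
 grad_L = [sp.diff(L, v) for v in (x, y, lam)]
 print(sp.solve(grad_L, (x, y, lam), dict=True))  # [{x: 1/3, y: 2/3, lambda: 4/3}]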

Suppose we now want to find optimum values for f(x,y) = 2x^2 + y^2 subject to x + y = 1 (example from [2]).

The Lagrangian method then yields the unconstrained function

  • L(x,y,\lambda)= 2x^2 + y^2 + \lambda (1 - x - y)

The gradient for this new function is

  • \frac{\partial L}{\partial x}(x,y,\lambda)= 4x+\lambda (-1)=0
  • \frac{\partial L}{\partial y}(x,y,\lambda)= 2y+\lambda (-1)=0
  • \frac{\partial L}{\partial \lambda}(x,y,\lambda)=1-x-y=0

The stationary points can be obtained by writing the above equations in matrix form:

 \begin{bmatrix}
4 & 0 & -1 \\
0& 2 & -1 \\
1 & 1 & 0
\end{bmatrix} \begin{bmatrix}
x\\
y \\
\lambda \end{bmatrix}= \begin{bmatrix}
0\\
0\\
1
\end{bmatrix}

This results in x = 1/3, y = 2/3, λ = 4/3.
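
The same linear system can of course be solved numerically; a small NumPy sketch (NumPy being an assumption, not part of the original notes):

 import numpy as np

 # Coefficient matrix and right-hand side of the system above.
 A = np.array([[4.0, 0.0, -1.0],
               [0.0, 2.0, -1.0],
               [1.0, 1.0,  0.0]])
 b = np.array([0.0, 0.0, 1.0])
 x, y, lam = np.linalg.solve(A, b)
 print(x, y, lam)  # 0.333..., 0.666..., 1.333...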

Next we can use the Hessian, as before, to determine the type of this stationary point. Because the constraint is linear, the second partial derivatives of L with respect to x and y are just those of f:

 H=
 \begin{bmatrix}
4 & 0 \\
0 & 2
\end{bmatrix}

Since this Hessian is positive definite, the solution (1/3, 2/3, 4/3) minimizes f(x,y) = 2x^2 + y^2 subject to x + y = 1, with f(1/3, 2/3) = 2/3.
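
As a final sanity check, a hedged sketch using scipy.optimize.minimize (SciPy is an assumption; any constrained solver would do) recovers the same constrained minimum:

 import numpy as np
 from scipy.optimize import minimize

 f = lambda v: 2*v[0]**2 + v[1]**2
 constraint = {'type': 'eq', 'fun': lambda v: v[0] + v[1] - 1}

 res = minimize(f, x0=np.array([0.5, 0.5]), constraints=[constraint])
 print(res.x)    # approximately [1/3, 2/3]
 print(res.fun)  # approximately 2/3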

References

[1] T. K. Moon and W. C. Stirling. Mathematical Methods and Algorithms for Signal Processing. Prentice Hall, 2000.
[2] http://mat.gsia.cmu.edu/QUANT/NOTES/chap4/node1.html
[3] http://www.ece.tamu.edu/~chmbrlnd/Courses/ECEN601/ECEN601-Chap3.pdf