Subgradient methods are algorithms for solving convex optimization problems. Originally developed by Naum Z. Shor and others in the 1960s and 1970s, subgradient methods can be used with a non-differentiable objective function. When the objective function is differentiable, subgradient methods for unconstrained problems use the same search direction as the method of steepest descent.
Although subgradient methods can be much slower than interior-point methods and Newton's method in practice, they can be immediately applied to a far wider variety of problems and require much less memory. Moreover, by combining the subgradient method with primal or dual decomposition techniques, it is sometimes possible to develop a simple distributed algorithm for a problem.
Basic subgradient update
Let $f : \mathbb{R}^n \to \mathbb{R}$ be a convex function with domain $\mathbb{R}^n$. The subgradient method uses the iteration

$$x^{(k+1)} = x^{(k)} - \alpha_k g^{(k)},$$

where $g^{(k)}$ denotes a subgradient of $f$ at $x^{(k)}$ and $\alpha_k > 0$ is the $k$-th step size. If $f$ is differentiable, its only subgradient is the gradient vector $\nabla f$ itself. It may happen that $-g^{(k)}$ is not a descent direction for $f$ at $x^{(k)}$. We therefore maintain a list $f_{\rm best}$ that keeps track of the lowest objective function value found so far, i.e.

$$f_{\rm best}^{(k)} = \min\left\{ f_{\rm best}^{(k-1)},\, f\!\left(x^{(k)}\right) \right\}.$$
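A minimal Python sketch of this iteration, assuming a user-supplied subgradient oracle; the objective $f(x) = |x_1| + 2|x_2|$, the starting point, and all function names are illustrative choices rather than anything prescribed by the text above.

```python
import numpy as np

def subgradient_method(f, subgrad, x0, step_sizes):
    """Basic iteration x^(k+1) = x^(k) - alpha_k g^(k), keeping the best value seen."""
    x = np.asarray(x0, dtype=float)
    x_best, f_best = x.copy(), f(x)
    for alpha in step_sizes:
        g = subgrad(x)                        # any subgradient of f at x
        x = x - alpha * g                     # the subgradient update
        if f(x) < f_best:                     # -g need not be a descent direction,
            x_best, f_best = x.copy(), f(x)   # so track the lowest value found so far
    return x_best, f_best

# Illustrative problem: minimize the nondifferentiable convex f(x) = |x_1| + 2|x_2|
f = lambda x: abs(x[0]) + 2 * abs(x[1])
subgrad = lambda x: np.array([np.sign(x[0]), 2 * np.sign(x[1])])   # one valid subgradient
x_best, f_best = subgradient_method(
    f, subgrad, x0=[3.0, -2.0],
    step_sizes=[1.0 / (k + 1) for k in range(500)],   # a diminishing step size sequence
)
print(x_best, f_best)   # x_best should end up near the minimizer (0, 0)
```

Because the negative subgradient is not guaranteed to be a descent direction, the sketch returns the best point seen rather than the final iterate.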
Step size rules
Many different types of step size rules are used in the subgradient method. Five basic step size rules for which convergence is guaranteed are:
*Constant step size, $\alpha_k = \alpha$ (independent of $k$).
*Constant step length, $\alpha_k = \gamma / \lVert g^{(k)} \rVert_2$, which gives $\lVert x^{(k+1)} - x^{(k)} \rVert_2 = \gamma$.
*Square summable but not summable step size, i.e. any step sizes satisfying $$\alpha_k \ge 0, \qquad \sum_{k=1}^{\infty} \alpha_k^2 < \infty, \qquad \sum_{k=1}^{\infty} \alpha_k = \infty.$$
*Nonsummable diminishing, i.e. any step sizes satisfying $$\alpha_k \ge 0, \qquad \lim_{k \to \infty} \alpha_k = 0, \qquad \sum_{k=1}^{\infty} \alpha_k = \infty.$$
*Nonsummable diminishing step lengths, i.e. $\alpha_k = \gamma_k / \lVert g^{(k)} \rVert_2$, where $$\gamma_k \ge 0, \qquad \lim_{k \to \infty} \gamma_k = 0, \qquad \sum_{k=1}^{\infty} \gamma_k = \infty.$$

Notice that the step sizes listed above are determined before the algorithm is run and do not depend on any data computed during the algorithm. This is very different from the step size rules found in standard descent methods, which depend on the current point and search direction. One concrete instance of each rule appears in the sketch below.
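As a sketch (the constants $0.1$ and $1.0$, and the function names, are arbitrary illustrative choices, not from the text), each rule can be written as a function of the iteration counter $k$ and, where a step length is prescribed, the current subgradient $g^{(k)}$:

```python
import numpy as np

# One concrete instance of each step size rule.  k is the 0-based iteration
# counter; g is the current subgradient (only the step length rules use it).

def constant_step_size(k, g, alpha=0.1):
    return alpha                                   # alpha_k = alpha

def constant_step_length(k, g, gamma=0.1):
    return gamma / np.linalg.norm(g)               # gives ||x^(k+1) - x^(k)|| = gamma

def square_summable_not_summable(k, g, a=1.0):
    return a / (k + 1)                             # sum alpha_k^2 < inf, sum alpha_k = inf

def nonsummable_diminishing(k, g, a=1.0):
    return a / np.sqrt(k + 1)                      # alpha_k -> 0, sum alpha_k = inf

def nonsummable_diminishing_step_length(k, g, a=1.0):
    gamma_k = a / np.sqrt(k + 1)                   # gamma_k -> 0, sum gamma_k = inf
    return gamma_k / np.linalg.norm(g)
```

A driver loop for the iteration sketched earlier would simply call the chosen rule once per iteration to obtain $\alpha_k$.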
Convergence results
For constant step size and constant step length, the subgradient algorithm is guaranteed to converge to within some range of the optimal value, i.e.,

$$\lim_{k \to \infty} f_{\rm best}^{(k)} - f^{*} < \epsilon,$$

where $f^{*}$ denotes the optimal value of the problem and $\epsilon$ is a constant that shrinks as the step size (or step length) is made smaller.
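A small numerical illustration of this behaviour, using the assumed objective $f(x) = |x|$ (with $f^{*} = 0$), an arbitrary starting point, and a few constant step sizes; none of these specifics come from the text:

```python
import math

# Minimize f(x) = |x| with a constant step size.  The best value found
# stalls at a level that shrinks with the step size, matching the
# "convergence to within a range of the optimum" guarantee above.
f = lambda x: abs(x)
subgrad = lambda x: (x > 0) - (x < 0)    # a subgradient of |x| (0 at x = 0)

for alpha in (0.5, 0.1, 0.02):
    x, f_best = math.pi, f(math.pi)      # arbitrary starting point
    for _ in range(5000):
        x -= alpha * subgrad(x)          # constant step size: alpha_k = alpha
        f_best = min(f_best, f(x))
    print(f"alpha = {alpha:4.2f}   f_best - f* = {f_best:.4f}")
```

In each case the residual gap stays below $\alpha/2$, which is the classical asymptotic bound $G^2 \alpha / 2$ for this problem (the subgradient norms are bounded by $G = 1$), and it shrinks as the step size is reduced.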