BFGS method

BFGS method

In mathematics, the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method is a method to solve an unconstrained nonlinear optimization problem.

The BFGS method is derived from the Newton's method in optimization, a class of hill-climbing optimization techniques that seeks the stationary point of a function, where the gradient is 0. Newton's method assumes that the function can be locally approximated as a quadratic in the region around the optimum,and use the first and second derivatives to find the stationary point. In Quasi-Newton methods the Hessian matrix of second derivatives of the function to be minimized does not need to be computed at any stage. The Hessian is updated by analyzing successive gradient vectors instead. Quasi-Newton methods are a generalization of the secant method to find the root of the first derivative for multidimensional problems. In multi-dimensions the secant equation is under-determined, and quasi-Newton methods differ in how they constrain the solution. The BFGS method is one of the most successful members of this class Fact|date=December 2007.

Rationale

The search direction mathbf{p}_k at stage "k" is given by the solution of the analogue of the Newton equation

: B_k mathbf{p}_k = - abla f(mathbf{x}_k).

A line search in the direction mathbf{p}_k is then used to find the next point mathbf{x}_{k+1}.

Instead of requiring the full Hessian matrix at the point mathbf{x}_{k+1} to be computed as B_{k+1}, the approximate Hessian at stage "k" is updated by the addition of two matrices. :B_{k+1}=B_k+U_k+V_kBoth U_k and V_k are rank-one matrices but have different bases. The rank one assumption here means that we may write:C=mathbf{a}mathbf{b}^TSo equivalently, U_k and V_k construct a rank-two update matrix which is robust against the scale problem often suffered in the gradient descent searching.

(as in Broyden's method, the multidimensional analogue of the secant method). The quasi-Newton condition imposed on this update is

:B_{k+1} (mathbf{x}_{k+1}-mathbf{x}_k ) = abla f(mathbf{x}_{k+1}) - abla f(mathbf{x}_k ).

Algorithm

From an initial guess mathbf{x}_0 and an approximate Hessian matrix B_0 the following steps are repeated until x converges to the solution.

# Obtain mathbf{s}_k by solving: B_k mathbf{s}_k = - abla f(mathbf{x}_k).
# Perform a line search to find the optimal alpha_k in the direction found in the first step, then update mathbf{x}_{k+1} = mathbf{x}_k + alpha_kmathbf{s}_k.
# mathbf{y}_k = { abla f(mathbf{x}_{k+1}) - abla f(mathbf{x}_k)}.
# B_{k+1} = B_k + frac{mathbf{y}_k mathbf{y}_k^{ op{mathbf{y}_k^{ op} mathbf{s}_k} - frac{B_k mathbf{s}_k (B_k mathbf{s}_k)^{ op{mathbf{s}_k^{ op} B_k mathbf{s}_k}.

f(mathbf{x}) denotes the objective function to be minimized. Convergence can be checked by observing the norm of the gradient, left| abla f(mathbf{x}_k) ight|. Practically, B_0 can be initialized with B_0 = I, so that the first step will be equivalent to a gradient descent, but further steps are more and more refined by B_{k}, the approximation to the Hessian.

The first step of the algorithm is carried out using an approximate inverse of the matrix B_k, which is usually obtained efficiently by applying the Sherman–Morrison formula to the fourth line of the algorithm, giving

: B_{k+1}^{-1} = B_k^{-1} + frac{(mathbf{s}_k mathbf{s}_k^{ op}) (mathbf{s}_k^{ op}mathbf{y}_k+mathbf{y}_k^{ op} B_k^{-1} mathbf{y}_k)}{(mathbf{s}_k^{ op} mathbf{y}_k)^2} - frac{B_k^{-1} mathbf{y}_k mathbf{s}_k^{ op} + mathbf{s}_k mathbf{y}_k^{ op}B_k^{-1{mathbf{s}_k^{ op} mathbf{y}_k}.

Credible intervals or confidence intervals for the solution can be obtained from the inverse of the final Hessian matrix.

ee also

* L-BFGS method
* Newton's method in optimization

Bibliography

* Broyden, C. G., "The Convergence of a Class of Double-rank Minimization Algorithms" ,"Journal of the Institute of Mathematics and Its Applications" 1970, "6", 76-90
* Fletcher, R., " A New Approach to Variable Metric Algorithms", "Computer Journal" 1970, "13", 317-322
* Goldfarb, D., "A Family of Variable Metric Updates Derived by Variational Means", "Mathematics of Computation" 1970, "24", 23-26
* Shanno, D. F.,"Conditioning of Quasi-Newton Methods for Function Minimization" , "Mathematics of Computation" 1970, "24", 647-656
* Avriel, Mordecai 2003. "Nonlinear Programming: Analysis and Methods." Dover Publishing. ISBN 0-486-43227-0.


Wikimedia Foundation. 2010.

Игры ⚽ Поможем написать реферат

Look at other dictionaries:

  • BFGS — En mathématiques, la méthode de Broyden Fletcher Goldfarb Shanno (BFGS) est une méthode permettant de résoudre un problème d optimisation non linéaire sans contraintes. La méthode BFGS est souvent implémentée comme un algorithme à directions de… …   Wikipédia en Français

  • L-BFGS — and L BFGS B are software packages for solving nonlinear optimization problems. They are designed for large scale applications in which the Hessian matrix is not available or is expensive to compute. To accelerate convergence, the two codes… …   Wikipedia

  • Nonlinear conjugate gradient method — In numerical optimization, the nonlinear conjugate gradient method generalizes the conjugate gradient method to nonlinear optimization. For a quadratic function : The minimum of f is obtained when the gradient is 0: . Whereas linear conjugate… …   Wikipedia

  • Nelder–Mead method — Nelder–Mead simplex search over the Rosenbrock banana function (above) and Himmelblau s function (below) See simplex algorithm for Dantzig s algorithm for the problem of linear opti …   Wikipedia

  • Quasi-Newton method — In optimization, quasi Newton methods (also known as variable metric methods) are well known algorithms for finding local maxima and minima of functions. Quasi Newton methods are based on Newton s method to find the stationary point of a function …   Wikipedia

  • Nelder-Mead method — See simplex algorithm for the numerical solution of the linear programming problem. The Nelder Mead method or downhill simplex method or amoeba method is a commonly used nonlinear optimization algorithm. It is due to John Nelder R. Mead (1965)… …   Wikipedia

  • L-BFGS — y L BFGS B son dos métodos de optimización quasi Newton de funciones con un gran número de parámetros o de una gran complejidad. Se trata de un método que hace un uso limitado de la memoria (usa mucha menos memoria que otros algoritmos para el… …   Wikipedia Español

  • Newton's method — In numerical analysis, Newton s method (also known as the Newton–Raphson method), named after Isaac Newton and Joseph Raphson, is a method for finding successively better approximations to the roots (or zeroes) of a real valued function. The… …   Wikipedia

  • Iterative method — In computational mathematics, an iterative method is a mathematical procedure that generates a sequence of improving approximate solutions for a class of problems. A specific implementation of an iterative method, including the termination… …   Wikipedia

  • Newton's method in optimization — A comparison of gradient descent (green) and Newton s method (red) for minimizing a function (with small step sizes). Newton s method uses curvature information to take a more direct route. In mathematics, Newton s method is an iterative method… …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”