# Quasi-Newton method

Quasi-Newton method

In optimization, quasi-Newton methods (also known as variable metric methods) are well-known algorithms for finding local maxima and minima of functions. Quasi-Newton methods are based on
Newton's method to find the stationary point of a function, where the gradient is 0. Newton's method assumes that the function can be locally approximated as a quadratic in the region around the optimum, and use the first and second derivatives (gradient and Hessian) to find the stationary point. In Quasi-Newton methods the Hessian matrix of second derivatives of the function to be minimized does not need to be computed. The Hessian is updated by analyzing successive gradient vectors instead.
Quasi-Newton methods are a generalization of the secant method to find the root of the first derivative for multidimensional problems. In multi-dimensions the secant equation is under-determined, and quasi-Newton methodsdiffer in how they constrain the solution, typically by adding a simple low-rank update to the current estimate of the Hessian.

The first quasi-Newton algorithm was proposed by W.C. Davidon, a physicist working at Argonne National Laboratory. He developed the first quasi-Newton algorithm in 1959: the DFP updating formula, which was later popularized by Fletcher and Powell in 1963, but is rarely used today. The most common quasi-Newton algorithms are currently the SR1 formula (for symmetric rank one) and the widespread BFGS method, that was suggested independently by Broyden, Fletcher, Goldfarb, and Shanno, in 1970. The Broyden's class is a linear combination of the DFP and BFGS methods.

The SR1 formula does not guarantee the update matrix to maintain positive-definiteness and can be used for indefinite problems. The Broyden's method does not require the update matrix to be symmetric and it is used to find the root of a general system of equations (rather than the gradient) by updating the Jacobian (rather than the Hessian).

Description of the method

As in the Newton's method, one uses the second order approximation to find the minimum of a function $f\left(x\right)$.The Taylor series of $f\left(x\right)$ is:::$f\left(x_0+Delta x\right)=f\left(x_0\right)+ abla f\left(x_0\right)^T Delta x+frac\left\{1\right\}\left\{2\right\} Delta x^T \left\{B\right\} Delta x$,where ($abla f$) is the gradient and $B$ the Hessian matrix.The Taylor series of the gradient itself:::$abla f\left(x_0+Delta x\right)= abla f\left(x_0\right)+B Delta x$,is called "secant equation". Solving for $abla f\left(x_0+Delta x_0\right)=0$ provides the Newton step:::$Delta x_0=-B^\left\{-1\right\} abla f\left(x_0\right)$,but $B$ is unknown. In one dimension, solving for $B$ and applying the Newton's step with the updated value is equivalent to the secant method. In multidimensions $B$ is under determined. Various methods are used to find the solution to the secant equation that is symmetric ($B^T=B$) and closest to the current approximate value $B_k$ according to some metric $min_B ||B-B_k||$. An approximate initial value of $B_0=I$ is often sufficient to achieverapid convergence. The unknown $x_k$ is updated applying the Newton's step calculatedusing the current approximate Hessian matrix $B_\left\{k\right\}$
* $Delta x_k=- alpha_k B_k^\left\{-1\right\} abla f\left(x_k\right)$, with $alpha$ chosen to satisfy the Wolfe conditions;
* $x_\left\{k+1\right\}=x_\left\{k\right\}+Delta x_k$;
*The gradient computed at the new point $abla f\left(x_\left\{k+1\right\}\right)$, and:$y_k= abla f\left(x_\left\{k+1\right\}\right)- abla f\left(x_\left\{k\right\}\right)$,
*is used to update the Hessian $displaystyle B_\left\{k+1\right\}$, or directly its inverse $displaystyle H_\left\{k+1\right\}=B_\left\{k+1\right\}^\left\{-1\right\}$ using the Sherman-Morrison formula.

The most popular update formulas are:

ee also

* Newton's method in optimization
* Newton's method
* DFP updating formula
* BFGS method
* SR1 formula
* Broyden's Method

References

* Eventually W.C. Davidon's paper published. William C. Davidon, [http://link.aip.org/link/?SJE/1/1/1 Variable Metric Method for Minimization] , SIOPT Volume 1 Issue 1, Pages 1-17, 1991.
* Nocedal, Jorge & Wright, Stephen J. (1999). Numerical Optimization. Springer-Verlag. ISBN 0-387-98793-2.
* Edwin K.P.Chong and Stanislaw H.Zak, An Introduction to Optimization 2ed, John Wiley & Sons Pte. Ltd. August 2001.

Wikimedia Foundation. 2010.

### Look at other dictionaries:

• Quasi-Newton-Verfahren — sind eine Klasse von numerischen Verfahren zur Lösung nichtlinearer Minimierungsprobleme. Die Verfahren basieren auf dem Newton Verfahren, berechnen die Inverse der Hesse Matrix jedoch nicht direkt, sondern nähern sie lediglich an, um den… …   Deutsch Wikipedia

• Newton's method in optimization — A comparison of gradient descent (green) and Newton s method (red) for minimizing a function (with small step sizes). Newton s method uses curvature information to take a more direct route. In mathematics, Newton s method is an iterative method… …   Wikipedia

• Newton's method — In numerical analysis, Newton s method (also known as the Newton–Raphson method), named after Isaac Newton and Joseph Raphson, is a method for finding successively better approximations to the roots (or zeroes) of a real valued function. The… …   Wikipedia

• Quasi-likelihood — In statistics, quasi likelihood estimation is one way of allowing for overdispersion, that is, greater variability in the data than would be expected from the statistical model used. It is most often used with models for count data or grouped… …   Wikipedia

• Gauss–Newton algorithm — The Gauss–Newton algorithm is a method used to solve non linear least squares problems. It can be seen as a modification of Newton s method for finding a minimum of a function. Unlike Newton s method, the Gauss–Newton algorithm can only be used… …   Wikipedia

• Broyden's method — In mathematics, Broyden s method is a quasi Newton method for the numerical solution of nonlinear equations in more than one variable. It was originally described by C. G. Broyden in 1965. [cite journal last = Broyden first = C. G. title = A… …   Wikipedia

• Newton's theorem of revolving orbits — Figure 1: An attractive force F(r) causes the blue planet to move on the cyan circle. The green planet moves three times faster and thus requires a stronger centripetal force, which is supplied by adding an attractive inverse cube force. The …   Wikipedia

• Method of moments (statistics) — See method of moments (probability theory) for an account of a technique for proving convergence in distribution. In statistics, the method of moments is a method of estimation of population parameters such as mean, variance, median, etc. (which… …   Wikipedia

• BFGS method — In mathematics, the Broyden Fletcher Goldfarb Shanno (BFGS) method is a method to solve an unconstrained nonlinear optimization problem. The BFGS method is derived from the Newton s method in optimization, a class of hill climbing optimization… …   Wikipedia

• Nonlinear conjugate gradient method — In numerical optimization, the nonlinear conjugate gradient method generalizes the conjugate gradient method to nonlinear optimization. For a quadratic function : The minimum of f is obtained when the gradient is 0: . Whereas linear conjugate… …   Wikipedia