Quasi-Newton method
In optimization, quasi-Newton methods (also known as variable metric methods) are well-known algorithms for finding local
maxima and minima of functions. Quasi-Newton methods are based on
Newton's method to find the stationary point of a function, where the gradient is 0. Newton's method assumes that the function can be locally approximated as a quadratic in the region around the optimum, and uses the first and second derivatives (gradient and Hessian) to find the stationary point. In quasi-Newton methods the Hessian matrix of second derivatives of the function to be minimized does not need to be computed; the Hessian is updated by analyzing successive gradient vectors instead. Quasi-Newton methods are a generalization of the secant method to finding the root of the first derivative for multidimensional problems. In multiple dimensions the secant equation is under-determined, and quasi-Newton methods differ in how they constrain the solution, typically by adding a simple low-rank update to the current estimate of the Hessian.
The first quasi-Newton algorithm was proposed by W. C. Davidon, a physicist working at Argonne National Laboratory. He developed the first quasi-Newton algorithm in 1959: the DFP updating formula, which was later popularized by Fletcher and Powell in 1963, but is rarely used today. The most common quasi-Newton algorithms are currently the SR1 formula (for symmetric rank one) and the widespread BFGS method, which was suggested independently by Broyden, Fletcher, Goldfarb, and Shanno in 1970. Broyden's class is a linear combination of the DFP and BFGS methods. The SR1 formula does not guarantee the update matrix to maintain positive-definiteness and can be used for indefinite problems.
Broyden's method does not require the update matrix to be symmetric and is used to find the root of a general system of equations (rather than the gradient) by updating the Jacobian (rather than the Hessian).
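This root-finding variant can be sketched in a few lines. The following is a minimal Python illustration of Broyden's ("good") rank-one Jacobian update applied to a small nonlinear system; the function name `broyden_solve`, its parameters, and the test system are illustrative choices, not taken from the article or any particular library.

```python
import numpy as np

def broyden_solve(F, x0, J0=None, tol=1e-10, max_iter=100):
    """Find a root of F(x) = 0 using Broyden's rank-one Jacobian update."""
    x = np.asarray(x0, dtype=float)
    # Start from an explicit Jacobian guess, or the identity if none is given.
    J = np.eye(len(x)) if J0 is None else np.asarray(J0, dtype=float)
    Fx = F(x)
    for _ in range(max_iter):
        dx = np.linalg.solve(J, -Fx)        # quasi-Newton step
        x_new = x + dx
        Fx_new = F(x_new)
        if np.linalg.norm(Fx_new) < tol:
            return x_new
        dF = Fx_new - Fx
        # Broyden update: J_{k+1} = J_k + (dF - J_k dx) dx^T / (dx^T dx)
        J += np.outer(dF - J @ dx, dx) / (dx @ dx)
        x, Fx = x_new, Fx_new
    return x

# Hypothetical test system: x0^2 + x1^2 = 4 and x0 - x1 = 1.
F = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, x[0] - x[1] - 1.0])
print(broyden_solve(F, x0=[2.0, 1.0]))
```

Note that, consistent with the remark above, the updated matrix is not kept symmetric, since a general Jacobian need not be.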
Description of the method
As in Newton's method, one uses a second-order approximation to find the minimum of a function $f(x)$.
The Taylor series of $f$ around an iterate $x_k$ is
\[ f(x_k + \Delta x) \approx f(x_k) + \nabla f(x_k)^{\mathrm T}\,\Delta x + \tfrac{1}{2}\,\Delta x^{\mathrm T} B\,\Delta x , \]
where $\nabla f(x_k)$ is the gradient and $B$ an approximation to the Hessian matrix. The Taylor series of the gradient itself,
\[ \nabla f(x_k + \Delta x) \approx \nabla f(x_k) + B\,\Delta x , \]
is called the "secant equation". Solving for $\Delta x$ provides the Newton step
\[ \Delta x = -B^{-1} \nabla f(x_k) , \]
but $B$ is unknown. In one dimension, solving for $B$ and applying the Newton step with the updated value is equivalent to the secant method. In multiple dimensions $B$ is under-determined. Various methods are used to find the solution to the secant equation that is symmetric ($B^{\mathrm T} = B$) and closest to the current approximate value $B_k$ according to some metric, $\min_B \lVert B - B_k \rVert$. An approximate initial value $B_0 = I$ is often sufficient to achieve rapid convergence. The unknown $x_k$ is updated by applying the Newton step calculated using the current approximate Hessian matrix $B_k$:
* $\Delta x_k = -\alpha_k B_k^{-1} \nabla f(x_k)$, with $\alpha_k$ chosen to satisfy the Wolfe conditions;
* $x_{k+1} = x_k + \Delta x_k$;
* the gradient computed at the new point, $\nabla f(x_{k+1})$, and $y_k = \nabla f(x_{k+1}) - \nabla f(x_k)$,
* is used to update the approximate Hessian $B_{k+1}$, or directly its inverse $H_{k+1} = B_{k+1}^{-1}$ using the Sherman-Morrison formula.
The most popular update formulas are:
* DFP: \[ B_{k+1} = \Bigl(I - \tfrac{y_k\,\Delta x_k^{\mathrm T}}{y_k^{\mathrm T}\Delta x_k}\Bigr) B_k \Bigl(I - \tfrac{\Delta x_k\,y_k^{\mathrm T}}{y_k^{\mathrm T}\Delta x_k}\Bigr) + \tfrac{y_k\,y_k^{\mathrm T}}{y_k^{\mathrm T}\Delta x_k} \]
* BFGS: \[ B_{k+1} = B_k + \tfrac{y_k\,y_k^{\mathrm T}}{y_k^{\mathrm T}\Delta x_k} - \tfrac{B_k\,\Delta x_k\,\Delta x_k^{\mathrm T} B_k}{\Delta x_k^{\mathrm T} B_k\,\Delta x_k} \]
* SR1: \[ B_{k+1} = B_k + \tfrac{(y_k - B_k\,\Delta x_k)(y_k - B_k\,\Delta x_k)^{\mathrm T}}{(y_k - B_k\,\Delta x_k)^{\mathrm T}\Delta x_k} \]
* Broyden: \[ B_{k+1} = B_k + \tfrac{(y_k - B_k\,\Delta x_k)\,\Delta x_k^{\mathrm T}}{\Delta x_k^{\mathrm T}\Delta x_k} \]
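As a concrete illustration of the iteration above, here is a minimal Python sketch of a BFGS-style quasi-Newton minimizer. It updates the inverse Hessian $H_k$ directly, in the form obtained by applying the Sherman-Morrison formula to the BFGS update, and uses a simple backtracking (Armijo) line search as a stand-in for a full Wolfe-condition search; the function `bfgs_minimize`, its parameters, and the Rosenbrock test problem are illustrative assumptions, not part of the article.

```python
import numpy as np

def bfgs_minimize(f, grad, x0, tol=1e-8, max_iter=200):
    """Minimize f using a BFGS quasi-Newton iteration on the inverse Hessian."""
    x = np.asarray(x0, dtype=float)
    H = np.eye(len(x))          # initial inverse-Hessian approximation H_0 = I
    g = grad(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        p = -H @ g              # quasi-Newton direction  -B_k^{-1} grad f(x_k)
        # Backtracking line search (Armijo condition only, a stand-in for Wolfe).
        alpha = 1.0
        while f(x + alpha * p) > f(x) + 1e-4 * alpha * (g @ p):
            alpha *= 0.5
        s = alpha * p           # step  Delta x_k
        x_new = x + s
        g_new = grad(x_new)
        y = g_new - g           # gradient difference  y_k
        sy = s @ y
        if sy > 1e-12:          # curvature condition keeps H positive definite
            rho = 1.0 / sy
            I = np.eye(len(x))
            # BFGS update of the inverse Hessian (Sherman-Morrison form):
            # H_{k+1} = (I - rho s y^T) H_k (I - rho y s^T) + rho s s^T
            H = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
                + rho * np.outer(s, s)
        x, g = x_new, g_new
    return x

# Example: the Rosenbrock function, which has its minimum at (1, 1).
f = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
grad = lambda x: np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
                           200 * (x[1] - x[0]**2)])
print(bfgs_minimize(f, grad, x0=[-1.2, 1.0]))
```

As noted above, starting from the identity matrix for the (inverse) Hessian approximation is usually sufficient to obtain rapid convergence on smooth problems.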
See also
* Newton's method in optimization
* Newton's method
* DFP updating formula
* BFGS method
* SR1 formula
* Broyden's method
References
* William C. Davidon, [http://link.aip.org/link/?SJE/1/1/1 Variable Metric Method for Minimization], SIOPT, Volume 1, Issue 1, pp. 1-17, 1991 (Davidon's original report, eventually published).
* Nocedal, Jorge & Wright, Stephen J. (1999). Numerical Optimization. Springer-Verlag. ISBN 0-387-98793-2.
* Edwin K. P. Chong and Stanislaw H. Zak, An Introduction to Optimization, 2nd ed., John Wiley & Sons, August 2001.