- Smoothing spline
The smoothing spline is a method of smoothing, or fitting a smooth curve to a set of noisy observations.
Definition
Let $(x_i, Y_i)$, $i = 1, \dots, n$, be a sequence of observations, modeled by the relation $E(Y_i) = \mu(x_i)$. The smoothing spline estimate $\hat\mu$ of the function $\mu$ is defined to be the minimizer (over the class of twice differentiable functions) of [cite book|title=Generalized Additive Models|last=Hastie|first=T. J.|coauthors=Tibshirani, R. J.|year=1990|publisher=Chapman and Hall|isbn=0-412-34390-8]:
$$\sum_{i=1}^{n} \bigl(Y_i - \hat\mu(x_i)\bigr)^2 + \lambda \int \hat\mu''(x)^2 \, dx.$$
Remarks:
# $\lambda \ge 0$ is a smoothing parameter, controlling the trade-off between fidelity to the data and roughness of the function estimate.
# The integral is evaluated over the range of the $x_i$.
# As $\lambda \to 0$ (no smoothing), the smoothing spline converges to the interpolating spline.
# As $\lambda \to \infty$ (infinite smoothing), the roughness penalty becomes paramount and the estimate converges to a linear least-squares estimate.
# The roughness penalty based on the second derivative is the most common in modern statistics literature, although the method can easily be adapted to penalties based on other derivatives.
# In early literature, with equally spaced $x_i$, second- or third-order differences were used in the penalty, rather than derivatives.
# When the sum-of-squares term is replaced by a log-likelihood, the resulting estimate is termed "penalized likelihood". The smoothing spline is the special case of penalized likelihood resulting from a Gaussian likelihood.
Derivation of the smoothing spline
It is useful to think of fitting a smoothing spline in two steps:
# First, derive the values $\hat\mu(x_i)$, $i = 1, \dots, n$.
# From these values, derive $\hat\mu(x)$ for all $x$.
Now, treat the second step first.
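Given the vector $\hat m = (\hat\mu(x_1), \dots, \hat\mu(x_n))^T$ of fitted values, the sum-of-squares part of the spline criterion is fixed. It remains only to minimize $\int \hat\mu''(x)^2 \, dx$, and the minimizer is a natural cubic spline that interpolates the points $(x_i, \hat\mu(x_i))$. This interpolating spline is a linear operator, and can be written in the form:
$$\hat\mu(x) = \sum_{i=1}^{n} \hat\mu(x_i) f_i(x),$$
where $f_i(x)$ are a set of spline basis functions. As a result, the roughness penalty has the form:
$$\int \hat\mu''(x)^2 \, dx = \hat m^T A \hat m,$$
where the elements of $A$ are $A_{ij} = \int f_i''(x) f_j''(x) \, dx$. The basis functions, and hence the matrix $A$, depend on the configuration of the predictor variables $x_i$, but not on the responses $Y_i$ or $\hat m$.
Now back to the first step. The penalized sum-of-squares can be written as:
$$\|Y - \hat m\|^2 + \lambda \hat m^T A \hat m,$$
where $Y = (Y_1, \dots, Y_n)^T$. Setting the gradient with respect to $\hat m$ to zero gives $-2(Y - \hat m) + 2\lambda A \hat m = 0$, so minimizing over $\hat m$ gives:
$$\hat m = (I + \lambda A)^{-1} Y.$$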
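The first step can be illustrated numerically. The following is a minimal sketch, not a reference implementation: it assumes the closed form $\hat m = (I + \lambda A)^{-1} Y$ with $A$ symmetric, and it assumes that $A$ is built as $Q R^{-1} Q^T$ from the banded matrices $Q$ and $R$ of the natural cubic spline, a construction described in Green and Silverman (1994). The function names (`solve`, `matmul`, `penalty_matrix`, `roughness`, `smoothing_spline_values`) and the sample data are hypothetical, chosen only for illustration. The $x_i$ are assumed distinct and sorted in increasing order, with $n \ge 3$.

```python
def solve(M, B):
    """Solve M X = B (B given as a list of rows) by Gauss-Jordan elimination."""
    n = len(M)
    aug = [list(M[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        # Partial pivoting: bring the largest entry of column c onto the diagonal.
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(n):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [[aug[i][n + j] / aug[i][i] for j in range(len(B[0]))] for i in range(n)]


def matmul(P, S):
    """Plain matrix product of two lists of rows."""
    return [[sum(p * s for p, s in zip(row, col)) for col in zip(*S)] for row in P]


def penalty_matrix(x):
    """Assumed construction A = Q R^{-1} Q^T for sorted, distinct x (n >= 3)."""
    n = len(x)
    h = [x[i + 1] - x[i] for i in range(n - 1)]
    Q = [[0.0] * (n - 2) for _ in range(n)]
    R = [[0.0] * (n - 2) for _ in range(n - 2)]
    for j in range(n - 2):  # column j belongs to the interior point x[j + 1]
        Q[j][j] = 1.0 / h[j]
        Q[j + 1][j] = -1.0 / h[j] - 1.0 / h[j + 1]
        Q[j + 2][j] = 1.0 / h[j + 1]
        R[j][j] = (h[j] + h[j + 1]) / 3.0
        if j + 1 < n - 2:
            R[j][j + 1] = R[j + 1][j] = h[j + 1] / 6.0
    Qt = [list(col) for col in zip(*Q)]
    return matmul(Q, solve(R, Qt))


def roughness(A, m):
    """Quadratic form m^T A m, i.e. the roughness penalty for fitted values m."""
    return sum(m[i] * A[i][j] * m[j] for i in range(len(m)) for j in range(len(m)))


def smoothing_spline_values(x, y, lam):
    """Fitted values at the x_i, obtained by solving (I + lam * A) m = y."""
    n = len(x)
    A = penalty_matrix(x)
    M = [[(1.0 if i == j else 0.0) + lam * A[i][j] for j in range(n)] for i in range(n)]
    return [row[0] for row in solve(M, [[v] for v in y])]


# Hypothetical sample data with unequally spaced x.
x = [0.0, 0.5, 1.5, 2.0, 3.0, 4.5]
y = [1.0, 1.8, 1.1, 2.9, 2.2, 4.1]

fit_interp = smoothing_spline_values(x, y, 0.0)  # lam = 0: reproduces the data
fit_mid = smoothing_spline_values(x, y, 1.0)     # moderate smoothing
fit_heavy = smoothing_spline_values(x, y, 1e6)   # heavy smoothing

# Ordinary least-squares line, for comparison with the heavy-smoothing fit.
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
slope = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / sum((a - xbar) ** 2 for a in x)
ls_line = [ybar + slope * (a - xbar) for a in x]

for name, fit in [("lam=0", fit_interp), ("lam=1", fit_mid),
                  ("lam=1e6", fit_heavy), ("least squares", ls_line)]:
    print(name, [round(v, 3) for v in fit])
```

With $\lambda = 0$ the fitted values coincide with the observations, and for very large $\lambda$ they approach the least-squares line, matching the two limiting cases in the remarks. Forming $A$ as a dense matrix and inverting by elimination is chosen here for readability; practical implementations typically exploit the banded structure instead.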
Related methods
Smoothing splines are related to, but distinct from:
* Regression splines. In this method, a linear combination of spline basis functions with a reduced set of knots is fitted to the data, typically by least squares. No roughness penalty is used.
* Penalized splines. These combine the reduced knots of regression splines with the roughness penalty of smoothing splines. [cite book|title=Semiparametric Regression|last=Ruppert|first=David|coauthors=Wand, M. P. and Carroll, R. J.|publisher=Cambridge University Press|year=2003|isbn=0-521-78050-0]
Further reading
* Wahba, G. (1990). "Spline Models for Observational Data". SIAM, Philadelphia.
* Green, P. J. and Silverman, B. W. (1994). "Nonparametric Regression and Generalized Linear Models". CRC Press.