Kernel trick
In machine learning, the kernel trick is a method for using a linear classifier algorithm to solve a non-linear problem by mapping the original non-linear observations into a higher-dimensional space, where the linear classifier is subsequently used; this makes a linear classification in the new space equivalent to non-linear classification in the original space.
This is done using Mercer's theorem, which states that any continuous, symmetric, positive semi-definite kernel function K(x, y) can be expressed as a dot product in a high-dimensional space.
More specifically, if the arguments to the kernel are in a measurable space X, and if the kernel is positive semi-definite, i.e.
$$\sum_{i,j} K(x_i, x_j) \, c_i c_j \ge 0$$
for any finite subset {x_1, ..., x_n} of X and any subset {c_1, ..., c_n} of objects (typically real numbers), then there exists a function φ(x) whose range is in an inner product space of possibly high dimension, such that
$$K(x, y) = \varphi(x) \cdot \varphi(y)$$
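As a concrete illustration of such a function φ, the following minimal sketch checks numerically that the homogeneous degree-2 polynomial kernel K(x, y) = (x · y)^2 on two-dimensional inputs equals a dot product in a three-dimensional feature space, and that the positive semi-definiteness sum is non-negative for one sample. The kernel choice, the sample points `xs`, the coefficients `cs`, and all function names are assumptions made for this example; they are not part of the original text.

```python
import math


def poly_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel K(x, y) = (x . y)^2 on R^2."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2


def phi(x):
    """Explicit feature map into R^3 chosen so that phi(x) . phi(y) = (x . y)^2."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


def dot(u, v):
    """Ordinary dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def quadratic_form(kernel, xs, cs):
    """Sum over all i, j of K(x_i, x_j) * c_i * c_j."""
    return sum(kernel(xi, xj) * ci * cj
               for xi, ci in zip(xs, cs)
               for xj, cj in zip(xs, cs))


# Arbitrary sample points and coefficients (illustrative values only).
xs = [(1.0, 2.0), (-0.5, 3.0), (2.0, 1.0)]
cs = [0.7, -1.3, 0.4]

# The kernel value agrees with the dot product of the mapped points.
for x in xs:
    for y in xs:
        assert math.isclose(poly_kernel(x, y), dot(phi(x), phi(y)),
                            rel_tol=1e-9, abs_tol=1e-9)

# The positive semi-definiteness condition holds for this sample.
print(quadratic_form(poly_kernel, xs, cs) >= 0)
```

The check covers only one finite sample, so it illustrates the condition rather than proving it for the kernel in general.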
The kernel trick transforms any algorithm that solely depends on the dot product between two vectors. Wherever a dot product is used, it is replaced with the kernel function. Thus, a linear algorithm can easily be transformed into a non-linear algorithm. This non-linear algorithm is equivalent to the linear algorithm operating in the range space of φ. However, because kernels are used, the φ function is never explicitly computed. This is desirable, because the high-dimensional space may be infinite-dimensional (as is the case when the kernel is a Gaussian).
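The substitution described above can be sketched with a perceptron written in its dual form, where the training data enter only through dot products. This is a minimal sketch under stated assumptions, not a reference implementation: the XOR data set, the `gamma` value, the epoch limit, and all function names are chosen for illustration. Swapping `linear_kernel` for `gaussian_kernel` is the only change needed to make the classifier non-linear, and φ is never computed.

```python
import math


def linear_kernel(x, y):
    """Plain dot product: gives the ordinary (linear) perceptron."""
    return sum(a * b for a, b in zip(x, y))


def gaussian_kernel(x, y, gamma=1.0):
    """Gaussian kernel exp(-gamma * ||x - y||^2); gamma is an assumed setting."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def decision_value(x, xs, ys, alphas, kernel):
    """Dual-form score: sum_i alpha_i * y_i * K(x_i, x)."""
    return sum(a * yi * kernel(xi, x) for a, xi, yi in zip(alphas, xs, ys))


def train_kernel_perceptron(xs, ys, kernel, epochs=100):
    """Count mistakes per training point; stop early after a clean pass."""
    alphas = [0] * len(xs)
    for _ in range(epochs):
        mistakes = 0
        for j, (xj, yj) in enumerate(zip(xs, ys)):
            if yj * decision_value(xj, xs, ys, alphas, kernel) <= 0:
                alphas[j] += 1
                mistakes += 1
        if mistakes == 0:
            break
    return alphas


def predict(x, xs, ys, alphas, kernel):
    """Classify x as +1 or -1 from the sign of the dual-form score."""
    return 1 if decision_value(x, xs, ys, alphas, kernel) > 0 else -1


# XOR: not linearly separable in the original two-dimensional space.
xs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
ys = [-1, 1, 1, -1]

gauss_alphas = train_kernel_perceptron(xs, ys, gaussian_kernel)
gauss_preds = [predict(x, xs, ys, gauss_alphas, gaussian_kernel) for x in xs]

linear_alphas = train_kernel_perceptron(xs, ys, linear_kernel)
linear_preds = [predict(x, xs, ys, linear_alphas, linear_kernel) for x in xs]

print("Gaussian kernel:", gauss_preds)
print("Linear kernel:  ", linear_preds)
```

With these settings the Gaussian-kernel run should label all four XOR points correctly, whereas the linear run cannot, since no linear function of this form separates them.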
The kernel trick was first published by Aizerman et al. (M. Aizerman, E. Braverman, and L. Rozonoer, 1964, "Theoretical foundations of the potential function method in pattern recognition learning", Automation and Remote Control 25: 821–837).
It has been applied to several kinds of algorithm in machine learning and statistics, including:
* Perceptrons
* Support vector machines
* Principal components analysis
* Canonical correlation analysis
* Fisher's linear discriminant analysis
* Clustering
The origin of the term "kernel trick" is not known. [citation needed]
References
See also
* Kernel methods
* Integral transforms
* Hilbert space, specifically reproducing kernel Hilbert space
* Mercer kernel
Wikimedia Foundation. 2010.