Dimension reduction

Dimension reduction: For dimensional reduction in physics, see Dimensional reduction.

In machine learning, dimension reduction is the process of reducing the number of random variables under consideration, and can be divided into feature selection and feature extraction.

Contents

1 Feature selection

2 Feature extraction

3 See also

4 References

5 External links

Feature selection

Main article: Feature selection

Feature selection approaches try to find a subset of the original variables (also called features or attributes). Two strategies are filter (e.g. information gain) and wrapper (e.g. search guided by the accuracy) approaches. See also combinatorial optimization problems.

In some cases, data analysis such as regression or classification can be done in the reduced space more accurately than in the original space.

Feature extraction

Main article: Feature extraction

Feature extraction transforms the data in the high-dimensional space to a space of fewer dimensions. The data transformation may be linear, as in principal component analysis (PCA), but many nonlinear dimensionality reduction techniques also exist.

The main linear technique for dimensionality reduction, principal component analysis, performs a linear mapping of the data to a lower dimensional space in such a way that the variance of the data in the low-dimensional representation is maximized. In practice, the correlation matrix of the data is constructed and the eigenvectors on this matrix are computed. The eigenvectors that correspond to the largest eigenvalues (the principal components) can now be used to reconstruct a large fraction of the variance of the original data. Moreover, the first few eigenvectors can often be interpreted in terms of the large-scale physical behavior of the system. The original space (with dimension of the number of points) has been reduced (with data loss, but hopefully retaining the most important variance) to the space spanned by a few eigenvectors.

Principal component analysis can be employed in a nonlinear way by means of the kernel trick. The resulting technique is capable of constructing nonlinear mappings that maximize the variance in the data. The resulting technique is entitled Kernel PCA. Other prominent nonlinear techniques include manifold learning techniques such as locally linear embedding (LLE), Hessian LLE, Laplacian eigenmaps, and LTSA. These techniques construct a low-dimensional data representation using a cost function that retains local properties of the data, and can be viewed as defining a graph-based kernel for Kernel PCA. More recently, techniques have been proposed that, instead of defining a fixed kernel, try to learn the kernel using semidefinite programming. The most prominent example of such a technique is maximum variance unfolding (MVU). The central idea of MVU is to exactly preserve all pairwise distances between nearest neighbors (in the inner product space), while maximizing the distances between points that are not nearest neighbors.

An alternative approach to neighborhood preservation is through the minimization of a cost function that measures differences between distances in the input and output spaces. Important examples of such techniques include classical multidimensional scaling (which is identical to PCA), Isomap (which uses geodesic distances in the data space), diffusion maps (which uses diffusion distances in the data space), t-SNE (which minimizes the divergence between distributions over pairs of points), and curvilinear component analysis.

A different approach to nonlinear dimensionality reduction is through the use of autoencoders, a special kind of feed-forward neural networks with a bottle-neck hidden layer. The training of deep encoders is typically performed using a greedy layer-wise pre-training (e.g., using a stack of Restricted Boltzmann machines) that is followed by a finetuning stage based on backpropagation.

See also

Dimension (metadata)

Curse of dimensionality

Nearest neighbor search

Cluster analysis

Feature space

Data mining

Machine learning

Feature selection

Information gain

Chi-square distribution

Recursive feature elimination (SVM based)

Feature extraction

Principal components analysis

Semidefinite embedding

Multifactor dimensionality reduction

Multilinear subspace learning

Nonlinear dimensionality reduction

Isomap

Kernel PCA

Multilinear PCA

Singular value decomposition

Latent semantic analysis

Semantic mapping

Wavelet

Wavelet compression

Haar wavelet

Fourier-related transforms

Fast Fourier transform

Discrete Fourier transform

Discrete cosine transform

Linear least squares

Topological data analysis

Physics

magnetism

Time series

Locality sensitive hashing

Signal subspace

Sufficient dimension reduction

References

Samet, H. (2006) Foundations of Multidimensional and Metric Data Structures. Morgan Kaufmann. ISBN 0123694469

External links

A survey of dimension reduction techniques (US DOE Office of Scientific and Technical Information, 2002)

Technical Report on Dimension Reduction (24 pages)

JMLR Special Issue on Variable and Feature Selection

ELastic MAPs

Locally Linear Embedding

A Global Geometric Framework for Nonlinear Dimensionality Reduction

Dimensional reduction at a quantum critical point (realisation of dimensional reduction in a magnet)

Matlab Toolbox for Dimensionality Reduction

kernlab - R package for kernel-based machine learning (includes kernel PCA, and SVM)

DD-HDS homepage

When Is "Nearest Neighbor" Meaningful? - Exploring the effect of dimensionality on nearest neighbor problem

Categories:
Multivariate statistics
Machine learning
Dimension reduction

Игры ⚽ Нужно сделать НИР?

Look at other dictionaries:

Reduction — Reduction, reduced, or reduce may refer to:cienceChemistry*Reduction – chemical reaction in which atoms have their oxidation number (oxidation state) changed. **Reduced gas – a gas with a low oxidation number **Ore reduction: see… … Wikipedia
Reduction d'endomorphisme — Réduction d endomorphisme En mathématiques, et plus particulièrement en algèbre linéaire, la réduction d endomorphisme est une technique mathématique qui a pour objectif d exprimer des matrices et des endomorphismes sous une forme plus simple,… … Wikipédia en Français
Réduction d'endomorphismes — Réduction d endomorphisme En mathématiques, et plus particulièrement en algèbre linéaire, la réduction d endomorphisme est une technique mathématique qui a pour objectif d exprimer des matrices et des endomorphismes sous une forme plus simple,… … Wikipédia en Français
Réduction des endomorphismes — Réduction d endomorphisme En mathématiques, et plus particulièrement en algèbre linéaire, la réduction d endomorphisme est une technique mathématique qui a pour objectif d exprimer des matrices et des endomorphismes sous une forme plus simple,… … Wikipédia en Français
Reduction de Jordan — Réduction de Jordan Pour les articles homonymes, voir Jordan. La réduction de Jordan est la traduction matricielle de la réduction des endomorphismes introduite par Jordan. Cette réduction est tellement employée, en particulier en analyse pour la … Wikipédia en Français
Réduction de jordan — Pour les articles homonymes, voir Jordan. La réduction de Jordan est la traduction matricielle de la réduction des endomorphismes introduite par Jordan. Cette réduction est tellement employée, en particulier en analyse pour la résolution d… … Wikipédia en Français
Reduction dimensionnelle — Réduction dimensionnelle En physique, une réduction dimensionnelle est une procédure par laquelle, étant donnée une théorie formulée sur un espace temps de dimension , on construit une autre théorie formulée sur un sous espace de dimension . Dans … Wikipédia en Français
Dimension de Hamel — Dimension Voir « dimension » sur le Wiktionnaire … Wikipédia en Français
Dimension spatiale — Dimension Voir « dimension » sur le Wiktionnaire … Wikipédia en Français
Reduction de Gauss — Réduction de Gauss En mathématiques, la réduction de Gauss est un algorithme permettant de représenter toute forme quadratique sur un espace vectoriel de dimension finie (sur un corps commutatif de caractéristique différente de deux) comme une… … Wikipédia en Français

Academic Dictionaries and Encyclopedias

Dimension reduction

Contents

Feature selection

Feature extraction

See also

References

External links

Look at other dictionaries:

Share the article and excerpts

Academic Dictionaries and Encyclopedias

Wikipedia

Dimension reduction

Contents

Feature selection

Feature extraction

See also

References

External links

Look at other dictionaries:

Share the article and excerpts

Direct link