Factorial code

Most real-world data sets consist of data vectors whose individual components are not statistically independent; that is, they are redundant in the statistical sense. It is then desirable to create a factorial code of the data, i.e., a new vector-valued representation in which each data vector is uniquely encoded by its code vector (loss-free coding) while the code components are statistically independent.
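As a minimal hand-crafted illustration (a toy construction, not an algorithm from the literature), consider two-bit data vectors whose second component always duplicates the first. Recoding with `c1 = x1, c2 = x1 XOR x2` is lossless, and the joint distribution of the code components factorizes into the product of their marginals:

```python
import itertools

# Toy data: the second bit duplicates the first, so the raw
# components are maximally redundant.
data = [(0, 0), (0, 0), (1, 1), (1, 1)]

def encode(x):
    """Hand-crafted factorial code for this data set:
    c1 = x1, c2 = x1 XOR x2.  Decoding is x1 = c1, x2 = c1 XOR c2,
    so no information is lost."""
    x1, x2 = x
    return (x1, x1 ^ x2)

codes = [encode(x) for x in data]
n = len(codes)

def marginal(i, v):
    return sum(1 for c in codes if c[i] == v) / n

def joint(v1, v2):
    return sum(1 for c in codes if c == (v1, v2)) / n

# Factorial property: P(c1, c2) = P(c1) * P(c2) for all code values.
for v1, v2 in itertools.product([0, 1], repeat=2):
    assert abs(joint(v1, v2) - marginal(0, v1) * marginal(1, v2)) < 1e-9
print("code components are statistically independent")
```

Here the second code component is constant (zero entropy), which is exactly what redundancy removal should produce: all the information is concentrated in the first component.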

Subsequent supervised learning usually works much better when the raw input data are first translated into such a factorial code. For example, suppose the final goal is to classify images with highly redundant pixels. A naive Bayes classifier assumes the pixels are statistically independent random variables (given the class) and therefore fails to produce good results on such data. If the data are first encoded in a factorial way, however, the naive Bayes classifier achieves its optimal performance (compare Schmidhuber et al. 1996).
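The failure mode can be seen with two classes and a single informative binary pixel that is duplicated (illustrative probabilities, chosen only for this sketch). Because naive Bayes multiplies the per-pixel likelihoods, the duplicated pixel's evidence is counted twice and the posterior is skewed:

```python
# Two classes with equal priors; pixel x1 is informative, and pixel x2
# always duplicates x1 (pure redundancy).  Numbers are illustrative.
p_x1_given = {"A": 0.9, "B": 0.6}   # P(x1 = 1 | class)

# Correct posterior for observing x = (1, 1): x2 adds no new evidence.
true_post_A = p_x1_given["A"] / (p_x1_given["A"] + p_x1_given["B"])

# Naive Bayes treats the pixels as independent given the class, so it
# squares the likelihood and double-counts the same evidence.
nb_post_A = p_x1_given["A"]**2 / (p_x1_given["A"]**2 + p_x1_given["B"]**2)

print(round(true_post_A, 3))  # 0.6
print(round(nb_post_A, 3))    # 0.692
```

The naive Bayes posterior is overconfident; a factorial recoding would collapse the duplicated pixel into a constant component, restoring the independence assumption the classifier relies on.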

To create factorial codes, Horace Barlow and co-workers (1989) suggested minimizing the sum of the bit entropies of the components of binary codes. Jürgen Schmidhuber (1992) reformulated the problem in terms of predictors and binary feature detectors, each receiving the raw data as input. For each detector there is a predictor that sees the other detectors and learns to predict the output of its own detector in response to the various input vectors or images. Each detector, however, uses a machine learning algorithm to make its output as unpredictable as possible. The global optimum of this objective function corresponds to a factorial code represented in a distributed fashion across the outputs of the feature detectors.
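The adversarial objective of predictability minimization can be sketched as follows (a minimal illustration with made-up shapes and names, not the architecture or training procedure of the 1992 paper): logistic feature detectors map the raw data to code components, and each detector's predictor regresses that component from the other detectors' outputs. Predictors minimize the squared prediction error; detectors maximize it.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy setup: 100 raw binary data vectors of 4 components,
# 2 logistic feature detectors, and one predictor per detector
# that sees only the OTHER detector's output (hence weight shape (2, 1);
# with k detectors each predictor would see k - 1 inputs).
X = rng.integers(0, 2, size=(100, 4)).astype(float)  # raw data
W = rng.normal(size=(2, 4))                          # detector weights
V = rng.normal(size=(2, 1))                          # predictor weights

def detector_outputs(X, W):
    return sigmoid(X @ W.T)          # code components, shape (100, 2)

def pm_objective(X, W, V):
    """Sum over detectors of the predictor's mean squared error.
    Predictors MINIMIZE this; detectors MAXIMIZE it."""
    Y = detector_outputs(X, W)
    err = 0.0
    for i in range(Y.shape[1]):
        others = np.delete(Y, i, axis=1)   # outputs of the other detectors
        pred = sigmoid(others @ V[i])      # predictor i's guess of y_i
        err += np.mean((pred - Y[:, i]) ** 2)
    return err
```

In full predictability minimization both sides would be trained against this objective by gradient descent/ascent; at the global optimum no predictor can do better than the marginal statistics of its detector, which is exactly the statistical-independence condition of a factorial code.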

See also

* Blind signal separation (BSS)
* Principal component analysis (PCA)
* Factor analysis
* Unsupervised learning
* Image processing
* Signal processing

References

* Horace Barlow, T. P. Kaushal, and G. J. Mitchison. Finding minimum entropy codes. Neural Computation, 1:412-423, 1989.

* Jürgen Schmidhuber. Learning factorial codes by predictability minimization. Neural Computation, 4(6):863-879, 1992.

* J. Schmidhuber, M. Eldracher, and B. Foltin. Semilinear predictability minimization produces well-known feature detectors. Neural Computation, 8(4):773-786, 1996.


Wikimedia Foundation. 2010.
