Variational Bayesian methods

Variational Bayesian methods, also called ensemble learning, are a family of techniques for approximating intractable integrals arising in Bayesian statistics and machine learning. They can be used to lower-bound the marginal likelihood (i.e. the "evidence") of several models with a view to performing model selection, and they often provide an analytical approximation to the parameter posterior which is useful for prediction. They are an alternative to Monte Carlo sampling methods when the posterior distribution is difficult to sample from directly.

Mathematical derivation

In variational inference, the posterior distribution over a set of latent variables X = \{X_1, \dots, X_n\} given some data D is approximated by a variational distribution

:P(X \mid D) \approx Q(X).

The variational distribution Q(X) is restricted to belong to a family of distributions of simpler form than P(X \mid D). This family is selected with the intention that Q can be made very similar to the true posterior. The difference between Q and the true posterior is measured by a dissimilarity function d(Q; P), and inference is performed by selecting the distribution Q that minimises d. One choice of dissimilarity function for which this minimisation is tractable is the Kullback-Leibler divergence (KL divergence), defined as

:\mathrm{KL}(Q \| P) = \sum_X Q(X) \log \frac{Q(X)}{P(X \mid D)}.
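
For a discrete latent variable, this divergence is a direct sum over states. The following Python sketch (a minimal illustration; the probability values are hypothetical) evaluates \mathrm{KL}(Q \| P) for a three-state example:

```python
import numpy as np

def kl_divergence(q, p):
    """KL(Q || P) = sum_X Q(X) log(Q(X) / P(X)) for discrete distributions.

    q and p are arrays of probabilities over the same finite support;
    p must be nonzero wherever q is nonzero.
    """
    q = np.asarray(q, dtype=float)
    p = np.asarray(p, dtype=float)
    mask = q > 0  # terms with Q(x) = 0 contribute 0 by convention
    return float(np.sum(q[mask] * np.log(q[mask] / p[mask])))

# Hypothetical true posterior P(X | D) over three latent states
p_posterior = np.array([0.7, 0.2, 0.1])
# A candidate variational distribution Q(X) from the simpler family
q_variational = np.array([0.6, 0.3, 0.1])

print(kl_divergence(q_variational, p_posterior))  # ~0.029
```

Minimising this quantity over a parameterised family of candidate distributions Q is the computation that variational inference performs.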

We can write the log evidence as

:\log P(D) = \mathrm{KL}(Q \| P) + \mathcal{L}(Q),

where

:\mathcal{L}(Q) = \sum_X Q(X) \log \frac{P(X, D)}{Q(X)}.

As the log evidence \log P(D) is fixed with respect to Q, maximising the final term \mathcal{L}(Q) minimises the KL divergence between Q and P. By appropriate choice of Q, \mathcal{L}(Q) can be made tractable to compute and to maximise. Hence we obtain both a lower bound \mathcal{L}(Q) on the evidence and an analytical approximation Q to the posterior.
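
This decomposition can be verified numerically on a toy model. The following Python sketch (an illustration with hypothetical joint probabilities, not taken from any particular source) checks that \mathcal{L}(Q) + \mathrm{KL}(Q \| P) recovers the log evidence exactly, and that \mathcal{L}(Q) is a valid lower bound:

```python
import numpy as np

# Hypothetical joint probabilities P(X, D) over three latent states
# for a single fixed observation D; they sum to the evidence P(D).
p_joint = np.array([0.28, 0.08, 0.04])
p_evidence = p_joint.sum()            # P(D) = 0.4
p_posterior = p_joint / p_evidence    # P(X | D) = [0.7, 0.2, 0.1]

# An arbitrary variational distribution Q(X)
q = np.array([0.6, 0.3, 0.1])

kl = np.sum(q * np.log(q / p_posterior))    # KL(Q || P)
elbo = np.sum(q * np.log(p_joint / q))      # L(Q), the evidence lower bound

print(np.log(p_evidence))  # log P(D), about -0.916
print(elbo + kl)           # identical up to floating-point rounding
print(elbo)                # about -0.946 <= log P(D): a valid lower bound
```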

See also

* Variational message passing: a modular algorithm for variational Bayesian inference.
* Expectation-maximization algorithm: a related approach which corresponds to a special case of variational Bayesian inference.

External links

* [http://www.variational-bayes.org Variational-Bayes.org] - a repository of papers, software, and links related to the use of variational Bayesian methods.
* [http://www.inference.phy.cam.ac.uk/mackay/itila/ The on-line textbook: Information Theory, Inference, and Learning Algorithms], by David J.C. MacKay, provides an introduction to variational methods (p. 422).
* [http://www.cse.buffalo.edu/faculty/mbeal/thesis/index.html Variational Algorithms for Approximate Bayesian Inference], by M. J. Beal, includes comparisons of EM to variational Bayesian EM and derivations of several models, including variational Bayesian HMMs.

