- Variational message passing
Variational message passing (VMP) is an approximate
inference technique for continuous-valuedBayesian networks , withconjugate-exponential parents, developed by John Winn in his PhD thesis. John Winn is now atMicrosoft Research . VMP was developed as a means of generalizing the approximatevariational methods used by such techniques asLatent Dirichlet allocation and works by updating an approximate distribution at each node through messages in the node'sMarkov blanket .Likelihood Lower Bound
Given some set of hidden variables and observed variables , the goal of approximate inference is to lower-bound the probability that a graphical model is in the configuration . Over some probability distribution (to be defined later),
:.
So, if we define our lower bound to be
:,
then the likelihood is simply this bound plus the
relative entropy between and . Because the relative entropy is non-negative, the function defined above is indeed a lower bound of the log likelihood of our observation . The distribution will have a simpler character than that of because marginalizing over is intractable for all but the simplest ofgraphical models . In particular, VMP uses a factorized distribution ::
where is a disjoint part of the graphical model.
Determining the Update Rule
The likelihood estimate needs to be as large as possible; because it's a lower bound, getting closer improves the approximation of the log likelihood. By substituting in the factorized version of , , parameterized over the hidden nodes as above, is simply the negative
relative entropy between and plus other terms independent of if is defined as as:,
where is the expectation over all distributions except . Thus, if we set to be , the bound is maximized.
Messages in Variational Message Passing
Parents send their children the expectation of their
sufficient statistic while children send their parents theirnatural parameter , which also requires messages to be sent from the co-parents of the node.Relationship to Exponential Families
Because all nodes in VMP come from exponential families and all parents of nodes are conjugate to their children nodes, the expectation of the
sufficient statistic can be computed from thenormalization factor .VMP Algorithm
The algorithm begins by computing the expected value of the sufficient statistics for that vector. Then, until the likelihood converges to a stable value (this is usually accomplished by setting a small threshold value and running the algorithm until it increases by less than that threshold value), do the following at each node:
- Get all messages from parents
- Get all messages from children (this might require the children to get messages from the co-parents)
- Compute the expected value of the nodes sufficient statistics
Constraints
Because every child must be conjugate to its parent, this limits the types of distributions that can be used in the model. For example, the parents of a
Gaussian distribution must be aGaussian distribution (corresponding to theMean ) and agamma distribution (corresponding to the precision, or one over in more common paramaterizations). Discrete variables can have Dirichlet parents, and Poisson and exponential nodes must have gamma parents. However, if the data can be modeled in this manner, VMP offers a generalized framework for providing inference.External links
* An [http://vibes.sourceforge.net/ implementation] of VMP with usage examples
References
* M. J. Beal (2003). [http://www.cs.toronto.edu/~beal/thesis/beal03.pdf Variational Algorithms for Approximate Bayesian Inference] . PhD Thesis, [http://www.gatsby.ucl.ac.uk Gatsby Computational Neuroscience Unit] , University College London, 2003.
* J. M. Winn and C. Bishop (2005). [http://research.microsoft.com/users/Cambridge/jwinn/papers/VMP2005.pdf Variational Message Passing] . Journal of Machine Learning Research 6: 661-694.
Wikimedia Foundation. 2010.