Multivariate Pólya distribution
The multivariate Pólya distribution, also called the Dirichlet compound multinomial distribution, is a compound probability distribution in which a probability vector $p$ is drawn from a Dirichlet distribution with parameter vector $\alpha$, and a set of discrete samples $x$ is then drawn from the multinomial distribution with probability vector $p$. The compounding corresponds to a Pólya urn scheme. In document classification, for example, the distribution is used to represent probabilities over word counts for different document types.

The probability of the observed samples $x$ (summarized by a vector of counts) given the parameter vector $\alpha$ is obtained by integrating out the parameters $p$ of the multinomial distribution:

$$P(x \mid \alpha) = \int_{p} P(x \mid p)\, P(p \mid \alpha)\, dp$$
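The integration over $p$ described above can be illustrated numerically. The following Python sketch, which uses only the standard library, draws $p$ from a Dirichlet distribution, averages $P(x \mid p)$ over many such draws, and compares the result with the gamma-function closed form of the same integral. It assumes that $P(x \mid p)$ is the probability of one particular sequence of outcomes, $\prod_k p_k^{n_k}$, where $n_k$ counts how often outcome $k$ occurs. The function names and the example values of `alpha` and `counts` are illustrative choices, not part of the source.

```python
import math
import random


def sample_dirichlet(alpha, rng):
    """Draw p ~ Dirichlet(alpha) by normalizing independent Gamma(alpha_k, 1) draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]


def sample_outcomes(alpha, n_draws, rng):
    """Compound draw: first p ~ Dirichlet(alpha), then n_draws categorical outcomes from p."""
    p = sample_dirichlet(alpha, rng)
    return rng.choices(range(len(alpha)), weights=p, k=n_draws)


def log_prob_closed_form(counts, alpha):
    """log P(x | alpha) for one particular sequence x with outcome counts `counts`.

    Uses Gamma(A) / Gamma(N + A) * prod_k Gamma(n_k + alpha_k) / Gamma(alpha_k),
    with A = sum(alpha) and N = sum(counts).
    """
    a_sum = sum(alpha)
    n_sum = sum(counts)
    result = math.lgamma(a_sum) - math.lgamma(n_sum + a_sum)
    for n_k, a_k in zip(counts, alpha):
        result += math.lgamma(n_k + a_k) - math.lgamma(a_k)
    return result


def prob_monte_carlo(counts, alpha, n_samples, rng):
    """Estimate the integral of P(x | p) P(p | alpha) dp by averaging P(x | p) over draws of p."""
    total = 0.0
    for _ in range(n_samples):
        p = sample_dirichlet(alpha, rng)
        # Probability of one specific sequence with these counts, given p.
        total += math.prod(p_k ** n_k for p_k, n_k in zip(p, counts))
    return total / n_samples


rng = random.Random(0)
alpha = [1.0, 2.0, 3.0]  # illustrative parameter vector (an assumption)
counts = [2, 1, 3]       # illustrative outcome counts n_k (an assumption)

exact = math.exp(log_prob_closed_form(counts, alpha))
estimate = prob_monte_carlo(counts, alpha, 100_000, rng)
print(exact, estimate)
```

With enough samples the two printed numbers should agree to within Monte Carlo error. Multiplying either one by the multinomial coefficient $N! / \prod_k n_k!$ would give the probability of the count vector irrespective of the order of the draws.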
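Carrying out this integration gives the following explicit formula:

$$P(x \mid \alpha) = \frac{\Gamma\left(\sum_k \alpha_k\right)}{\Gamma\left(\sum_k (n_k + \alpha_k)\right)} \prod_k \frac{\Gamma(n_k + \alpha_k)}{\Gamma(\alpha_k)}$$

where $\Gamma$ is the gamma function, $\alpha_k$ is the $k$-th component of the parameter vector, and $n_k$ is the number of times the outcome in $x$ was $k$. In this form the formula gives the probability of one particular sequence of outcomes; the probability of the count vector $(n_1, \ldots, n_K)$ irrespective of order is larger by the multinomial coefficient $N! / \prod_k n_k!$, where $N = \sum_k n_k$.

The two-dimensional version of the multivariate Pólya distribution is known as the beta-binomial model.

The multivariate Pólya distribution is used in automated document classification and clustering, genetics, economics, combat modeling, and quantitative marketing.

See also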
*Beta-binomial model
*Chinese restaurant process
*Dirichlet process
*Generalized Dirichlet distribution
*George Pólya
*Urn problem

References
*Elkan, C. (2006) [http://www.icml2006.org/icml_documents/camera-ready/037_Clustering_Documents.pdf Clustering documents with an exponential-family approximation of the Dirichlet compound multinomial distribution]. ICML, 289-296.
*Kvam, P. and Day, D. (2001) The multivariate Polya distribution in combat modeling. Naval Research Logistics, 48, 1-17.
*Madsen, R. E., Kauchak, D. and Elkan, C. (2005) [http://www.cse.ucsd.edu/~dkauchak/kauchak05modeling.pdf Modeling Word Burstiness Using the Dirichlet Distribution]. ICML, 545-552.
*Minka, T. (2003) [http://research.microsoft.com/~minka/papers/dirichlet/ Estimating a Dirichlet distribution]. Technical report, Microsoft Research. Includes Matlab code for fitting distributions to data.
*Wagner, U. and Taudes, A. (1986) A Multivariate Polya Model of Brand Choice and Purchase Incidence. Marketing Science, 5(3), 219-244.
Wikimedia Foundation. 2010.