- Bayesian model comparison
A common problem in
statistical inference is to use data to decide between two or more competing models. Frequentist statistics uses hypothesis tests for this purpose. There are several Bayesian approaches; one is through Bayes factors. The posterior probability of a model given data, Pr(H|D), is given by
Bayes' theorem:
::\Pr(H|D) = \frac{\Pr(D|H)\,\Pr(H)}{\Pr(D)}
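As a minimal numerical sketch of the theorem (the two candidate models and the evidence and prior values below are hypothetical, chosen only for illustration), the posterior probability of each model is obtained by normalising Pr(D|H) Pr(H) over the set of candidates:

```python
# Hypothetical evidences Pr(D|H) and priors Pr(H) for two candidate models.
evidence = {"H1": 0.75, "H2": 0.25}
prior    = {"H1": 0.5,  "H2": 0.5}

# Pr(D) is the sum of Pr(D|H) Pr(H) over the candidate models,
# so the posterior probabilities Pr(H|D) sum to one.
pr_d = sum(evidence[h] * prior[h] for h in evidence)
posterior = {h: evidence[h] * prior[h] / pr_d for h in evidence}

print(posterior)  # {'H1': 0.75, 'H2': 0.25}
```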
The key data-dependent term Pr(D|H) is a likelihood, and is sometimes called the evidence for model H; evaluating it correctly is the key to Bayesian model comparison. The evidence is usually the
normalizing constant or partition function of another inference, namely the inference of the parameters of model H given the data D. The plausibility of two different models H_1 and H_2, parametrised by model parameter vectors \theta_1 and \theta_2, is assessed by the
Bayes factor, given by
::\frac{\Pr(D|H_2)}{\Pr(D|H_1)} = \frac{\int \Pr(\theta_2|H_2)\,\Pr(D|\theta_2,H_2)\,d\theta_2}{\int \Pr(\theta_1|H_1)\,\Pr(D|\theta_1,H_1)\,d\theta_1}
Thus Bayesian model comparison does not depend on any single choice of parameter values for each model. Instead, it considers the probability of the data under each model, averaged over all possible parameter values weighted by their prior. Alternatively, the
maximum likelihood estimate could be used for each of the parameters.
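As a concrete sketch of these integrals (a hypothetical coin-tossing example with a uniform prior, not taken from the article), consider k successes in n Bernoulli trials, with H_1 fixing the success probability at 1/2 and H_2 leaving it free with a uniform prior on [0, 1]:

```python
from math import comb

# Hypothetical data: k successes observed in n Bernoulli trials.
n, k = 100, 52

def likelihood(theta):
    """Binomial likelihood Pr(D | theta) of the observed data."""
    return comb(n, k) * theta**k * (1 - theta)**(n - k)

# Evidence for H1: theta is fixed at 1/2, so no integration is needed.
evidence_h1 = likelihood(0.5)

# Evidence for H2: theta unknown, uniform prior Pr(theta|H2) = 1 on [0, 1].
# Approximate  int Pr(theta|H2) Pr(D|theta, H2) dtheta  with a midpoint rule;
# for a uniform prior the exact value is 1/(n + 1).
m = 10_000
evidence_h2 = sum(likelihood((i + 0.5) / m) for i in range(m)) / m

bayes_factor = evidence_h2 / evidence_h1  # Pr(D|H2) / Pr(D|H1)
print(f"Pr(D|H1) = {evidence_h1:.4f}")    # ~0.0735
print(f"Pr(D|H2) = {evidence_h2:.4f}")    # ~0.0099 (close to 1/101)
print(f"Bayes factor, H2 against H1: {bayes_factor:.2f}")  # ~0.13, favouring H1
```

Note that the maximised likelihood under H_2 (at \theta = k/n = 0.52) exceeds the likelihood under H_1, so a comparison of maximum-likelihood values would favour H_2; the marginal likelihood nevertheless favours H_1, because the uniform prior of H_2 spreads probability over many parameter values that explain the data poorly.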
An advantage of the use of Bayes factors is that it automatically, and quite naturally, includes a penalty for including too much model structure. It thus guards against overfitting. Other approaches are:
* to treat model comparison as a decision problem, computing the expected value or cost of each model choice (a sketch of this approach follows the list);
* to use Minimum Message Length (MML).
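A minimal sketch of the decision-theoretic approach follows; the posterior model probabilities and the loss values are hypothetical, chosen only to illustrate the bookkeeping. The chosen model is the one with the smallest expected cost under the posterior distribution over models.

```python
# Posterior model probabilities Pr(H|D), e.g. obtained via Bayes' theorem.
posterior = {"H1": 0.8, "H2": 0.2}

# loss[chosen][true]: hypothetical cost of acting on `chosen` when `true` holds.
loss = {
    "H1": {"H1": 0.0, "H2": 5.0},
    "H2": {"H1": 1.0, "H2": 0.0},
}

# Expected cost of each choice, averaged over the posterior model probabilities.
expected_cost = {
    chosen: sum(loss[chosen][true] * posterior[true] for true in posterior)
    for chosen in loss
}

print(expected_cost)                              # {'H1': 1.0, 'H2': 0.8}
print(min(expected_cost, key=expected_cost.get))  # H2 has the smaller expected cost
```

With these asymmetric losses the less probable model H_2 is nevertheless the cheaper choice, which is the sense in which the decision-theoretic approach can differ from simply selecting the most probable model.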
See also
* Nested sampling algorithm
* Akaike information criterion
* Schwarz's Bayesian information criterion
* Conditional predictive ordinate
* Deviance information criterion
* Wallace's Minimum Message Length (MML)
* Model selection
External links
* [http://www.inference.phy.cam.ac.uk/mackay/itila/ The on-line textbook: Information Theory, Inference, and Learning Algorithms], by David J.C. MacKay, discusses Bayesian model comparison in Chapter 28, p. 343.