- Generative model
In
statistics , a generative model is a model for randomly generating observed data, typically given some hidden parameters. It specifies a joint probability distribution over observation and label sequences. Generative models are used inmachine learning for either modeling data directly (i.e., modeling observed draws from aprobability density function ), or as an intermediate step to forming a conditional probability density function. A conditional distribution can be formed from a generative model through the use ofBayes' rule .Stated another way: if one has a generative model and a dictionary for a language one can "generate" well-formed sentences of that language. This is done by selecting a word from the dictionary and then applying a rule from the model, which results in a two-word phrase. Applying the rules from the model again produce a three-word phrase, etc. The resultant sentence will be a valid construction of the language, even though no human may have ever spoken it. In
A Mathematical Theory of Communication Shannon gives an example in which a table of frequencies of English word pairs is used to generate a sentence beginning with "representing and speedily is an good"; which is not proper English but which will increasingly approximate it as the table is moved from word pairs to word triplets etc.Generative models contrast with
discriminative model s, in that a generative model is a full probability model of all variables, whereas a discriminative model provides a model only of the target variable(s) conditional on the observed variables. Thus a generative model can be used, for example, to simulate (i.e. "generate") values of any variable in the model, whereas a discriminative model allows only sampling of the target variables conditional on the observed quantities.Examples of generative models include:
*Gaussian distribution
*Gaussian mixture model
*Multinomial distribution
*Hidden Markov model
*Naive Bayes
*Latent Dirichlet allocation If the observed data are truly sampled from the generative model, then fitting the parameters of the generative model to maximize the data likelihood is a common method. However, since most statistical models are only approximates to the "true" distribution, if the model's application is to inference about a subset of variables conditional on known values of others, then it can be argued that the approximation makes more assumptions than are necessary to solve the problem at hand. In such cases, it is often more accurate to model the conditional density functions directly: i.e., performing classification or
regression analysis .External links
Wikimedia Foundation. 2010.