- Pseudocount
A pseudocount is an amount added to the observed count of an event so that the probability a model assigns to that event, which is known not to be zero, becomes small rather than exactly zero.

In any observed data set or sample, a possible event may fail to occur, especially when the event has low probability or the data set is small. Its observed frequency is then 0, which implies a probability of 0. This is an oversimplification and is often unhelpful, particularly in probability-based machine learning techniques such as artificial neural networks and hidden Markov models. Artificially adjusting the probabilities of rare (but not impossible) events so that they are not exactly zero avoids this zero-frequency problem.

The simplest approach is to add 1 to each observed count, including the counts that are zero. This is sometimes called "Laplace's rule" (more formally, Laplace's rule of succession). A more complex approach is to estimate the probability of the events from other factors and adjust accordingly.
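As a minimal sketch of the add-one approach, the Python snippet below estimates the probability of each outcome as (count + pseudocount) / (total + pseudocount × number of outcomes). The function name `smoothed_probabilities`, its parameters, and the toy data are hypothetical and chosen only for illustration; the sketch also assumes that the full set of possible outcomes is known in advance.

```python
from collections import Counter


def smoothed_probabilities(observations, outcomes, pseudocount=1):
    """Estimate outcome probabilities after adding a pseudocount to every count.

    Illustrative helper (not from any library). `outcomes` must list every
    possible outcome, including those that never appear in `observations`.
    """
    counts = Counter(observations)  # missing outcomes count as 0
    # Total of the adjusted counts: observed total plus one pseudocount per outcome.
    total = sum(counts[o] for o in outcomes) + pseudocount * len(outcomes)
    return {o: (counts[o] + pseudocount) / total for o in outcomes}


# Toy data: "c" is possible but was never observed.
observations = ["a", "a", "b", "a"]
outcomes = ["a", "b", "c"]

probs = smoothed_probabilities(observations, outcomes)
for outcome in outcomes:
    print(outcome, probs[outcome])
```

With these four observations, the unadjusted frequency of "c" would be 0/4 = 0, whereas the adjusted estimate is 1/7; the estimates for "a" and "b" become 4/7 and 2/7, so the three values still sum to 1.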
See also
*Principle of indifference
*prior probability
*offset
*substitution matrix
*n-gram
External links
* [http://www.soe.ucsc.edu/research/compbio/html_format_papers/tr-95-11/node8.html Pseudocounts]
** [http://www.soe.ucsc.edu/research/compbio/html_format_papers/tr-95-11/node30.html Bayesian interpretation of pseudocount regularizers]
Wikimedia Foundation. 2010.