Yule–Simon distribution
Probability distribution
name = Yule–Simon
type = mass
Figure: Yule–Simon PMF on a log-log scale. (The function is defined only at integer values of k; the connecting lines do not indicate continuity.)
Figure: Yule–Simon CMF. (The function is defined only at integer values of k; the connecting lines do not indicate continuity.)
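parameters = $\rho > 0$ shape (real)
support = $k \in \{1, 2, \dots\}$
pdf = $\rho\,\mathrm{B}(k, \rho+1)$
cdf = $1 - k\,\mathrm{B}(k, \rho+1)$
mean = $\frac{\rho}{\rho-1}$ for $\rho > 1$
median =
mode = $1$
variance = $\frac{\rho^2}{(\rho-1)^2(\rho-2)}$ for $\rho > 2$
skewness = $\frac{(\rho+1)^2\sqrt{\rho-2}}{(\rho-3)\,\rho}$ for $\rho > 3$
kurtosis = $\rho + 3 + \frac{11\rho^3 - 49\rho - 22}{(\rho-4)(\rho-3)\,\rho}$ for $\rho > 4$
entropy =
mgf = $\frac{\rho}{\rho+1}\,{}_2F_1(1, 1; \rho+2; e^{t})\,e^{t}$
char = $\frac{\rho}{\rho+1}\,{}_2F_1(1, 1; \rho+2; e^{it})\,e^{it}$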
In probability and statistics, the Yule–Simon distribution is a discrete probability distribution named after Udny Yule and Herbert Simon. Simon originally called it the "Yule distribution" (Simon, H. A. (1955). "On a class of skew distribution functions". Biometrika 42: 425–440).
The probability mass function of the Yule–Simon($\rho$) distribution is
: $f(k;\rho) = \rho\,\mathrm{B}(k, \rho+1),$
for integer $k \geq 1$ and real $\rho > 0$, where $\mathrm{B}$ is the beta function. Equivalently, the pmf can be written in terms of the falling factorial as
: $f(k;\rho) = \frac{\rho\,\Gamma(\rho+1)}{(k+\rho)^{\underline{\rho+1}}},$
where $\Gamma$ is the gamma function and $(k+\rho)^{\underline{\rho+1}} = \Gamma(k+\rho+1)/\Gamma(k)$. Thus, if $\rho$ is an integer,
: $f(k;\rho) = \frac{\rho\,\rho!\,(k-1)!}{(k+\rho)!}.$
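As a small numerical illustration of the definition, the following Python sketch evaluates $f(k;\rho) = \rho\,\mathrm{B}(k,\rho+1)$ through log-gamma functions and compares it with the factorial form for integer $\rho$. The function names are illustrative choices for this sketch and do not come from any library.

```python
import math


def yule_simon_pmf(k, rho):
    """Evaluate f(k; rho) = rho * B(k, rho + 1) for integer k >= 1 and rho > 0.

    The beta function is computed as exp(lgamma(k) + lgamma(rho + 1)
    - lgamma(k + rho + 1)) so that large k does not overflow.
    """
    if k < 1 or rho <= 0:
        raise ValueError("requires integer k >= 1 and rho > 0")
    log_beta = math.lgamma(k) + math.lgamma(rho + 1) - math.lgamma(k + rho + 1)
    return rho * math.exp(log_beta)


def yule_simon_pmf_integer_rho(k, rho):
    """Factorial form rho * rho! * (k - 1)! / (k + rho)!, valid for integer rho."""
    return rho * math.factorial(rho) * math.factorial(k - 1) / math.factorial(k + rho)


if __name__ == "__main__":
    for k in range(1, 6):
        print(k, yule_simon_pmf(k, 3), yule_simon_pmf_integer_rho(k, 3))
```

The two functions should agree up to floating-point rounding whenever $\rho$ is a positive integer; the log-gamma version also accepts non-integer $\rho$.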
The probability mass function $f$ has the property that for sufficiently large $k$ we have
: $f(k;\rho) \approx \frac{\rho\,\Gamma(\rho+1)}{k^{\rho+1}} \propto \frac{1}{k^{\rho+1}}.$
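The large-$k$ behaviour can be checked numerically. The sketch below assumes the pmf $f(k;\rho) = \rho\,\mathrm{B}(k,\rho+1)$ and computes the ratio of $f(k;\rho)$ to the power-law approximation $\rho\,\Gamma(\rho+1)/k^{\rho+1}$; the helper name `tail_ratio` is hypothetical.

```python
import math


def tail_ratio(k, rho):
    """Return f(k; rho) * k**(rho + 1) / (rho * Gamma(rho + 1)).

    Algebraically this equals Gamma(k) * k**(rho + 1) / Gamma(k + rho + 1).
    It is evaluated in log space so that large k does not overflow.
    """
    log_ratio = (math.lgamma(k)
                 + (rho + 1) * math.log(k)
                 - math.lgamma(k + rho + 1))
    return math.exp(log_ratio)


if __name__ == "__main__":
    # The ratio should approach 1 as k grows if the power-law tail holds.
    for k in (10, 100, 1000, 10**6):
        print(k, tail_ratio(k, 1.5))
```

A ratio approaching 1 as $k$ grows is consistent with a tail decaying like $k^{-(\rho+1)}$.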
This means that the tail of the Yule–Simon distribution is a realization of Zipf's law: $f(k;\rho)$ can be used to model, for example, the relative frequency of the $k$th most frequent word in a large collection of text, which according to Zipf's law is inversely proportional to a (typically small) power of $k$.
Occurrence
The Yule–Simon distribution arose originally as the limiting distribution of a particular stochastic process studied by Yule as a model for the distribution of biological taxa and subtaxa (Yule, G. U. (1925). "A Mathematical Theory of Evolution, based on the Conclusions of Dr. J. C. Willis, F.R.S.". Philosophical Transactions of the Royal Society of London, Ser. B 213: 21–87). Simon dubbed this process the "Yule process", but it is more commonly known today as a preferential attachment process. The preferential attachment process is an urn process in which balls are added to a growing number of urns, each ball being allocated to an urn with probability linear in the number of balls the urn already contains.
The distribution also arises as a continuous mixture of
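geometric distributions. Specifically, assume that $W$ follows an exponential distribution with scale $1/\rho$ or rate $\rho$:
: $W \sim \operatorname{Exponential}(\rho),$
with density
: $h(w;\rho) = \rho\,\exp(-\rho w).$
Then a Yule–Simon distributed variable $K$ has the following geometric distribution conditional on $W$:
: $K \sim \operatorname{Geometric}(\exp(-W)).$
The pmf of a geometric distribution is
: $g(k;p) = p\,(1-p)^{k-1}$
for $k \in \{1, 2, \dots\}$. The Yule–Simon pmf is then the following exponential-geometric mixture distribution:
: $f(k;\rho) = \int_0^{\infty} g(k;\exp(-w))\,h(w;\rho)\,dw.$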
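The mixture representation suggests a simple way to draw Yule–Simon variates: draw $W$ from an exponential distribution with rate $\rho$, then draw a geometric variate on $\{1, 2, \dots\}$ with success probability $\exp(-W)$. The Python sketch below follows that recipe using only the standard library and compares empirical frequencies with $\rho\,\mathrm{B}(k,\rho+1)$. The function names, the inversion method used for the geometric draw, and the clamp on very small probabilities are choices made for this sketch.

```python
import math
import random


def yule_simon_pmf(k, rho):
    """Exact pmf rho * B(k, rho + 1), computed via log-gamma."""
    log_beta = math.lgamma(k) + math.lgamma(rho + 1) - math.lgamma(k + rho + 1)
    return rho * math.exp(log_beta)


def sample_yule_simon(rho, rng=random):
    """Draw one variate through the exponential-geometric mixture."""
    w = rng.expovariate(rho)            # W ~ Exponential(rate = rho)
    # Success probability exp(-W); the lower clamp is a numerical guard
    # (an assumption of this sketch) against underflow to 0 for huge W.
    p = max(math.exp(-w), 1e-300)
    if p >= 1.0:                        # W == 0: the geometric is degenerate at 1
        return 1
    u = 1.0 - rng.random()              # uniform on (0, 1]
    # Inversion for a geometric on {1, 2, ...}: P(K > k) = (1 - p)**k
    return 1 + int(math.log(u) / math.log1p(-p))


def empirical_pmf(rho, n_samples, seed=0, k_max=5):
    """Empirical frequencies of K = 1..k_max from n_samples draws."""
    rng = random.Random(seed)
    counts = [0] * (k_max + 1)
    for _ in range(n_samples):
        k = sample_yule_simon(rho, rng)
        if k <= k_max:
            counts[k] += 1
    return [c / n_samples for c in counts[1:]]


if __name__ == "__main__":
    for k, freq in enumerate(empirical_pmf(2.0, 100_000), start=1):
        print(k, freq, yule_simon_pmf(k, 2.0))
```

With enough samples the empirical frequencies should lie close to the exact pmf values, which is what the mixture identity predicts.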
Generalizations
The two-parameter generalization of the original Yule distribution replaces the beta function with an incomplete beta function. The probability mass function of the generalized Yule–Simon($\rho$, $\alpha$) distribution is defined as
: $f(k;\rho,\alpha) = \frac{\rho}{1-\alpha^{\rho}}\,\mathrm{B}_{1-\alpha}(k, \rho+1),$
with $0 \leq \alpha < 1$. For $\alpha = 0$ the ordinary Yule–Simon($\rho$) distribution is obtained as a special case. The use of the incomplete beta function has the effect of introducing an exponential cutoff in the upper tail.
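The generalized pmf can be explored numerically. The sketch below assumes the form $f(k;\rho,\alpha) = \frac{\rho}{1-\alpha^{\rho}}\,\mathrm{B}_{1-\alpha}(k,\rho+1)$ with $0 \leq \alpha < 1$, where $\mathrm{B}_x(a,b) = \int_0^x t^{a-1}(1-t)^{b-1}\,dt$ is the incomplete beta function. To stay within the standard library it approximates the integral with Simpson's rule, which is a stand-in for a proper special-function routine; all function names are hypothetical.

```python
import math


def incomplete_beta(x, a, b, n=2000):
    """Approximate B_x(a, b) = integral_0^x t**(a-1) * (1-t)**(b-1) dt.

    Uses composite Simpson's rule with n (even) subintervals. This sketch
    assumes a >= 1 and b >= 1 so the integrand stays bounded on [0, x].
    """
    if x <= 0.0:
        return 0.0
    h = x / n

    def integrand(t):
        return t ** (a - 1) * (1.0 - t) ** (b - 1)

    total = integrand(0.0) + integrand(x)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def generalized_yule_simon_pmf(k, rho, alpha):
    """rho / (1 - alpha**rho) * B_{1-alpha}(k, rho + 1) for 0 <= alpha < 1."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("requires 0 <= alpha < 1")
    return rho / (1.0 - alpha ** rho) * incomplete_beta(1.0 - alpha, k, rho + 1)


if __name__ == "__main__":
    # The probabilities should sum to approximately 1 over the support.
    print(sum(generalized_yule_simon_pmf(k, 1.5, 0.3) for k in range(1, 201)))
```

Setting `alpha` to 0 makes the upper integration limit 1, so the function reduces to the ordinary Yule–Simon pmf up to integration error.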
See also
* Beta function
* Preferential attachment
Bibliography
* Colin Rose and Murray D. Smith, "Mathematical Statistics with Mathematica". New York: Springer, 2002, ISBN 0-387-95234-9. (See page 107, where it is called the "Yule distribution".)
References
Wikimedia Foundation. 2010.