Yule–Simon distribution

Probability distribution
name = Yule–Simon
type = mass
Figure: Yule–Simon PMF on a log-log scale. (The function is defined only at integer values of k; the connecting lines do not indicate continuity.)
Figure: Yule–Simon CMF. (The function is defined only at integer values of k; the connecting lines do not indicate continuity.)
parameters = \rho > 0, shape (real)
support = k \in \{1, 2, \dots\}
pdf = \rho\,\mathrm{B}(k, \rho+1)
cdf = 1 - k\,\mathrm{B}(k, \rho+1)
mean = \frac{\rho}{\rho-1}, for \rho > 1
mode = 1
variance = \frac{\rho^2}{(\rho-1)^2\,(\rho-2)}, for \rho > 2
skewness = \frac{(\rho+1)^2\,\sqrt{\rho-2}}{(\rho-3)\,\rho}, for \rho > 3
kurtosis = \rho+3+\frac{11\rho^3-49\rho-22}{(\rho-4)\,(\rho-3)\,\rho}, for \rho > 4
mgf = \frac{\rho}{\rho+1}\,{}_2F_1(1,1;\rho+2;e^t)\,e^t
char = \frac{\rho}{\rho+1}\,{}_2F_1(1,1;\rho+2;e^{i\,t})\,e^{i\,t}

In probability and statistics, the Yule–Simon distribution is a discrete probability distribution named after Udny Yule and Herbert Simon. Simon originally called it the "Yule distribution" (Simon, H. A. (1955). "On a class of skew distribution functions". Biometrika 42: 425–440).

The probability mass function of the Yule–Simon(ρ) distribution is

:f(k;\rho) = \rho\,\mathrm{B}(k, \rho+1),

for integer k ≥ 1 and real ρ > 0, where B is the beta function. Equivalently, the pmf can be written in terms of the falling factorial as

:f(k;\rho) = \frac{\rho\,\Gamma(\rho+1)}{(k+\rho)^{\underline{\rho+1}}},

where Γ is the gamma function. Thus, if ρ is an integer,

:f(k;\rho) = \frac{\rho\,\rho!\,(k-1)!}{(k+\rho)!}.
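The two forms above can be compared numerically. The following is a minimal sketch using only the Python standard library; the function names are illustrative and not part of any established package. The beta function is evaluated through log-gamma values to avoid overflow for large k, and the partial-sum helper relies on the cumulative form 1 − k B(k, ρ+1).

```python
import math


def yule_simon_pmf(k: int, rho: float) -> float:
    """f(k; rho) = rho * B(k, rho + 1), with B computed from log-gamma values."""
    if k < 1 or rho <= 0:
        raise ValueError("need integer k >= 1 and real rho > 0")
    log_beta = math.lgamma(k) + math.lgamma(rho + 1) - math.lgamma(k + rho + 1)
    return rho * math.exp(log_beta)


def yule_simon_pmf_integer_rho(k: int, rho: int) -> float:
    """Factorial form rho * rho! * (k-1)! / (k+rho)!, valid for integer rho >= 1."""
    return rho * math.factorial(rho) * math.factorial(k - 1) / math.factorial(k + rho)


def yule_simon_cdf(k: int, rho: float) -> float:
    """P(K <= k) = 1 - k * B(k, rho + 1)."""
    log_beta = math.lgamma(k) + math.lgamma(rho + 1) - math.lgamma(k + rho + 1)
    return 1.0 - k * math.exp(log_beta)


if __name__ == "__main__":
    # For rho = 2 the pmf reduces to 4 / (k (k + 1) (k + 2)).
    for k in (1, 2, 3, 10):
        print(k, yule_simon_pmf(k, 2.0), yule_simon_pmf_integer_rho(k, 2))
```

For integer ρ the two functions should agree up to floating-point rounding, and the sum of the pmf over 1, …, k should match the cumulative form.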

The probability mass function f has the property that for sufficiently large k

:f(k;\rho) \approx \frac{\rho\,\Gamma(\rho+1)}{k^{\rho+1}} \propto \frac{1}{k^{\rho+1}}.

This means that the tail of the Yule–Simon distribution is a realization of Zipf's law: f(k; ρ) can be used to model, for example, the relative frequency of the kth most frequent word in a large collection of text, which according to Zipf's law is inversely proportional to a (typically small) power of k.
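The quality of the power-law approximation can be inspected by taking the ratio of the exact pmf to ρ Γ(ρ+1) / k^(ρ+1). The sketch below is illustrative only: the helper names and the choice ρ = 1.5 are assumptions made for the example. The ratio should approach 1 from below as k grows.

```python
import math


def yule_simon_pmf(k: int, rho: float) -> float:
    """Exact pmf rho * B(k, rho + 1), computed via log-gamma."""
    log_beta = math.lgamma(k) + math.lgamma(rho + 1) - math.lgamma(k + rho + 1)
    return rho * math.exp(log_beta)


def power_law_tail(k: int, rho: float) -> float:
    """Large-k approximation rho * Gamma(rho + 1) / k**(rho + 1)."""
    return rho * math.gamma(rho + 1) / k ** (rho + 1)


def tail_ratio(k: int, rho: float) -> float:
    """Exact pmf divided by its power-law approximation."""
    return yule_simon_pmf(k, rho) / power_law_tail(k, rho)


if __name__ == "__main__":
    rho = 1.5  # arbitrary example value
    for k in (10, 100, 1000, 10000):
        print(k, tail_ratio(k, rho))
```

On a log-log plot this corresponds to the pmf becoming a straight line of slope −(ρ+1) in the upper tail.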

Occurrence

The Yule–Simon distribution arose originally as the limiting distribution of a particular stochastic process studied by Yule as a model for the distribution of biological taxa and subtaxa (Yule, G. U. (1925). "A Mathematical Theory of Evolution, based on the Conclusions of Dr. J. C. Willis, F.R.S.". Philosophical Transactions of the Royal Society of London, Ser. B 213: 21–87). Simon dubbed this process the "Yule process", but it is more commonly known today as a preferential attachment process. The preferential attachment process is an urn process in which balls are added to a growing number of urns, each ball being allocated to an urn with probability linear in the number of balls the urn already contains.
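The urn process can be simulated directly. The sketch below assumes one common formulation, usually attributed to Simon's 1955 model and not spelled out in the text above: at each step a new urn holding one ball is created with probability alpha, and otherwise the ball goes to an existing urn chosen with probability proportional to its current size. Under that assumption the urn-size distribution is expected to approach Yule–Simon with ρ = 1/(1 − alpha). The function name, the parameter alpha, and the starting state are all illustrative choices.

```python
import random
from collections import Counter


def simulate_urn_process(steps: int, alpha: float, seed: int = 0) -> list:
    """Return the list of urn sizes after adding `steps` balls.

    Assumed rule: with probability `alpha` open a new urn with one ball,
    otherwise add the ball to an urn picked proportionally to its size.
    """
    rng = random.Random(seed)
    sizes = [1]        # start with a single urn holding one ball
    urn_of_ball = [0]  # urn index of every ball placed so far
    for _ in range(steps):
        if rng.random() < alpha:
            sizes.append(1)
            urn_of_ball.append(len(sizes) - 1)
        else:
            # Picking a ball uniformly selects its urn with probability
            # proportional to the number of balls the urn contains.
            urn = urn_of_ball[rng.randrange(len(urn_of_ball))]
            sizes[urn] += 1
            urn_of_ball.append(urn)
    return sizes


if __name__ == "__main__":
    sizes = simulate_urn_process(steps=200_000, alpha=0.5, seed=1)
    counts = Counter(sizes)
    for k in (1, 2, 3):
        # With alpha = 0.5 (rho = 2) the limit is 4 / (k (k + 1) (k + 2)).
        print(k, counts[k] / len(sizes), 4 / (k * (k + 1) * (k + 2)))
```

Storing one list entry per ball keeps the proportional choice O(1) per step at the cost of memory linear in the number of balls.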

The distribution also arises as a continuous mixture of geometric distributions. Specifically, assume that W follows an exponential distribution with scale 1/ρ, or equivalently rate ρ:

:W \sim \mathrm{Exponential}(\rho),
:h(w;\rho) = \rho\,\exp(-\rho\,w).

Then a Yule–Simon distributed variable K has, conditional on W, the following geometric distribution:

:K \sim \mathrm{Geometric}(\exp(-W)).

The pmf of a geometric distribution is

:g(k; p) = p\,(1-p)^{k-1}

for k ∈ {1, 2, …}. The Yule–Simon pmf is then the following exponential-geometric mixture distribution:

:f(k;\rho) = \int_0^{\infty} g(k;\exp(-w))\,h(w;\rho)\,dw.
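The mixture representation suggests a simple way to draw Yule–Simon variates: sample W from the exponential distribution, then sample K from a geometric distribution with success probability exp(−W). Substituting u = exp(−w) in the integral gives ρ ∫₀¹ u^ρ (1−u)^(k−1) du = ρ B(k, ρ+1), so the draws should follow the pmf given earlier. The following is a minimal sketch using the standard library; the function name is hypothetical, and the geometric draw uses inverse-transform sampling, which is one choice among several.

```python
import math
import random


def sample_yule_simon(rho: float, rng: random.Random) -> int:
    """Draw K by mixing Geometric(exp(-W)) over W ~ Exponential(rate=rho)."""
    w = rng.expovariate(rho)   # W with density rho * exp(-rho * w)
    p = math.exp(-w)           # success probability of the geometric
    if p >= 1.0:               # w == 0: the geometric is degenerate at 1
        return 1
    if p == 0.0:               # exp(-w) underflowed; only plausible for tiny rho
        raise OverflowError("W is too large for double precision")
    u = 1.0 - rng.random()     # uniform on (0, 1], so log(u) is finite
    # Inverse transform for a geometric variable on {1, 2, ...}:
    # P(K > k) = (1 - p)**k, hence K = floor(log(u) / log(1 - p)) + 1.
    return int(math.log(u) / math.log1p(-p)) + 1


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sample_yule_simon(2.0, rng) for _ in range(200_000)]
    for k in (1, 2, 3):
        # For rho = 2 the exact pmf is 4 / (k (k + 1) (k + 2)).
        print(k, draws.count(k) / len(draws), 4 / (k * (k + 1) * (k + 2)))
```

Because the distribution is heavy-tailed, the sample mean converges slowly for ρ close to 1, and the sample variance is unreliable for ρ ≤ 2, where the variance does not exist.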

Generalizations

The two-parameter generalization of the original Yule distribution replaces the beta function with an incomplete beta function. The probability mass function of the generalized Yule–Simon(ρ, α) distribution is defined as

:f(k;\rho,\alpha) = \frac{\rho}{1-\alpha^{\rho}}\,\mathrm{B}_{1-\alpha}(k, \rho+1),

with 0 ≤ α < 1. For α = 0 the ordinary Yule–Simon(ρ) distribution is obtained as a special case. The use of the incomplete beta function has the effect of introducing an exponential cutoff in the upper tail.
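As a consistency check on the normalizing constant, the generalized pmf can be summed over k. The short derivation below assumes the standard integral definition of the incomplete beta function, B_x(a, b) = ∫₀ˣ t^(a−1) (1−t)^(b−1) dt, which the text above does not state explicitly. Exchanging the sum and the integral is permitted because all terms are non-negative.

```latex
\begin{align*}
\sum_{k=1}^{\infty} f(k;\rho,\alpha)
  &= \frac{\rho}{1-\alpha^{\rho}}
     \int_0^{1-\alpha} \Bigl(\sum_{k=1}^{\infty} t^{k-1}\Bigr) (1-t)^{\rho}\,dt \\
  &= \frac{\rho}{1-\alpha^{\rho}}
     \int_0^{1-\alpha} (1-t)^{\rho-1}\,dt \\
  &= \frac{\rho}{1-\alpha^{\rho}} \cdot \frac{1-\alpha^{\rho}}{\rho} = 1 .
\end{align*}
```

Setting α = 0 makes the upper limit 1 and the prefactor ρ, which recovers ρ B(k, ρ+1).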

See also

* Beta function
* Preferential attachment

Bibliography

* Colin Rose and Murray D. Smith, "Mathematical Statistics with Mathematica". New York: Springer, 2002, ISBN 0-387-95234-9. (See page 107, where it is called the "Yule distribution".)

Wikimedia Foundation. 2010.
