- Zipf–Mandelbrot law
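Probability distribution
name = Zipf–Mandelbrot
type = mass
parameters = $N \in \{1, 2, 3, \ldots\}$ (integer)
$q \in [0, \infty)$ (real)
$s > 0$ (real)
support = $k \in \{1, 2, \ldots, N\}$
pdf = $\frac{1/(k+q)^s}{H_{N,q,s}}$, where $H_{N,q,s} = \sum_{i=1}^{N} \frac{1}{(i+q)^s}$
cdf = $\frac{H_{k,q,s}}{H_{N,q,s}}$
mean = $\frac{H_{N,q,s-1}}{H_{N,q,s}} - q$
mode = $1$
entropy = $\frac{s}{H_{N,q,s}} \sum_{k=1}^{N} \frac{\ln(k+q)}{(k+q)^s} + \ln(H_{N,q,s})$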
In probability theory and statistics, the Zipf–Mandelbrot law is a discrete probability distribution. Also known as the Pareto-Zipf law, it is a power-law distribution on ranked data, named after the linguist George Kingsley Zipf, who suggested a simpler distribution called Zipf's law, and the mathematician Benoît Mandelbrot, who subsequently generalized it.

The probability mass function for rank $k \in \{1, 2, \ldots, N\}$ is given by:
$$f(k; N, q, s) = \frac{1/(k+q)^s}{H_{N,q,s}}$$
where $H_{N,q,s}$ is given by:
$$H_{N,q,s} = \sum_{i=1}^{N} \frac{1}{(i+q)^s}$$
which may be thought of as a generalization of a harmonic number. In the limit as $N$ approaches infinity, this becomes the Hurwitz zeta function $\zeta(s, q+1)$, provided $s > 1$ so that the series converges. For finite $N$ and $q = 0$, the Zipf–Mandelbrot law becomes Zipf's law. For infinite $N$ and $q = 0$, it becomes a zeta distribution.

Applications
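For working with ranked data in practice, the probability mass function defined above can be evaluated directly. The following is a minimal Python sketch, not a reference implementation. The function names `generalized_harmonic` and `zipf_mandelbrot_pmf` and the parameter values in the example are hypothetical and were chosen only for illustration.

```python
from math import fsum


def generalized_harmonic(n: int, q: float, s: float) -> float:
    """Return H_{n,q,s} = sum_{i=1}^{n} 1 / (i + q)**s (hypothetical helper)."""
    return fsum(1.0 / (i + q) ** s for i in range(1, n + 1))


def zipf_mandelbrot_pmf(k: int, n: int, q: float, s: float) -> float:
    """Probability of rank k under the Zipf-Mandelbrot law with parameters n, q, s.

    Ranks outside the support 1..n get probability 0.
    """
    if not 1 <= k <= n:
        return 0.0
    return (1.0 / (k + q) ** s) / generalized_harmonic(n, q, s)


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from any data set.
    n, q, s = 1000, 2.7, 1.1

    # Compute the normalizer once and reuse it for the whole table of ranks.
    h = generalized_harmonic(n, q, s)
    probs = [(1.0 / (k + q) ** s) / h for k in range(1, n + 1)]

    # The probabilities over ranks 1..n should sum to 1 up to rounding error.
    print("total probability:", fsum(probs))

    # With q = 0 the law reduces to Zipf's law; for n = 3 and s = 1 the
    # normalizer is 1 + 1/2 + 1/3 = 11/6, so rank 1 has probability 6/11.
    print("Zipf special case:", zipf_mandelbrot_pmf(1, 3, 0.0, 1.0))
```

The normalizer $H_{N,q,s}$ depends only on the parameters, so when probabilities for many ranks are needed it is cheaper to compute it once, as in the table above, than to call `zipf_mandelbrot_pmf` separately for each rank.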
The distribution of words ranked by their frequency in a random text corpus generally follows a power-law distribution, known as Zipf's law.

If one plots the frequency rank of words contained in a large corpus of text data against the number of occurrences (the actual frequencies), one obtains a power-law distribution with an exponent close to one (but see Gelbukh and Sidorov 2001).

References and links
* B. Mandelbrot (1965). "Information Theory and Psycholinguistics". In B.B. Wolman and E. Nagel (eds.), Scientific psychology. Basic Books. Reprinted as:
** B. Mandelbrot (1968) [1965]. "Information Theory and Psycholinguistics". In R.C. Oldfield and J.C. Marshall (eds.), Language. Penguin Books.
* [http://arxiv.org/abs/physics/9901035 Z. K. Silagadze: Citations and the Zipf-Mandelbrot's law]
* [http://www.nist.gov/dads/HTML/zipfslaw.html NIST: Zipf's law]
* [http://www.nslij-genetics.org/wli/zipf/index.html W. Li's References on Zipf's law]
* [http://www.gelbukh.com/CV/Publications/2001/CICLing-2001-Zipf.htm Gelbukh and Sidorov 2001: Zipf and Heaps Laws’ Coefficients Depend on Language]
Wikimedia Foundation. 2010.