- Galton–Watson process
The Galton–Watson process is a stochastic process arising from Francis Galton's statistical investigation of the extinction of surnames.
History
There was concern amongst the Victorians that aristocratic surnames were becoming extinct. Galton originally posed the question regarding the probability of such an event in the Educational Times of 1873, and the Reverend Henry William Watson replied with a solution. Together, they then wrote an 1874 paper entitled "On the probability of extinction of families". Galton and Watson appear to have derived their process independently of the earlier work by I. J. Bienaymé; see Heyde and Seneta (1977). For a detailed history see Kendall (1966 and 1975).
Concepts
Assume, as was taken for granted in Galton's time, that surnames are passed on to all male children by their father. Suppose the number of a man's sons to be a random variable distributed on the set {0, 1, 2, 3, ...}. Further suppose the numbers of different men's sons to be independent random variables, all having the same distribution.
Then the simplest substantial mathematical conclusion is that if the average number of a man's sons is 1 or less, then their surname will surely die out (apart from the trivial case in which every man has exactly one son), and if it is more than 1, then there is more than zero probability that it will survive forever.
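A small worked example may make the threshold at 1 concrete. The two offspring distributions below are illustrative assumptions, not taken from Galton and Watson. Writing $p_k$ for the probability of $k$ sons and $q$ for the extinction probability of a line founded by one man, conditioning on the founder's number of sons gives $q = \sum_k p_k q^k$, because each son's line must die out independently; the extinction probability is the smallest solution in $[0, 1]$.

```latex
% Supercritical case (assumed): p_0 = 1/4, p_1 = 1/4, p_2 = 1/2
\begin{align*}
m &= 0\cdot\tfrac14 + 1\cdot\tfrac14 + 2\cdot\tfrac12 = \tfrac54 > 1, \\
q &= \tfrac14 + \tfrac14 q + \tfrac12 q^2
  \;\Longrightarrow\; 2q^2 - 3q + 1 = (2q - 1)(q - 1) = 0
  \;\Longrightarrow\; q = \tfrac12 .
\end{align*}

% Subcritical case (assumed): p_0 = 1/2, p_1 = 1/4, p_2 = 1/4
\begin{align*}
m &= 0\cdot\tfrac12 + 1\cdot\tfrac14 + 2\cdot\tfrac14 = \tfrac34 < 1, \\
q &= \tfrac12 + \tfrac14 q + \tfrac14 q^2
  \;\Longrightarrow\; q^2 - 3q + 2 = (q - 1)(q - 2) = 0
  \;\Longrightarrow\; q = 1 .
\end{align*}
```

In the first case the surname survives forever with probability 1/2; in the second the only root in $[0, 1]$ is 1, so extinction is certain.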
Modern applications include the survival probabilities for a new mutant gene, or the initiation of a nuclear chain reaction, or the dynamics of disease outbreaks in their first generations of spread, or the chances of extinction of small populations of organisms; as well as explaining (perhaps closest to Galton's original interest) why only a handful of males in the deep past of humanity now have any surviving male-line descendants, reflected in a rather small number of distinctive human Y-chromosome DNA haplogroups.
A corollary of high extinction probabilities is that if a lineage has survived, it is likely to have experienced, purely by chance, an unusually high growth rate in its early generations, at least when compared to the rest of the population.
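The corollary about surviving lineages can be illustrated with a small simulation. This is only a sketch under stated assumptions: the offspring distribution (0, 1 or 2 sons with probabilities 0.35, 0.2, 0.45, so a mean of 1.1), the choice of generation 3 as "early" and generation 12 as "late", and all function names are hypothetical and chosen for illustration.

```python
import random


def lineage_sizes(values, weights, generations, rng):
    """Return [X_0, ..., X_generations] for one lineage started by one individual."""
    sizes = [1]
    for _ in range(generations):
        n = sizes[-1]
        # Each of the n individuals draws an independent offspring count;
        # an extinct lineage (n == 0) stays at zero.
        sizes.append(sum(rng.choices(values, weights=weights, k=n)) if n else 0)
    return sizes


def early_growth_comparison(values, weights, early=3, late=12, runs=5000, seed=1):
    """Compare the mean size at generation `early` over all lineages with the
    mean over only those lineages still alive at generation `late`."""
    rng = random.Random(seed)
    paths = [lineage_sizes(values, weights, late, rng) for _ in range(runs)]
    survivors = [p for p in paths if p[late] > 0]
    mean_all = sum(p[early] for p in paths) / runs
    mean_survivors = sum(p[early] for p in survivors) / len(survivors)
    return mean_all, mean_survivors


if __name__ == "__main__":
    mean_all, mean_survivors = early_growth_comparison([0, 1, 2], [0.35, 0.2, 0.45])
    print("mean size at generation 3, all lineages:      ", mean_all)
    print("mean size at generation 3, surviving lineages:", mean_survivors)
```

Under these assumptions the surviving lineages should show a noticeably larger average size at generation 3 than the population of all lineages, which is the selection effect described above.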
Mathematical definition
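A Galton–Watson process is a stochastic process $\{X_n\}$ which evolves according to the recurrence formula $X_0 = 1$ and
: $X_{n+1} = \sum_{j=1}^{X_n} \xi_j^{(n)}$
where, for each $n$, $\{\xi_j^{(n)} : j \ge 1\}$ is a sequence of IID natural number-valued random variables (zero included). The extinction probability is given by
: $\lim_{n \to \infty} \Pr(X_n = 0)$
and is equal to one if $E\{\xi_1\} \le 1$ (apart from the trivial case $\xi_1 = 1$ almost surely) and strictly less than one if $E\{\xi_1\} > 1$.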
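The recurrence can be simulated directly. The sketch below is a minimal Monte Carlo illustration, not a definitive implementation: the finite offspring distributions, the population cap used to stop runs that have clearly escaped extinction, and all function names are assumptions made for the example.

```python
import random


def next_generation(x_n, values, weights, rng):
    """Return X_{n+1}: the sum of x_n IID offspring counts."""
    if x_n == 0:
        return 0
    return sum(rng.choices(values, weights=weights, k=x_n))


def simulate(values, weights, generations, rng, cap=500):
    """Simulate X_0 = 1, X_1, ... and return the path.

    The run stops early at extinction, or once the population exceeds
    `cap` (an assumed cut-off beyond which extinction is treated as
    negligible for a supercritical process).
    """
    path = [1]
    for _ in range(generations):
        x = next_generation(path[-1], values, weights, rng)
        path.append(x)
        if x == 0 or x > cap:
            break
    return path


def estimate_extinction(values, weights, runs=2000, generations=50, seed=0):
    """Estimate the extinction probability as the fraction of runs ending at 0."""
    rng = random.Random(seed)
    extinct = sum(
        simulate(values, weights, generations, rng)[-1] == 0 for _ in range(runs)
    )
    return extinct / runs


if __name__ == "__main__":
    # Mean 0.75 (E <= 1): extinction should be (almost) certain.
    print(estimate_extinction([0, 1, 2], [0.5, 0.25, 0.25]))
    # Mean 1.25 (E > 1): the exact extinction probability is 1/2,
    # the smallest root of q = 1/4 + q/4 + q^2/2.
    print(estimate_extinction([0, 1, 2], [0.25, 0.25, 0.5]))
```

The estimate for the second distribution should land near 1/2, with Monte Carlo error of a few percent at this number of runs.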
The process can be treated analytically using the method of probability generating functions.
If the number of children $\xi_j$ at each node follows a Poisson distribution with mean $\lambda$, a particularly simple recurrence can be found for the total extinction probability $x_n$ (the probability of extinction by generation $n$) for a process starting with a single individual at time $n = 0$, so that $x_0 = 0$:
: $x_{n+1} = e^{\lambda (x_n - 1)}$,
giving the curves plotted above.
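The Poisson recurrence is easy to iterate numerically. The following is a minimal sketch, assuming $x_0 = 0$ (the process starts with one individual, so it cannot be extinct at time 0) and using the standard fact that the Poisson generating function is $e^{\lambda(s-1)}$; the function name and the parameter name `lam` are hypothetical.

```python
import math


def poisson_extinction_by_generation(lam, generations):
    """Return [x_0, ..., x_generations], where x_n = P(X_n = 0) for a
    Galton-Watson process with Poisson(lam) offspring and X_0 = 1."""
    xs = [0.0]
    for _ in range(generations):
        # x_{n+1} = exp(lam * (x_n - 1))
        xs.append(math.exp(lam * (xs[-1] - 1.0)))
    return xs


if __name__ == "__main__":
    for lam in (0.5, 2.0):
        xs = poisson_extinction_by_generation(lam, 200)
        print(lam, xs[1], xs[5], xs[-1])
```

For a mean of 2 the sequence settles at about 0.203, the smallest root of $x = e^{2(x-1)}$; for a mean of 0.5 it climbs to 1, in line with the criterion on the mean number of children.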
Bisexual Galton–Watson process
In the (classical) Galton–Watson process defined above, only men count; that is, the reproduction can be understood as being asexual. The more natural corresponding version for (bi)sexual reproduction is the so-called 'bisexual Galton–Watson process', where only couples can reproduce. In this process, each child is supposed to be male or female, independently of each other, with a specified probability, and a so-called 'mating function' determines how many couples will form in a given generation. As before, the reproduction of different couples is considered to be independent of each other. Since the total reproduction within a generation now also depends on the mating function, there exists in general no simple necessary and sufficient condition for final extinction, as there is in the classical Galton–Watson process. However, the concept of the 'averaged reproduction mean' (Bruss (1984)) allows for a general and simple sufficient condition for final extinction: if the averaged reproduction mean per couple stays bounded and does not exceed 1 for a sufficiently large population size, then the probability of final extinction is always one.
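A simulation sketch may help make the role of the mating function concrete. Everything specific here is an assumption for illustration: the monogamous mating function min(females, males), the offspring distributions, the sex ratio of 1/2, the population cap, and the function names. None of it is taken from Bruss (1984).

```python
import random


def monogamous_mating(females, males):
    """Hypothetical mating function: each couple needs one female and one male."""
    return min(females, males)


def next_couples(couples, values, weights, p_female, mating, rng):
    """One generation: couples reproduce independently, each child is female
    with probability p_female, and the mating function forms new couples."""
    if couples == 0:
        return 0
    children = sum(rng.choices(values, weights=weights, k=couples))
    females = sum(1 for _ in range(children) if rng.random() < p_female)
    return mating(females, children - females)


def estimate_bisexual_extinction(values, weights, p_female=0.5,
                                 mating=monogamous_mating,
                                 runs=1000, generations=60, cap=500, seed=0):
    """Fraction of runs, each started from one couple, that reach zero couples.
    Runs exceeding `cap` couples are treated as having escaped extinction."""
    rng = random.Random(seed)
    extinct = 0
    for _ in range(runs):
        couples = 1
        for _ in range(generations):
            couples = next_couples(couples, values, weights, p_female, mating, rng)
            if couples == 0 or couples > cap:
                break
        extinct += couples == 0
    return extinct / runs


if __name__ == "__main__":
    # Mean 1.5 children per couple: at most 0.75 new couples per couple on average.
    print(estimate_bisexual_extinction([0, 1, 2, 3], [0.2, 0.3, 0.3, 0.2]))
    # Mean 3.8 children per couple: survival becomes possible.
    print(estimate_bisexual_extinction([0, 2, 4, 6], [0.1, 0.2, 0.4, 0.3]))
```

With monogamous mating, k couples with a mean of 1.5 children each produce on average at most 0.75k new couples, because min(females, males) never exceeds half the number of children. The mean number of new couples per couple therefore stays below 1 for every population size, which is the kind of situation the sufficient condition covers, and the first estimate should be essentially 1.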
Example
Countries that have used family names for many generations exhibit the Galton–Watson process in their low number of surviving family names:
* Korean names are the most striking example, with 250 family names, and 45% of the population sharing 3 family names.
* Chinese names are similar, with 22% of the population sharing 3 family names (numbering close to 300 million people), and the top 200 names covering 96% of the population.
By contrast:
* Dutch names have only included a family name since the Napoleonic Wars in the early 19th century, and there are over 68,000 Dutch family names.
* Thai names have only included a family name since 1920, and only a single family can use a given family name, hence there are a great number of Thai names. Further, Thai people change their family names with some frequency, complicating the analysis.
See also
* Branching process
References
* F T Bruss (1984). "A Note on Extinction Criteria for Bisexual Galton–Watson Processes". "Journal of Applied Probability" 21: 915–919.
* C C Heyde and E Seneta (1977). "I. J. Bienaymé: Statistical Theory Anticipated". Berlin, Germany.
* D G Kendall (1966). "Journal of the London Mathematical Society" 41: 385–406.
* D G Kendall (1975). "Bulletin of the London Mathematical Society" 7: 225–253.
* H W Watson and Francis Galton, "On the Probability of the Extinction of Families", "Journal of the Anthropological Institute of Great Britain", volume 4, pages 138–144, 1875.
External links
* [http://galton.org/essays/1870-1879/galton-1874-jaigi-family-extinction.pdf The original Galton–Watson paper: On the Probability of the Extinction of Families]
* [http://web.archive.org/web/20040401131411/http://www-users.york.ac.uk/~pml1/stats/gwproc.ps "Survival of a Single Mutant" by Peter M. Lee of the University of York]
Wikimedia Foundation. 2010.