Natural exponential family

In probability and statistics, the natural exponential family (NEF) is a class of probability distributions that is a special case of an exponential family (EF). Many common distributions are members of a natural exponential family, and the use of such distributions simplifies the theory and computation of generalized linear models.

1 Definition
2 Examples
3 Properties
- 3.1 Univariate case
- 3.2 Multivariate case
4 Natural exponential families with quadratic variance functions (NEF-QVF)
- 4.1 The six NEF-QVFs
- 4.2 Properties of NEF-QVF
5 References

Definition

Probability distribution function (PDF) of the univariate case (scalar domain, scalar parameter)

The natural exponential family (NEF) is a subset of the exponential family. NEF is an exponential family in which the natural parameter η and the natural statistic T(x) are both the identity. A distribution in the exponential family with parameter θ can be written with probability density function (PDF)

$f_X(x| \theta) = h(x)\ \exp\Big(\ \eta(\theta) T(x) - A(\theta)\ \Big) \,\! ,$

where $h(x)$ and $A(\theta)$ are known functions. A distribution in the natural exponential family with parameter θ can thus be written with PDF

$f_X(x| \theta) = h(x)\ \exp\Big(\ \theta x - A(\theta)\ \Big) \,\! .$

[Note that slightly different notation is used by the originator of the NEF, Carl Morris.^[1] Morris uses ω instead of η and ψ instead of A.]

Probability distribution function (PDF) of the general case (multivariate domain and/or parameter)

Suppose that $\mathbf{x} \in \mathcal{X} \subseteq \mathbb{R}^p$. Then a natural exponential family of order p has a density or mass function of the form

$f_X(\mathbf{x}|\boldsymbol\theta) = h(\mathbf{x})\ \exp\Big(\boldsymbol\theta^\top \mathbf{x} - A(\boldsymbol\theta)\ \Big) \,\! ,$

where in this case the parameter $\boldsymbol\theta \in \mathbb{R}^p .$

Moment and cumulant generating function

A member of a natural exponential family has moment generating function (MGF) of the form

$M_X(\mathbf{t}) = \exp\Big(\ A(\boldsymbol\theta + \mathbf{t}) - A(\boldsymbol\theta)\ \Big) \, .$

The cumulant generating function is by definition the logarithm of the MGF, so it is

$K_X(\mathbf{t}) = A(\boldsymbol\theta + \mathbf{t}) - A(\boldsymbol\theta) \, .$
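The MGF formula can be checked directly from the density. The following is a sketch for the continuous case (sums replace integrals in the discrete case), assuming that $\boldsymbol\theta + \mathbf{t}$ lies in the set of parameter values for which $A$ is finite. The factor $e^{\mathbf{t}^\top \mathbf{x}}$ is absorbed into the exponent, and the remaining integral equals one because it is the integral of the NEF density with parameter $\boldsymbol\theta + \mathbf{t}$:

$M_X(\mathbf{t}) = \mathrm{E}\big[e^{\mathbf{t}^\top X}\big] = \int h(\mathbf{x})\ \exp\Big(\boldsymbol\theta^\top \mathbf{x} + \mathbf{t}^\top \mathbf{x} - A(\boldsymbol\theta)\Big) \, d\mathbf{x}$

$\quad = \exp\Big(A(\boldsymbol\theta + \mathbf{t}) - A(\boldsymbol\theta)\Big) \int h(\mathbf{x})\ \exp\Big((\boldsymbol\theta + \mathbf{t})^\top \mathbf{x} - A(\boldsymbol\theta + \mathbf{t})\Big) \, d\mathbf{x} = \exp\Big(A(\boldsymbol\theta + \mathbf{t}) - A(\boldsymbol\theta)\Big) \, .$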

Examples

The five most important univariate cases are:

normal distribution with known variance
Poisson distribution
gamma distribution with known shape parameter α (or k depending on notation set used)
binomial distribution with known number of trials, n
negative binomial distribution with known $r$

These five examples – Poisson, binomial, negative binomial, normal, and gamma – are a special subset of NEF, called NEF with quadratic variance function (NEF-QVF) because the variance can be written as a quadratic function of the mean. NEF-QVF are discussed below.

Distributions such as the exponential, chi-squared, Bernoulli, and geometric distributions are special cases of the above five distributions, and many other common distributions can be related to the NEF. For example, the chi-squared distribution is a special case of the gamma distribution. The Bernoulli distribution is a binomial distribution with n = 1 trial. The geometric distribution is a negative binomial distribution with r = 1. The exponential distribution is a gamma distribution with shape parameter α = 1 (or k = 1). The Rayleigh and Weibull distributions are not themselves special cases, but each can be written in terms of an exponential distribution.

Some exponential family distributions are not NEF. The lognormal and beta distributions are in the exponential family, but not in the natural exponential family.

The parameterization of most of the above distributions differs from the one commonly used in textbooks and in the articles on the individual distributions. In the Poisson case, for example, the two parameterizations are related by $\theta = \log(\lambda)$, where λ is the mean parameter, so that the probability mass function may be written as

$f(k;\theta) = \frac{1}{k!} \exp\Big(\ \theta\ k - \exp(\theta)\ \Big) \, ,$

for $\theta \in \mathbb{R}$, so

$h(k) = \frac{1}{k!}$ and $A(\theta) = \exp(\theta) \, .$
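As a quick numerical illustration of the Poisson case, the sketch below compares the textbook mass function with the NEF form using $h(k) = 1/k!$ and $A(\theta) = \exp(\theta)$. The function names and the value λ = 3.5 are illustrative choices, not part of the source.

```python
import math


def poisson_pmf_standard(k, lam):
    """Textbook Poisson pmf: lam^k * exp(-lam) / k!."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def poisson_pmf_nef(k, theta):
    """NEF form h(k) * exp(theta * k - A(theta)),
    with h(k) = 1/k! and A(theta) = exp(theta)."""
    h = 1.0 / math.factorial(k)
    log_partition = math.exp(theta)
    return h * math.exp(theta * k - log_partition)


lam = 3.5               # illustrative mean parameter (an assumption)
theta = math.log(lam)   # natural parameter: theta = log(lambda)

# Largest discrepancy between the two forms over k = 0, ..., 19.
max_abs_diff = max(
    abs(poisson_pmf_standard(k, lam) - poisson_pmf_nef(k, theta))
    for k in range(20)
)
print(max_abs_diff)
```

The two forms should agree up to floating-point rounding, since they are the same function written in two parameterizations.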

This alternate parameterization can greatly simplify calculations in mathematical statistics. For example, in Bayesian inference, a posterior probability distribution is proportional to the product of the likelihood and the prior distribution. Normally this calculation requires writing out the probability distribution functions (PDF) and integrating; with the above parameterization, however, that calculation can be avoided. Instead, relationships between distributions can be abstracted using the properties of the NEF described below.^[2]

An example of the multivariate case is the multinomial distribution with known number of trials.

Properties

The properties of the natural exponential family can be used to simplify calculations involving these distributions.

Univariate case

1. The cumulants of an NEF can be calculated as derivatives of the NEF's cumulant generating function. The nth cumulant is the nth derivative of the cumulant generating function with respect to t, evaluated at t = 0; equivalently, it is the nth derivative of A with respect to θ.

The cumulant generating function is

$K_X(t) = A(\theta + t) - A(\theta) \, .$

The first cumulant is

$\kappa_1 = K'_X(0) = \frac{d}{d\theta} A(\theta) \, .$

The mean is the first moment and always equal to the first cumulant, so

$\mu_1' = \kappa_1 = \mathrm{E}[X] = K'_X(0) = A'(\theta) \, .$

The variance is the second central moment and is always equal to the second cumulant. The second raw moment is related to the first and second cumulants by

$\mu_2' = \kappa_2 + \kappa_1^2 \, ,$

so that

$\mathrm{Var}[X] = \mu_2' - \kappa_1^2 = \kappa_2 = K''_X(0) = A''(\theta) \, .$

The nth cumulant is

$\kappa_n = K^{(n)}_X(0) = \frac{d^n}{d\theta^n} A(\theta) \, .$

2. Natural exponential families (NEF) are closed under convolution.

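Given independent identically distributed (iid) random variables $X_1,\ldots,X_n$ with a distribution from an NEF, the sum $\sum_{i=1}^n X_i$ also has a distribution from an NEF, although not necessarily the original NEF. This follows from the properties of the cumulant generating function: the cumulant generating function of the sum is $n K_X(t) = nA(\theta + t) - nA(\theta)$, which again has the NEF form, with $nA$ in place of $A$.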

3. The variance function for random variables with an NEF distribution can be written in terms of the mean.

$\operatorname{Var}(X) = V(\mu)$

4. The first two moments of an NEF distribution uniquely specify the distribution within that family of distributions.

$X \sim \operatorname{NEF}[\mu, V(\mu)]$
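The relations $\mathrm{E}[X] = A'(\theta)$ and $\mathrm{Var}[X] = A''(\theta)$ can be checked numerically. The sketch below does so for the binomial family with a known number of trials, assuming the log-partition function $A(\theta) = n \log(1 + e^\theta)$ with $\theta = \log(p/(1-p))$. The helper names, the step size, and the values n = 10, p = 0.3 are illustrative assumptions.

```python
import math


def log_partition_binomial(theta, n):
    """A(theta) = n * log(1 + exp(theta)) for the binomial NEF with n trials."""
    return n * math.log1p(math.exp(theta))


def first_two_cumulants(A, theta, step=1e-4):
    """Approximate A'(theta) and A''(theta) by central finite differences."""
    mean = (A(theta + step) - A(theta - step)) / (2 * step)
    variance = (A(theta + step) - 2 * A(theta) + A(theta - step)) / step ** 2
    return mean, variance


n, p = 10, 0.3                      # illustrative values (assumptions)
theta = math.log(p / (1 - p))       # natural parameter of the binomial

mean, variance = first_two_cumulants(
    lambda th: log_partition_binomial(th, n), theta
)

# Compare with the textbook formulas n*p and n*p*(1-p), and with the
# variance function V(mu) = mu - mu^2 / n evaluated at the mean.
print(mean, n * p)
print(variance, n * p * (1 - p), mean - mean ** 2 / n)
```

Finite differences are used only to keep the sketch dependency-free; symbolic or automatic differentiation would serve the same purpose.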

Multivariate case

In the multivariate case, the mean vector and covariance matrix are thus:

$\mathrm{E}[X] = \nabla A(\boldsymbol\theta)\,$ and $\mathrm{Cov}[X] = \nabla \nabla^\top A(\boldsymbol\theta)\,$

where $\nabla$ is the gradient and $\nabla \nabla^\top$ is the Hessian.
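As a worked example, consider the multinomial distribution with a known number of trials n. The notation here is an assumption made for illustration: there are p + 1 categories with probabilities $\pi_0, \ldots, \pi_p$, the vector $\mathbf{x}$ collects the counts of categories 1 to p, and the natural parameters are $\theta_i = \log(\pi_i / \pi_0)$. Under this parameterization,

$A(\boldsymbol\theta) = n \log\Big(1 + \sum_{j=1}^p e^{\theta_j}\Big) \, ,$

$\frac{\partial A}{\partial \theta_i} = \frac{n\, e^{\theta_i}}{1 + \sum_{j=1}^p e^{\theta_j}} = n \pi_i \, ,$

$\frac{\partial^2 A}{\partial \theta_i\, \partial \theta_j} = n \big(\pi_i \delta_{ij} - \pi_i \pi_j\big) \, ,$

where $\delta_{ij}$ is 1 if i = j and 0 otherwise. These agree with the familiar multinomial mean vector and covariance matrix.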

Natural exponential families with quadratic variance functions (NEF-QVF)

Natural exponential families with quadratic variance functions are a special case of the natural exponential families. Six NEFs have quadratic variance functions (QVF), in which the variance of the distribution can be written as a quadratic function of the mean. These are called NEF-QVF, and their variance function has the form

$\operatorname{Var}(X) = V(\mu) = \nu_0 + \nu_1 \mu + \nu_2 \mu^2 \, .$

The properties of these distributions were first described by Carl Morris.^[3]

The six NEF-QVFs

The six NEF-QVF are written here in increasing complexity of the relationship between variance and mean.

1. The normal distribution with fixed variance $X \sim N(\mu, \sigma^2)$ is NEF-QVF because the variance is constant. The variance can be written $\operatorname{Var}(X) = V(\mu) = \sigma^2$, so the variance is a degree 0 function of the mean.

2. The Poisson distribution $X \sim \operatorname{Pois}(\mu)$ is NEF-QVF because all Poisson distributions have variance equal to the mean, $\operatorname{Var}(X) = V(\mu) = \mu$, so the variance is a linear function of the mean.

3. The gamma distribution $X \sim \operatorname{Gam}(r, \lambda)$ is NEF-QVF because the mean of the gamma distribution is $\mu = r\lambda$ and the variance of the gamma distribution is $\operatorname{Var}(X) = V(\mu) = \mu^2/r$, so the variance is a quadratic function of the mean.

4. The binomial distribution $X \sim \operatorname{Bin}(n, p)$ is NEF-QVF because the mean is $\mu = np$ and the variance is $\operatorname{Var}(X) = np(1 - p)$, which can be written in terms of the mean as $V(\mu) = -np^2 + np = -\mu^2/n + \mu$.

5. The negative binomial distribution $X \sim \operatorname{NegBin}(n, p)$ is NEF-QVF because the mean is $\mu = np/(1 - p)$ and the variance is $V(\mu) = \mu^2/n + \mu$.

6. The (not very famous) distribution generated by the generalized hyperbolic secant distribution (NEF-GHS) has $V(\mu) = \mu^2/n + n$, where the mean μ can be any real number.
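The six variance functions can be collected as coefficient triples $(\nu_0, \nu_1, \nu_2)$ of $V(\mu) = \nu_0 + \nu_1 \mu + \nu_2 \mu^2$. The following sketch tabulates them and checks three of the variance functions against the usual variance formulas. The dictionary keys, helper name, and parameter values are illustrative assumptions; the gamma parameter λ is treated as a scale parameter, consistent with the mean rλ given above.

```python
import math


def quadratic_variance(mu, nu0, nu1, nu2):
    """Evaluate V(mu) = nu0 + nu1 * mu + nu2 * mu^2."""
    return nu0 + nu1 * mu + nu2 * mu ** 2


# Illustrative parameter values (assumptions, not from the source).
sigma2 = 4.0   # known variance of the normal family
r = 2.5        # known shape of the gamma family
n = 10         # known n of the binomial, negative binomial and NEF-GHS

# Coefficient triples (nu0, nu1, nu2) read off the six items above.
coefficients = {
    "normal": (sigma2, 0.0, 0.0),
    "poisson": (0.0, 1.0, 0.0),
    "gamma": (0.0, 0.0, 1.0 / r),
    "binomial": (0.0, 1.0, -1.0 / n),
    "negative_binomial": (0.0, 1.0, 1.0 / n),
    "nef_ghs": (float(n), 0.0, 1.0 / n),
}

p = 0.3        # success probability used for the two discrete checks
scale = 2.0    # gamma scale parameter (lambda)

# Each pair holds V(mean) and the variance computed directly.
binomial_check = (
    quadratic_variance(n * p, *coefficients["binomial"]),
    n * p * (1 - p),
)
negative_binomial_check = (
    quadratic_variance(n * p / (1 - p), *coefficients["negative_binomial"]),
    n * p / (1 - p) ** 2,
)
gamma_check = (
    quadratic_variance(r * scale, *coefficients["gamma"]),
    r * scale ** 2,
)

for name, (from_v, direct) in [
    ("binomial", binomial_check),
    ("negative binomial", negative_binomial_check),
    ("gamma", gamma_check),
]:
    print(name, from_v, direct, math.isclose(from_v, direct))
```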

Properties of NEF-QVF

The properties of NEF-QVF can simplify calculations that use these distributions.

1. Natural exponential families with quadratic variance functions (NEF-QVF) are closed under convolutions of a linear transformation. That is, given independent identically distributed (iid) random variables $X_1,\ldots,X_n$ with a distribution from an NEF-QVF, a convolution of a linear transformation of these variables also has an NEF-QVF distribution, although not necessarily the original one.

Let $Y = \sum_{i=1}^n (X_i - b)/c$ be the convolution of a linear transformation of X. The mean of Y is $\mu^* = n(\mu - b)/c$. The variance of Y can be written in terms of the variance function of the original NEF-QVF. If the original NEF-QVF had variance function

$\operatorname{Var}(X) = V(\mu) = \nu_0 + \nu_1 \mu + \nu_2 \mu^2 \, ,$

then the new NEF-QVF has variance function

$\operatorname{Var}(Y) = V^*(\mu^*) = \nu^*_0 + \nu^*_1 \mu^* + \nu^*_2 (\mu^*)^2 \, ,$

where

$\nu^*_0 = nV(b)/c^2 \, ,$

$\nu^*_1 = V'(b)/c \, ,$

$\nu^*_2 = \nu_2/n \, .$

2. Let $X_1$ and $X_2$ be independent NEF with the same parameter θ and let $Y = X_1 + X_2$. Then the conditional distribution of $X_1$ given Y, $f(X_1 \mid Y)$, has quadratic variance in Y if and only if $X_1$ and $X_2$ are NEF-QVF. Examples of conditional distributions $f(X_1 \mid Y)$ are the normal, binomial, beta, hypergeometric and geometric distributions, which are not all NEF-QVF.^[1]

3. NEF-QVF have conjugate prior distributions on μ in the Pearson system of distributions (also called the Pearson distribution, although the Pearson system is actually a family of distributions rather than a single distribution). Examples of conjugate prior distributions of NEF-QVF distributions are the normal, gamma, reciprocal gamma, beta, F-, and t-distributions. Again, these conjugate priors are not all NEF-QVF.^[1]

4. If $X \mid \mu$ has an NEF-QVF distribution and μ has a conjugate prior distribution then the marginal distributions are well-known distributions.^[1]
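The coefficients in the first property can be checked numerically. Since $Y = \sum_{i=1}^n (X_i - b)/c$ has variance $nV(\mu)/c^2$ and $\mu = b + c\mu^*/n$, expanding the quadratic $V$ around $b$ gives $\nu^*_0 = nV(b)/c^2$, $\nu^*_1 = V'(b)/c$ and $\nu^*_2 = \nu_2/n$. The sketch below verifies this for a binomial variance function. The function name and all numeric values are illustrative assumptions.

```python
import math


def transformed_coefficients(nu, n, b, c):
    """Return (nu0*, nu1*, nu2*) for Y = sum_{i=1}^n (X_i - b) / c,
    where X has variance function V(mu) = nu0 + nu1*mu + nu2*mu^2."""
    nu0, nu1, nu2 = nu
    v_at_b = nu0 + nu1 * b + nu2 * b ** 2   # V(b)
    dv_at_b = nu1 + 2 * nu2 * b             # V'(b)
    return (n * v_at_b / c ** 2, dv_at_b / c, nu2 / n)


# Binomial variance function with 10 trials: V(mu) = mu - mu^2 / 10.
trials = 10
nu = (0.0, 1.0, -1.0 / trials)

n, b, c = 4, 1.0, 2.0    # number of summands, shift and scale (assumptions)
mu = 3.0                 # mean of each X_i, e.g. 10 trials with p = 0.3

# Variance of Y computed directly: n * V(mu) / c^2.
direct_variance = n * (nu[0] + nu[1] * mu + nu[2] * mu ** 2) / c ** 2

# Variance of Y from the transformed variance function at mu* = n(mu - b)/c.
mu_star = n * (mu - b) / c
nu0_star, nu1_star, nu2_star = transformed_coefficients(nu, n, b, c)
transformed_variance = nu0_star + nu1_star * mu_star + nu2_star * mu_star ** 2

print(direct_variance, transformed_variance)
```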

These properties, together with the above notation, can simplify calculations in mathematical statistics that would normally require lengthy calculation and calculus. This method is the subject of a forthcoming graduate mathematical statistics book by Carl Morris and Joe Blitzstein, who have been teaching the graduate probability and mathematical statistics course using the manuscript since 2006–07.^[2]

References

[1] Morris C. (2006) "Natural exponential families", Encyclopedia of Statistical Sciences.
[2] Course notes, Statistics 210 211, Harvard University, 2006 2007. http://www.registrar.fas.harvard.edu/Courses/Statistics.html
[3] Morris C. (1982) "Natural exponential families with quadratic variance functions". Ann. Statist., 10(1), 65–80.

Morris C. (1982) Natural exponential families with quadratic variance functions: statistical theory. Dept of mathematics, Institute of Statistics, University of Texas, Austin.

Wikimedia Foundation. 2010.

Academic Dictionaries and Encyclopedias
