- Wishart distribution

Probability distribution

*name: Wishart
*type: density
*parameters: $n > 0$ degrees of freedom (real); $\mathbf{V} > 0$, scale matrix ($p \times p$, positive definite)
*support: $\mathbf{W}$ is positive definite
*pdf: $\frac{\left|\mathbf{W}\right|^{(n-p-1)/2}}{2^{np/2}\left|\mathbf{V}\right|^{n/2}\Gamma_p(n/2)} \exp\left(-\frac{1}{2}{\rm Tr}(\mathbf{V}^{-1}\mathbf{W})\right)$
*mean: $n\,\mathbf{V}$
*mode: $(n-p-1)\mathbf{V}$ for $n \geq p+1$
*variance: $n(v_{ij}^2 + v_{ii}v_{jj})$
*characteristic function: $\Theta \mapsto \left|\mathbf{I} - 2i\,\boldsymbol{\Theta}\mathbf{V}\right|^{-n/2}$

In statistics, the **Wishart distribution**, named in honor of John Wishart, is a generalization to multiple dimensions of the chi-square distribution, or, in the case of non-integer degrees of freedom, of the gamma distribution. It is any of a family of probability distributions for nonnegative-definite matrix-valued random variables ("random matrices"). These distributions are of great importance in the estimation of covariance matrices in multivariate statistics.

**Definition**

Suppose "X" is an "n" × "p" matrix, each row of which is independently drawn from a "p"-variate normal distribution with zero mean:

:$X_{(i)} = (x_i^1, \dots, x_i^p)^T \sim N_p(0, V).$

Then the Wishart distribution is the probability distribution of the "p" × "p" random matrix

:$S = X^T X = \sum_{i=1}^{n} X_{(i)} X_{(i)}^T,$

known as the scatter matrix. One indicates that "S" has that probability distribution by writing

:$S \sim W_p(V, n).$

The positive integer "n" is the number of "degrees of freedom". Sometimes this is written "W"("V", "p", "n").
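As an illustrative sketch (not part of the original text), the construction $S = X^T X$ can be checked numerically: averaging many such scatter matrices should approach $n\mathbf{V}$, the mean of $W_p(V, n)$. The scale matrix below is a made-up example.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 3, 10

# A made-up positive-definite scale matrix V
B = rng.standard_normal((p, p))
V = B @ B.T + p * np.eye(p)

def wishart_sample(V, n, rng):
    """Draw S = X^T X, where the n rows of X are i.i.d. N_p(0, V)."""
    X = rng.multivariate_normal(np.zeros(len(V)), V, size=n)
    return X.T @ X

# The mean of W_p(V, n) is n V, so the sample average should be close to it
S_bar = np.mean([wishart_sample(V, n, rng) for _ in range(20000)], axis=0)
print(np.max(np.abs(S_bar - n * V)) / np.max(np.abs(n * V)))  # small relative deviation
```

Each draw is symmetric and, since $n \ge p$, almost surely positive definite, matching the support stated above.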

If "p" = 1 and "V" = 1 then this distribution is a

chi-square distribution with "n" degrees of freedom.**Occurrence**The Wishart distribution arises frequently in

likelihood-ratio test s in multivariate statistical analysis.It also arises in the spectral theory of random matrices.**Probability density function**The Wishart distribution can be characterized by its

probability density function , as follows.Let

**W**be a "p" × "p" symmetric matrix of random variables that ispositive definite . Let**V**be a (fixed) positive definite matrix of size "p" × "p".Then, if "n" ≥ "p", then

**W**has a Wishart distribution with "n" degrees of freedom if it has aprobability density function "f"_{W}given by:$f\_\{mathbf\; W\}(w)=frac\{\; left|w\; ight|^\{(n-p-1)/2\}\; expleft\; [\; -\; \{\; m\; trace\}(\{mathbf\; V\}^\{-1\}w/2\; )\; ight]\; \}\{2^\{np/2\}left|\{mathbf\; V\}\; ight|^\{n/2\}Gamma\_p(n/2)\}$

where Γ_{"p"}(·) is the multivariate gamma function defined as

:$\Gamma_p(n/2) = \pi^{p(p-1)/4} \prod_{j=1}^p \Gamma\left[(n+1-j)/2\right].$

In fact the above definition can be extended to any real "n" > "p" − 1.
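As a sketch (not from the original), the density and the multivariate gamma function above transcribe directly into Python; as a sanity check, for "p" = 1 and "V" = 1 the formula must reduce to the chi-square density with "n" degrees of freedom.

```python
import math
import numpy as np

def log_multigamma(a, p):
    """log Gamma_p(a), transcribing the product formula above."""
    return (p * (p - 1) / 4.0) * math.log(math.pi) + sum(
        math.lgamma(a + (1 - j) / 2.0) for j in range(1, p + 1))

def wishart_logpdf(W, V, n):
    """Log of the Wishart density f_W(w), transcribed from the formula above."""
    W = np.asarray(W, dtype=float)
    V = np.asarray(V, dtype=float)
    p = V.shape[0]
    _, logdet_W = np.linalg.slogdet(W)
    _, logdet_V = np.linalg.slogdet(V)
    return (0.5 * (n - p - 1) * logdet_W
            - 0.5 * np.trace(np.linalg.solve(V, W))   # trace(V^{-1} w) / 2
            - 0.5 * n * p * math.log(2.0)
            - 0.5 * n * logdet_V
            - log_multigamma(0.5 * n, p))

# Sanity check: for p = 1 and V = 1 the density must reduce to the
# chi-square density with n degrees of freedom.
x, n = 3.0, 5
chi2_logpdf = ((n / 2 - 1) * math.log(x) - x / 2
               - (n / 2) * math.log(2) - math.lgamma(n / 2))
print(wishart_logpdf([[x]], [[1.0]], n) - chi2_logpdf)  # ~0
```

The function names here are mine, chosen for illustration; note that the same code also accepts non-integer "n" > "p" − 1, consistent with the extension just mentioned.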

**Characteristic function**

The characteristic function of the Wishart distribution is

:$\Theta \mapsto \left|\mathbf{I} - 2i\,\boldsymbol{\Theta}\mathbf{V}\right|^{-n/2}.$

In other words,

:$\Theta \mapsto \mathcal{E}\left\{\exp\left[i \cdot {\rm trace}(\mathbf{W}\boldsymbol{\Theta})\right]\right\} = \left|\mathbf{I} - 2i\,\boldsymbol{\Theta}\mathbf{V}\right|^{-n/2},$

where $\mathcal{E}(\cdot)$ denotes expectation.
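As a hedged numerical sketch (not in the original), this expectation can be checked by Monte Carlo in the scalar case "p" = 1, where $\mathbf{W} = v\,\chi^2_n$ and the formula reduces to $(1 - 2i\theta v)^{-n/2}$; the values of "n", "v", and θ below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
n, v, theta = 6, 1.5, 0.1          # arbitrary example values; p = 1 here

# In the scalar case, W ~ W_1(v, n) is v times a chi-square with n d.o.f.
w = v * rng.chisquare(n, size=200000)

# E[exp(i * trace(W * Theta))] reduces to E[exp(i * theta * w)]
empirical = np.mean(np.exp(1j * theta * w))
exact = (1 - 2j * theta * v) ** (-n / 2)
print(abs(empirical - exact))      # Monte Carlo error, near zero
```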

Here $\Theta$ and $\mathbf{I}$ are matrices of the same size as $\mathbf{V}$ ($\mathbf{I}$ is the identity matrix), and $i$ is the square root of minus one.

**Theorem**

If $\mathbf{W}$ has a Wishart distribution with "m" degrees of freedom and variance matrix $\mathbf{V}$ (write $\mathbf{W} \sim W_p(\mathbf{V}, m)$) and $\mathbf{C}$ is a "q" × "p" matrix of rank "q", then

:$\mathbf{C}\mathbf{W}\mathbf{C}' \sim W_q\left(\mathbf{C}\mathbf{V}\mathbf{C}', m\right).$

**Corollary 1**

If $\mathbf{z}$ is a nonzero $p \times 1$ constant vector, then $\mathbf{z}'\mathbf{W}\mathbf{z} \sim \sigma_z^2 \chi_m^2$.

In this case, $\chi_m^2$ is the chi-square distribution and $\sigma_z^2 = \mathbf{z}'\mathbf{V}\mathbf{z}$ (note that $\sigma_z^2$ is a constant; it is positive because $\mathbf{V}$ is positive definite).

**Corollary 2**

Consider the case where $\mathbf{z}' = (0, \ldots, 0, 1, 0, \ldots, 0)$ (that is, the "j"th element is one and all others zero). Then corollary 1 above shows that

:$w_{jj} \sim \sigma_{jj} \chi_m^2$

gives the marginal distribution of each of the elements on the matrix's diagonal.
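As an illustrative check (not part of the original), Corollary 1 can be verified by simulation: since $\mathbf{z}'\mathbf{W}\mathbf{z} = \sum_i (\mathbf{z}'X_{(i)})^2$ is a sum of "m" squares of $N(0, \sigma_z^2)$ variables, its mean and variance should match $m\sigma_z^2$ and $2m\sigma_z^4$. The matrix $\mathbf{V}$ and vector $\mathbf{z}$ below are arbitrary examples.

```python
import numpy as np

rng = np.random.default_rng(1)
p, m, N = 3, 8, 40000
B = rng.standard_normal((p, p))
V = B @ B.T + np.eye(p)            # arbitrary positive-definite V
z = np.array([1.0, -2.0, 0.5])     # arbitrary nonzero vector
sigma2 = z @ V @ z                 # sigma_z^2 = z'Vz, positive

# z'Wz = z'X'Xz = ||Xz||^2, with the m rows of X i.i.d. N_p(0, V)
X = rng.multivariate_normal(np.zeros(p), V, size=(N, m))   # shape (N, m, p)
q = ((X @ z) ** 2).sum(axis=1)                             # N draws of z'Wz

# sigma_z^2 * chi^2_m has mean m * sigma_z^2 and variance 2 m * sigma_z^4
print(q.mean() / (m * sigma2), q.var() / (2 * m * sigma2**2))  # both near 1
```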

Noted statistician George Seber points out that the Wishart distribution is not called the "multivariate chi-square distribution" because the marginal distribution of the off-diagonal elements is not chi-square. Seber prefers to reserve the term "multivariate" for the case when all univariate marginals belong to the same family.

**Estimator of the multivariate normal distribution**

The Wishart distribution is the

probability distribution of the maximum-likelihood estimator (MLE) of the covariance matrix of a multivariate normal distribution. The derivation of the MLE is perhaps surprisingly subtle and elegant. It involves the spectral theorem and the reason why it can be better to view a scalar as the trace of a 1×1 matrix than as a mere scalar. See estimation of covariance matrices.

**Drawing values from the distribution**

The following procedure is due to Smith & Hocking [http://www.jstor.org/pss/2346290]. One can sample random "p" × "p" matrices from a "p"-variate Wishart distribution with scale matrix $\textbf{V}$ and "n" degrees of freedom (for $n \geq p$) as follows:

# Generate a random "p" × "p" lower triangular matrix $\textbf{A}$ such that:
#* $a_{ii} = (\chi^2_{n-i+1})^{1/2}$, i.e. $a_{ii}$ is the square root of a sample taken from a chi-square distribution $\chi^2_{n-i+1}$
#* $a_{ij}$, for $j < i$, is sampled from a standard normal distribution $N_1(0,1)$
# Compute the Cholesky decomposition $\textbf{V} = \textbf{L}\textbf{L}^T$.
# Compute the matrix $\textbf{X} = \textbf{L}\textbf{A}\textbf{A}^T\textbf{L}^T$. At this point, $\textbf{X}$ is a sample from the Wishart distribution $W_p(\textbf{V}, n)$.

Note that if $\textbf{V} = \textbf{I}$, the identity matrix, then the sample can be obtained directly from $\textbf{X} = \textbf{A}\textbf{A}^T$, since $\textbf{V} = \textbf{I}\textbf{I}^T$ is already a Cholesky decomposition.
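The steps above translate directly into NumPy; this is a sketch (the function name is mine, not from the source), checked against the known mean $n\textbf{V}$ of the distribution.

```python
import numpy as np

def sample_wishart(V, n, rng):
    """Draw one sample from W_p(V, n) by the procedure above."""
    p = V.shape[0]
    A = np.zeros((p, p))
    # Diagonal: square roots of chi-square variates; with rows indexed
    # from 0 here, row i gets n - (i+1) + 1 = n - i degrees of freedom
    for i in range(p):
        A[i, i] = np.sqrt(rng.chisquare(n - i))
    # Strict lower triangle: independent standard normals
    idx = np.tril_indices(p, k=-1)
    A[idx] = rng.standard_normal(len(idx[0]))
    L = np.linalg.cholesky(V)          # V = L L^T
    return L @ A @ A.T @ L.T

rng = np.random.default_rng(42)
V = np.array([[4.0, 1.0], [1.0, 2.0]])
n = 7
samples = np.array([sample_wishart(V, n, rng) for _ in range(30000)])
print(np.max(np.abs(samples.mean(axis=0) - n * V)) / np.max(n * V))  # near 0
```

This construction (often called the Bartlett decomposition) avoids generating the "n" underlying normal rows, which matters when "n" is large.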

**See also**

*Estimation of covariance matrices
*Hotelling's T-square distribution
*Inverse-Wishart distribution
