- Chi-square distribution
Probability distribution
 name = chi-square
 type = density
 parameters = k > 0 (degrees of freedom)
 support = x \in [0, +\infty)
 pdf = \frac{(1/2)^{k/2}}{\Gamma(k/2)}\, x^{k/2-1} e^{-x/2}
 cdf = \frac{\gamma(k/2,\,x/2)}{\Gamma(k/2)}
 mean = k
 median = approximately k - 2/3
 mode = k - 2 if k \ge 2
 variance = 2k
 skewness = \sqrt{8/k}
 kurtosis = 12/k (excess)
 entropy = \frac{k}{2} + \ln\!\left(2\,\Gamma(k/2)\right) + \left(1 - \frac{k}{2}\right)\psi(k/2)
 mgf = (1-2t)^{-k/2} for 2t < 1
 char = (1-2it)^{-k/2}
In probability theory and statistics, the chi-square distribution (also chi-squared or χ² distribution) is one of the most widely used theoretical probability distributions in inferential statistics, e.g., in statistical significance tests. [Abramowitz & Stegun, ch. 26, p. 940] [NIST (2006). [http://www.itl.nist.gov/div898/handbook/eda/section3/eda3666.htm Engineering Statistics Handbook - Chi-Square Distribution]] [Johnson, N. L.; Kotz, S.; Balakrishnan, N. (1994). Continuous Univariate Distributions (2nd ed., Vol. 1, Chapter 18). John Wiley and Sons. ISBN 0-471-58495-9.] [Mood, Alexander; Graybill, Franklin A.; Boes, Duane C. (1974). Introduction to the Theory of Statistics (3rd ed., pp. 241-246). McGraw-Hill. ISBN 0-07-042864-6.] It is useful because, under reasonable assumptions, easily calculated quantities can be proven to have distributions that approximate the chi-square distribution if the null hypothesis is true.

The best-known situations in which the chi-square distribution is used are the common chi-square tests for goodness of fit of an observed distribution to a theoretical one, and for the independence of two criteria of classification of qualitative data. Many other statistical tests also lead to a use of this distribution, such as Friedman's analysis of variance by ranks.

Definition
If X_1, \dots, X_k are k independent, normally distributed random variables with mean 0 and variance 1, then the random variable

:Q = \sum_{i=1}^k X_i^2

is distributed according to the chi-square distribution with k degrees of freedom. This is usually written

:Q \sim \chi^2_k.

The chi-square distribution has one parameter: k, a positive integer that specifies the number of degrees of freedom (i.e. the number of the X_i). The chi-square distribution is a special case of the gamma distribution.
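This definition is easy to illustrate by simulation: sum k squared standard normal draws and compare the sample moments with the known mean k and variance 2k. A minimal sketch using only the Python standard library:

```python
import random
import statistics

# Simulate Q = X_1^2 + ... + X_k^2 for independent standard normal X_i.
# By the definition above, Q is chi-square distributed with k degrees of
# freedom, so its sample mean should be near k and its variance near 2k.
random.seed(0)
k = 5
n = 200_000
samples = [sum(random.gauss(0.0, 1.0) ** 2 for _ in range(k)) for _ in range(n)]

mean = statistics.fmean(samples)
var = statistics.variance(samples)
print(mean, var)  # close to k = 5 and 2k = 10
```

The seed and sample size are arbitrary choices; any large n gives sample moments close to the theoretical values.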
Characteristics

Probability density function

The probability density function of the chi-square distribution is

:f(x;k) = \begin{cases} \dfrac{1}{2^{k/2}\Gamma(k/2)}\, x^{k/2-1} e^{-x/2} & \text{for } x > 0, \\ 0 & \text{for } x \le 0, \end{cases}

where \Gamma denotes the Gamma function, which has closed-form values at the half-integers.

Cumulative distribution function
Its cumulative distribution function is

:F(x;k) = \frac{\gamma(k/2,\,x/2)}{\Gamma(k/2)} = P(k/2,\,x/2),

where \gamma(s,z) is the lower incomplete Gamma function and P(s,z) is the regularized Gamma function. Tables of this distribution, usually in its cumulative form, are widely available, and the function is included in many spreadsheets and all statistical packages.

Characteristic function
The characteristic function of the chi-square distribution is

:\chi(t;k) = (1-2it)^{-k/2}.
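The density, distribution function, and characteristic function above can all be evaluated directly with the standard library; a sketch, with the lower incomplete gamma function computed from its power series (the series form is an assumption of this sketch, not stated in the article):

```python
import math

def chi2_pdf(x, k):
    """Density x^(k/2-1) e^(-x/2) / (2^(k/2) Gamma(k/2)) for x > 0."""
    if x <= 0:
        return 0.0
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))

def regularized_lower_gamma(s, x, terms=200):
    """P(s, x) via gamma(s,x) = x^s e^(-x) * sum_n x^n / (s(s+1)...(s+n))."""
    total, term = 0.0, 1.0 / s
    for n in range(terms):
        total += term
        term *= x / (s + n + 1)
    return total * x ** s * math.exp(-x) / math.gamma(s)

def chi2_cdf(x, k):
    """F(x; k) = P(k/2, x/2)."""
    return regularized_lower_gamma(k / 2, x / 2) if x > 0 else 0.0

def chi2_cf(t, k):
    """Characteristic function (1 - 2it)^(-k/2)."""
    return (1 - 2j * t) ** (-k / 2)

# For k = 2 the distribution is Exponential(1/2), so F(x; 2) = 1 - exp(-x/2).
print(chi2_cdf(2.0, 2))  # about 1 - exp(-1) ≈ 0.632
```

The k = 2 case gives a convenient closed-form check, since the chi-square distribution with 2 degrees of freedom reduces to an exponential distribution.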
Expected value and variance

If X \sim \chi^2_k, then

:\mathrm{E}(X) = k,
:\mathrm{Var}(X) = 2k.
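These two moments can also be recovered directly from the density by numerical integration; a sketch using a simple midpoint rule (the cutoff and step count are arbitrary choices of this sketch):

```python
import math

def chi2_pdf(x, k):
    """Chi-square density for x > 0."""
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))

def moment(k, power, upper=200.0, steps=400_000):
    """Midpoint-rule approximation of E(X^power); the tail past `upper`
    is negligible for moderate k."""
    h = upper / steps
    return sum(((i + 0.5) * h) ** power * chi2_pdf((i + 0.5) * h, k) * h
               for i in range(steps))

k = 7
mean = moment(k, 1)
var = moment(k, 2) - mean ** 2
print(mean, var)  # close to k = 7 and 2k = 14
```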
Median

The median of X \sim \chi^2_k is given approximately by

:k - \frac{2}{3} + \frac{4}{27k} - \frac{8}{729k^2}.
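The quality of this approximation can be checked by bisecting the cumulative distribution function for the point where it equals 1/2; a sketch (the series form of the incomplete gamma function and the bracketing interval are assumptions of this sketch). For k = 2 the exact median is 2 ln 2:

```python
import math

def regularized_lower_gamma(s, x, terms=200):
    """P(s, x) via the power series for the lower incomplete gamma function."""
    total, term = 0.0, 1.0 / s
    for n in range(terms):
        total += term
        term *= x / (s + n + 1)
    return total * x ** s * math.exp(-x) / math.gamma(s)

def chi2_median(k):
    """Bisect F(x; k) = P(k/2, x/2) for the 0.5 quantile."""
    lo, hi = 0.0, 10.0 * k + 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if regularized_lower_gamma(k / 2, mid / 2) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def median_approx(k):
    """Series approximation stated above."""
    return k - 2 / 3 + 4 / (27 * k) - 8 / (729 * k * k)

print(chi2_median(2))    # 2 ln 2 ≈ 1.3863
print(median_approx(2))  # ≈ 1.4046
```

The approximation improves quickly with k; for k = 10 it already agrees with the bisection value to about two decimal places.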
Information entropy

The information entropy is given by

:H = -\int_0^\infty f(x;k)\,\ln(f(x;k))\,dx = \frac{k}{2} + \ln\!\left(2\,\Gamma\!\left(\frac{k}{2}\right)\right) + \left(1 - \frac{k}{2}\right)\psi(k/2),

where \psi(x) is the Digamma function.

Related distributions and properties
The chi-square distribution has numerous applications in inferential statistics, for instance in chi-square tests and in estimating variances. It enters the problem of estimating the mean of a normally distributed population and the problem of estimating the slope of a regression line via its role in Student's t-distribution. It enters all analysis of variance problems via its role in the F-distribution, which is the distribution of the ratio of two independent chi-squared random variables divided by their respective degrees of freedom.

*If X \sim \chi^2_k, then as k tends to infinity, the distribution of X tends to a normal distribution with mean k and variance 2k (convergence is slow, as the skewness is \sqrt{8/k} and the excess kurtosis is 12/k).
*If X \sim \chi^2_k, then \sqrt{2X} is approximately normally distributed with mean \sqrt{2k-1} and unit variance (a result credited to R. A. Fisher).
*If X \sim \chi^2_k, then \sqrt[3]{X/k} is approximately normally distributed with mean 1 - 2/(9k) and variance 2/(9k) (Wilson and Hilferty, 1931).
*X \sim \mathrm{Exponential}(\lambda = \tfrac{1}{2}) is an exponential distribution if X \sim \chi^2_2 (with 2 degrees of freedom).
*Y \sim \chi^2_\nu is a chi-square distribution if Y = \sum_{m=1}^{\nu} X_m^2 for independent, normally distributed X_m \sim N(0,1).
*If \mathbf{z}' = [Z_1, Z_2, \dots, Z_n], where the Z_i are independent \mathrm{Normal}(0, \sigma^2) random variables (i.e. \mathbf{z} \sim N_n(\mathbf{0}, \sigma^2 \mathrm{I})), and \mathbf{A} is an n \times n idempotent matrix with rank n-k, then the quadratic form \frac{\mathbf{z}'\mathbf{A}\mathbf{z}}{\sigma^2} \sim \chi^2_{n-k}.
*If the X_i \sim N(\mu_i, 1) have nonzero means, then Y = \sum_{m=1}^k X_m^2 is drawn from a noncentral chi-square distribution.
*The chi-square distribution X \sim \chi^2_\nu is a special case of the gamma distribution, in that X \sim \Gamma(\tfrac{\nu}{2}, \theta = 2).
*Y \sim F(\nu_1, \nu_2) is an F-distribution if Y = \frac{X_1/\nu_1}{X_2/\nu_2}, where X_1 \sim \chi^2_{\nu_1} and X_2 \sim \chi^2_{\nu_2} are independent with their respective degrees of freedom.
*Y \sim \chi^2(\bar{\nu}) is a chi-square distribution if Y = \sum_{m=1}^N X_m, where the X_m \sim \chi^2(\nu_m) are independent and \bar{\nu} = \sum_{m=1}^N \nu_m.
*If X is chi-square distributed, then \sqrt{X} is chi distributed.
*In particular, if X \sim \chi^2_2 (chi-square with 2 degrees of freedom), then \sqrt{X} is Rayleigh distributed.
*If X_1, \dots, X_n are i.i.d. N(\mu, \sigma^2) random variables, then \sum_{i=1}^n (X_i - \bar{X})^2 \sim \sigma^2 \chi^2_{n-1}, where \bar{X} = \frac{1}{n}\sum_{i=1}^n X_i.
*If X \sim \mathrm{SkewLogistic}(\tfrac{1}{2}), then \log(1 + e^{-X}) \sim \chi^2_2.
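Two of the normal approximations listed above, Fisher's square-root transform and the Wilson and Hilferty cube-root transform, are easy to check by simulation; a sketch, building chi-square samples as sums of squared standard normals:

```python
import math
import random
import statistics

# Monte Carlo check of two normal approximations for chi-square variables.
random.seed(1)
k = 50
n = 50_000
chi2 = [sum(random.gauss(0.0, 1.0) ** 2 for _ in range(k)) for _ in range(n)]

# Fisher: sqrt(2X) is approximately N(sqrt(2k - 1), 1).
fisher = [math.sqrt(2 * x) for x in chi2]
mean_f, var_f = statistics.fmean(fisher), statistics.variance(fisher)
print(mean_f, var_f)  # near sqrt(99) ≈ 9.95 and 1

# Wilson and Hilferty: (X/k)^(1/3) is approximately N(1 - 2/(9k), 2/(9k)).
wh = [(x / k) ** (1 / 3) for x in chi2]
mean_w, var_w = statistics.fmean(wh), statistics.variance(wh)
print(mean_w, var_w)  # near 1 - 2/450 ≈ 0.9956 and 2/450 ≈ 0.0044
```

The choice k = 50 is arbitrary; both transforms are already very accurate there, with the Wilson and Hilferty transform generally the better of the two at small k.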
See also
*Cochran's theorem
*Inverse-chi-square distribution
*Degrees of freedom (statistics)
*Fisher's method for combining independent tests of significance
*Noncentral chi-square distribution
*Normal distribution
*Normalised Innovation Squared

External links
* [http://www.stat.yale.edu/Courses/1997-98/101/chigf.htm Course notes on Chi-Square Goodness of Fit Testing] from the Yale University Stats 101 class; the example includes hypothesis testing and parameter estimation.
* [http://faculty.vassar.edu/lowry/tabs.html#csq On-line calculator for the significance of chi-square], in Richard Lowry's statistical website at Vassar College.
* [http://www.vias.org/simulations/simusoft_distcalc.html Distribution Calculator] Calculates probabilities and critical values for the normal, t-, chi-square, and F-distributions.
* [http://www.stat.sc.edu/~west/applets/chisqdemo.html Chi-Square Calculator for critical values of Chi-Square] in R. Webster West's applet website at the University of South Carolina.
* [http://graphpad.com/quickcalcs/chisquared2.cfm Chi-Square Calculator from GraphPad]
* [http://www.medcalc.be/manual/chi-square-table.php Table of Chi-squared distribution]