Hotelling's T-square distribution

In statistics, Hotelling's T-square statistic, named for Harold Hotelling, is a generalization of Student's t statistic that is used in multivariate hypothesis testing. [H. Hotelling (1931) "The generalization of Student's ratio", Ann. Math. Statist., Vol. 2, pp 360–378.]

Hotelling's T-square statistic is defined as

:t^2=n(\mathbf{x}-\boldsymbol{\mu})'\mathbf{W}^{-1}(\mathbf{x}-\boldsymbol{\mu})

where $n$ is the number of points (see below), $\mathbf{x}$ is a column vector of $p$ elements and $\mathbf{W}$ is a $p \times p$ sample covariance matrix.

If $\mathbf{x}\sim N_p(\boldsymbol{\mu},\mathbf{V}/n)$ is a random vector with a multivariate Gaussian distribution (as is the mean of $n$ independent $N_p(\boldsymbol{\mu},\mathbf{V})$ observations) and $m\mathbf{W}\sim W_p(\mathbf{V},m)$ (independent of $\mathbf{x}$) has a Wishart distribution with the same non-singular variance matrix $\mathbf{V}$ and with $m=n-1$, then the distribution of $t^2$ is $T^2(p,m)$, Hotelling's T-square distribution with parameters $p$ and $m$. It can be shown that

:\frac{m-p+1}{pm}T^2\sim F_{p,m-p+1}

where $F_{p,m-p+1}$ denotes the F-distribution with parameters $p$ and $m-p+1$.
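The rescaling above is simple arithmetic, and a minimal sketch may help when looking up critical values in an F table. The function name `t2_to_f` and the example numbers are illustrative assumptions, not part of the original text.

```python
def t2_to_f(t2, p, m):
    """Rescale a T^2(p, m) value to an F statistic.

    Returns (f_stat, dfn, dfd), where f_stat is referred to an
    F-distribution with dfn = p and dfd = m - p + 1 degrees of freedom.
    """
    if m - p + 1 <= 0:
        raise ValueError("need m >= p so that m - p + 1 is positive")
    f_stat = (m - p + 1) / (p * m) * t2
    return f_stat, p, m - p + 1


# Hypothetical values: t^2 = 10 with p = 2 variables and m = 19.
f_stat, dfn, dfd = t2_to_f(t2=10.0, p=2, m=19)
print(f_stat, dfn, dfd)
```

The returned value can then be compared with a tabulated F quantile for the returned degrees of freedom.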

Now suppose that

:\mathbf{x}_1,\dots,\mathbf{x}_n

are $p \times 1$ column vectors whose entries are real numbers. Let

:\overline{\mathbf x}=(\mathbf{x}_1+\cdots+\mathbf{x}_n)/n

be their mean. Let the $p \times p$ positive-definite matrix

:\mathbf{W}=\sum_{i=1}^n (\mathbf{x}_i-\overline{\mathbf x})(\mathbf{x}_i-\overline{\mathbf x})'/(n-1)

be their "sample variance" matrix. (The transpose of any matrix $M$ is denoted above by $M'$.) Let $\boldsymbol{\mu}$ be some known $p \times 1$ column vector (in applications a hypothesized value of a population mean). Then Hotelling's T-square statistic is

:t^2=n(\overline{\mathbf x}-\boldsymbol{\mu})'\mathbf{W}^{-1}(\overline{\mathbf x}-\boldsymbol{\mu}).

Note that t^2 is closely related to the squared Mahalanobis distance.
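One way to make this relation explicit is the following, where the symbol $D^2$ is introduced here for illustration and denotes the squared Mahalanobis distance of the sample mean from $\boldsymbol{\mu}$, computed with the sample covariance matrix $\mathbf{W}$:

```latex
D^2 = (\overline{\mathbf x}-\boldsymbol{\mu})'\mathbf{W}^{-1}(\overline{\mathbf x}-\boldsymbol{\mu}),
\qquad
t^2 = n\,D^2 .
```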

In particular, it can be shown [K.V. Mardia, J.T. Kent, and J.M. Bibby (1979) "Multivariate Analysis", Academic Press.] that if $\mathbf{x}_1,\dots,\mathbf{x}_n\sim N_p(\boldsymbol{\mu},\mathbf{V})$ are independent, and $\overline{\mathbf x}$ and $\mathbf{W}$ are as defined above, then $(n-1)\mathbf{W}$ has a Wishart distribution with $n-1$ degrees of freedom

:(n-1)\mathbf{W} \sim W_p(\mathbf{V},n-1)

and is independent of $\overline{\mathbf x}$, and

:\overline{\mathbf x}\sim N_p(\boldsymbol{\mu},\mathbf{V}/n).

This implies that:

:t^2 = n(\overline{\mathbf x}-\boldsymbol{\mu})'\mathbf{W}^{-1}(\overline{\mathbf x}-\boldsymbol{\mu}) \sim T^2(p, n-1).
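As a rough illustration of the one-sample case, the sketch below computes the statistic from an $n \times p$ data array and converts it to a p-value through the F relationship with $m = n-1$. It assumes NumPy and SciPy are available; the function name `hotelling_one_sample` and the simulated data are hypothetical and not taken from the original text.

```python
import numpy as np
from scipy import stats


def hotelling_one_sample(X, mu0):
    """One-sample Hotelling T^2 test of H0: population mean equals mu0.

    X is an (n, p) array with one observation per row; mu0 has length p.
    Returns (t2, f_stat, p_value).
    """
    X = np.asarray(X, dtype=float)
    mu0 = np.asarray(mu0, dtype=float)
    n, p = X.shape
    if n <= p:
        raise ValueError("need n > p for an invertible sample covariance")
    xbar = X.mean(axis=0)
    # Sample covariance with divisor n - 1, matching W in the text.
    W = np.atleast_2d(np.cov(X, rowvar=False, ddof=1))
    diff = xbar - mu0
    # Solve W z = diff instead of forming the inverse explicitly.
    t2 = n * (diff @ np.linalg.solve(W, diff))
    m = n - 1
    f_stat = (m - p + 1) / (p * m) * t2
    p_value = stats.f.sf(f_stat, p, m - p + 1)
    return float(t2), float(f_stat), float(p_value)


# Simulated example data (an assumption made for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
print(hotelling_one_sample(X, mu0=np.zeros(3)))
```

For $p = 1$ the statistic reduces to the square of Student's t statistic, which gives a convenient sanity check.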

Hotelling's two-sample T-square statistic

If $\mathbf{x}_1,\dots,\mathbf{x}_{n_x}\sim N_p(\boldsymbol{\mu},\mathbf{V})$ and $\mathbf{y}_1,\dots,\mathbf{y}_{n_y}\sim N_p(\boldsymbol{\mu}_Y,\mathbf{V})$, with the samples independently drawn from two independent multivariate normal distributions with the same covariance and, under the null hypothesis, the same mean ($\boldsymbol{\mu}=\boldsymbol{\mu}_Y$), and we define

:\overline{\mathbf x}=\frac{1}{n_x}\sum_{i=1}^{n_x} \mathbf{x}_i \qquad \overline{\mathbf y}=\frac{1}{n_y}\sum_{i=1}^{n_y} \mathbf{y}_i

as the sample means, and

:\mathbf{W}= \frac{\sum_{i=1}^{n_x}(\mathbf{x}_i-\overline{\mathbf x})(\mathbf{x}_i-\overline{\mathbf x})'+\sum_{i=1}^{n_y}(\mathbf{y}_i-\overline{\mathbf y})(\mathbf{y}_i-\overline{\mathbf y})'}{n_x+n_y-2}

as the unbiased pooled covariance matrix estimate, then Hotelling's two-sample T-square statistic is

:t^2 = \frac{n_x n_y}{n_x+n_y}(\overline{\mathbf x}-\overline{\mathbf y})'\mathbf{W}^{-1}(\overline{\mathbf x}-\overline{\mathbf y})\sim T^2(p, n_x+n_y-2)

and it can be related to the F-distribution by

:\frac{n_x+n_y-p-1}{(n_x+n_y-2)p}t^2 \sim F(p,n_x+n_y-1-p).
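The two-sample computation can be sketched in the same way: pool the within-sample scatter matrices, form the statistic, and rescale it to an F value. This is a minimal sketch assuming NumPy and SciPy; the function name `hotelling_two_sample` and the simulated samples are hypothetical.

```python
import numpy as np
from scipy import stats


def hotelling_two_sample(X, Y):
    """Two-sample Hotelling T^2 test of equal means, assuming equal covariance.

    X is (n_x, p) and Y is (n_y, p), one observation per row.
    Returns (t2, f_stat, p_value).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    nx, p = X.shape
    ny, p_y = Y.shape
    if p != p_y:
        raise ValueError("X and Y must have the same number of columns")
    dof = nx + ny - 2
    if dof - p + 1 <= 0:
        raise ValueError("need n_x + n_y - 1 > p")
    xbar, ybar = X.mean(axis=0), Y.mean(axis=0)
    dx, dy = X - xbar, Y - ybar
    # Unbiased pooled covariance estimate, as in the text.
    W = (dx.T @ dx + dy.T @ dy) / dof
    diff = xbar - ybar
    t2 = (nx * ny / (nx + ny)) * (diff @ np.linalg.solve(W, diff))
    f_stat = (nx + ny - p - 1) / (dof * p) * t2
    p_value = stats.f.sf(f_stat, p, nx + ny - 1 - p)
    return float(t2), float(f_stat), float(p_value)


# Simulated samples with a shifted mean (illustrative assumption only).
rng = np.random.default_rng(1)
X = rng.normal(size=(25, 2))
Y = rng.normal(loc=0.5, size=(20, 2))
print(hotelling_two_sample(X, Y))
```

With $p = 1$ the result agrees with the square of the pooled-variance two-sample t statistic.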

The non-null distribution of this statistic is the noncentral F-distribution (the ratio of a non-central chi-square random variable and an independent central chi-square random variable, each divided by its degrees of freedom)

:\frac{n_x+n_y-p-1}{(n_x+n_y-2)p}t^2 \sim F(p,n_x+n_y-1-p;\delta),

with

:\delta = \frac{n_x n_y}{n_x+n_y}\boldsymbol{\nu}'\mathbf{V}^{-1}\boldsymbol{\nu},

where $\boldsymbol{\nu}$ is the difference vector between the population means.
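The non-null distribution can be used to approximate the power of the two-sample test for a given mean difference. The sketch below is one possible reading, assuming SciPy's `scipy.stats.ncf` for the noncentral F-distribution and a known covariance matrix; the function name `two_sample_power`, the significance level and the example inputs are all hypothetical.

```python
import numpy as np
from scipy import stats


def two_sample_power(nu, V, nx, ny, alpha=0.05):
    """Power of the two-sample T^2 test for a mean difference nu.

    nu is the length-p difference between population means and V the
    common (p, p) covariance matrix. Returns (delta, power).
    """
    nu = np.atleast_1d(np.asarray(nu, dtype=float))
    V = np.atleast_2d(np.asarray(V, dtype=float))
    p = nu.size
    dfd = nx + ny - 1 - p
    if dfd <= 0:
        raise ValueError("need n_x + n_y - 1 > p")
    # Noncentrality parameter delta from the text.
    delta = (nx * ny / (nx + ny)) * (nu @ np.linalg.solve(V, nu))
    # Critical value of the central F under the null hypothesis.
    f_crit = stats.f.ppf(1.0 - alpha, p, dfd)
    # Probability of exceeding it under the noncentral F.
    power = stats.ncf.sf(f_crit, p, dfd, delta)
    return float(delta), float(power)


# Hypothetical inputs: shift of 0.5 in the first of two coordinates.
delta, power = two_sample_power([0.5, 0.0], np.eye(2), nx=20, ny=20)
print(delta, power)
```

Larger mean differences or larger samples increase $\delta$ and therefore the power.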

See also

* Student's t-distribution (the univariate equivalent)
* F-distribution (commonly tabulated or available in software libraries, and hence used for testing the T-square statistic using the relationship given above)
* Wilks' lambda distribution (in multivariate statistics Wilks' lambda is to Hotelling's $T^2$ as Snedecor's $F$ is to Student's $t$ in univariate statistics).

References


Wikimedia Foundation. 2010.
