D'Agostino's K-squared test

In statistics, D’Agostino’s K² test is a goodness-of-fit measure of departure from normality; that is, the test aims to establish whether or not a given sample comes from a normally distributed population. The test is based on transformations of the sample kurtosis and skewness, and has power only against alternatives in which the distribution is skewed and/or kurtic.

Skewness and kurtosis

In the following, let { xi } denote a sample of n observations, g1 and g2 the sample skewness and kurtosis, mj the j-th sample central moment, and \bar{x} the sample mean. (Note that in the literature on normality testing the skewness and kurtosis are quite frequently denoted √β1 and β2 respectively. Such notation is less convenient since, for example, √β1 can be a negative quantity.)

The sample skewness and kurtosis are defined as

\begin{align}
    & g_1 = \frac{ m_3 }{ m_2^{3/2} } = \frac{\frac{1}{n} \sum_{i=1}^n \left( x_i - \bar{x} \right)^3}{\left( \frac{1}{n} \sum_{i=1}^n \left( x_i - \bar{x} \right)^2 \right)^{3/2}}\ , \\
    & g_2 = \frac{ m_4 }{ m_2^{2} }-3 = \frac{\frac{1}{n} \sum_{i=1}^n \left( x_i - \bar{x} \right)^4}{\left( \frac{1}{n} \sum_{i=1}^n \left( x_i - \bar{x} \right)^2 \right)^2} - 3\ .
  \end{align}
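
As a minimal sketch of these definitions in Python (the function name `sample_skew_kurtosis` is illustrative and not taken from any library), both quantities can be computed directly from the central moments, each normalised by n:

```python
def sample_skew_kurtosis(xs):
    """Return (g1, g2): sample skewness and sample kurtosis as defined above."""
    n = len(xs)
    if n == 0:
        raise ValueError("sample is empty")
    mean = sum(xs) / n
    # j-th sample central moments m_j, each divided by n (not n - 1)
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    if m2 == 0:
        raise ValueError("all observations are equal; g1 and g2 are undefined")
    g1 = m3 / m2 ** 1.5
    g2 = m4 / m2 ** 2 - 3.0  # the -3 makes g2 zero for a normal population
    return g1, g2
```

For the sample 1, 2, 3, 4, 5 this gives g1 = 0 (the sample is symmetric) and g2 = 6.8/4 - 3 = -1.3.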

These quantities consistently estimate the theoretical skewness and kurtosis of the distribution. Moreover, if the sample indeed comes from a normal population, then the exact finite-sample distributions of the skewness and kurtosis can themselves be analysed in terms of their means μ1, variances μ2, skewnesses γ1, and kurtoses γ2. This was done by Pearson (1931), who derived the following expressions:

\begin{align}
    & \mu_1(g_1) = 0, \\
    & \mu_2(g_1) = \frac{ 6(n-2) }{ (n+1)(n+3) }, \\
    & \gamma_1(g_1) \equiv \frac{\mu_3(g_1)}{\mu_2(g_1)^{3/2}} = 0, \\
    & \gamma_2(g_1) \equiv \frac{\mu_4(g_1)}{\mu_2(g_1)^{2}}-3 = \frac{ 36(n-7)(n^2+2n-5) }{ (n-2)(n+5)(n+7)(n+9) }.
  \end{align}

and

\begin{align}
    & \mu_1(g_2) = - \frac{6}{n+1}, \\
    & \mu_2(g_2) = \frac{ 24n(n-2)(n-3) }{ (n+1)^2(n+3)(n+5) }, \\
    & \gamma_1(g_2) \equiv \frac{\mu_3(g_2)}{\mu_2(g_2)^{3/2}} = \frac{6(n^2-5n+2)}{(n+7)(n+9)} \sqrt{\frac{6(n+3)(n+5)}{n(n-2)(n-3)}}, \\
    & \gamma_2(g_2) \equiv \frac{\mu_4(g_2)}{\mu_2(g_2)^{2}}-3 = \frac{ 36(15n^6-36n^5-628n^4+982n^3+5777n^2-6402n+900) }{ n(n-3)(n-2)(n+7)(n+9)(n+11)(n+13) }.
  \end{align}

For example, a sample with size n = 1000 drawn from a normally distributed population can be expected to have a skewness of 0 ± 0.08 and a kurtosis of 0 ± 0.15, where the ± indicates the standard deviation.
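
The expressions above can be evaluated numerically to check the n = 1000 example. The following is a sketch only; the function names `pearson_moments_g1` and `pearson_moments_g2` are hypothetical, and the formulas are transcribed from this section (they assume n > 3).

```python
import math

def pearson_moments_g1(n):
    """(mean, variance, skewness, kurtosis) of g1 under normality."""
    mu2 = 6.0 * (n - 2) / ((n + 1) * (n + 3))
    gamma2 = (36.0 * (n - 7) * (n * n + 2 * n - 5)
              / ((n - 2) * (n + 5) * (n + 7) * (n + 9)))
    return 0.0, mu2, 0.0, gamma2

def pearson_moments_g2(n):
    """(mean, variance, skewness, kurtosis) of g2 under normality."""
    mu1 = -6.0 / (n + 1)
    mu2 = 24.0 * n * (n - 2) * (n - 3) / ((n + 1) ** 2 * (n + 3) * (n + 5))
    gamma1 = (6.0 * (n * n - 5 * n + 2) / ((n + 7) * (n + 9))
              * math.sqrt(6.0 * (n + 3) * (n + 5) / (n * (n - 2) * (n - 3))))
    poly = (15 * n**6 - 36 * n**5 - 628 * n**4 + 982 * n**3
            + 5777 * n**2 - 6402 * n + 900)
    gamma2 = 36.0 * poly / (n * (n - 3) * (n - 2) * (n + 7) * (n + 9)
                            * (n + 11) * (n + 13))
    return mu1, mu2, gamma1, gamma2

# Standard deviations of g1 and g2 for n = 1000
sd_g1 = math.sqrt(pearson_moments_g1(1000)[1])
sd_g2 = math.sqrt(pearson_moments_g2(1000)[1])
print(round(sd_g1, 2), round(sd_g2, 2))  # prints: 0.08 0.15
```

The mean of g2 at n = 1000 is -6/1001, about -0.006, which is why the example quotes it as 0 to two decimal places.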

Transformed sample skewness and kurtosis

The sample skewness g1 and kurtosis g2 are both asymptotically normal. However, their rate of convergence to the limiting distribution is frustratingly slow, especially for g2. For example, even with n = 5000 observations, Pearson's expressions above give the sample kurtosis g2 a skewness of about 0.2 and a kurtosis of about 0.1, which is not negligible. To remedy this, it has been suggested to transform the quantities g1 and g2 in a way that makes their distributions as close to standard normal as possible.

In particular, D’Agostino (1970) suggested the following transformation for sample skewness:


    Z_1(g_1) = \delta\cdot \ln\!\left( \frac{g_1}{\alpha\sqrt{\mu_2}} + \sqrt{\frac{g_1^2}{\alpha^2\mu_2} + 1}\right),

where constants α and δ are computed as

\begin{align}
    & W^2 = \sqrt{2\gamma_2 + 4} - 1, \\
    & \delta = 1 / \sqrt{\ln W}, \\
    & \alpha^2 = 2 / (W^2-1),
  \end{align}

and where μ2 = μ2(g1) is the variance of g1 and γ2 = γ2(g1) is its kurtosis, both given by the expressions in the previous section.
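
A minimal Python sketch of this transformation follows. The function name `transform_skewness` is hypothetical, and the caller is assumed to supply μ2(g1) and γ2(g1) from Pearson's expressions. The logarithm in the formula, ln(y + √(y² + 1)), is the inverse hyperbolic sine of y, so the sketch uses `math.asinh`.

```python
import math

def transform_skewness(g1, mu2, gamma2):
    """Z1(g1), given the variance mu2 and kurtosis gamma2 of g1 under normality."""
    w2 = math.sqrt(2.0 * gamma2 + 4.0) - 1.0  # this is W^2
    if w2 <= 1.0:
        # ln W would be <= 0; with Pearson's gamma2(g1) this happens for n <= 7
        raise ValueError("transformation requires gamma2 > 0")
    delta = 1.0 / math.sqrt(0.5 * math.log(w2))  # ln W = (1/2) ln W^2
    alpha = math.sqrt(2.0 / (w2 - 1.0))
    # asinh(y) = ln(y + sqrt(y^2 + 1))
    return delta * math.asinh(g1 / (alpha * math.sqrt(mu2)))
```

Because asinh is odd, Z1 is zero at g1 = 0 and changes sign with g1. For large n the transformation is close to the plain standardisation g1/√μ2.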

Similarly, Anscombe & Glynn (1983) suggested a transformation for g2, which works reasonably well for sample sizes of 20 or greater:


    Z_2(g_2) = \sqrt{\frac{9A}{2}} \left\{1 - \frac{2}{9A} - \left(\frac{ 1-2/A }{ 1+\frac{g_2-\mu_1}{\sqrt{\mu_2}}\sqrt{2/(A-4)} }\right)^{\!1/3}\right\},

where


    A = 6 + \frac{8}{\gamma_1} \left( \frac{2}{\gamma_1} + \sqrt{1+4/\gamma_1^2}\right),

and μ1 = μ1(g2), μ2 = μ2(g2), γ1 = γ1(g2) are the quantities computed by Pearson.
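
The kurtosis transformation can be sketched the same way. The function name `transform_kurtosis` is hypothetical; μ1, μ2 and γ1 are assumed to be supplied from Pearson's expressions for g2, with γ1 > 0. The formula does not say how to treat a negative or zero denominator, so taking the real cube root and returning NaN are choices made for this sketch.

```python
import math

def transform_kurtosis(g2, mu1, mu2, gamma1):
    """Z2(g2), given mean mu1, variance mu2 and skewness gamma1 (> 0) of g2."""
    a = 6.0 + 8.0 / gamma1 * (2.0 / gamma1 + math.sqrt(1.0 + 4.0 / gamma1 ** 2))
    x = (g2 - mu1) / math.sqrt(mu2)  # standardised sample kurtosis
    denom = 1.0 + x * math.sqrt(2.0 / (a - 4.0))
    if denom == 0.0:
        return float("nan")  # assumption of this sketch: undefined at the pole
    ratio = (1.0 - 2.0 / a) / denom
    # real cube root, keeping the sign of the ratio (assumption of this sketch)
    cube_root = math.copysign(abs(ratio) ** (1.0 / 3.0), ratio)
    return math.sqrt(4.5 * a) * (1.0 - 2.0 / (9.0 * a) - cube_root)
```

While the denominator stays positive, Z2 increases with g2. Z2 is slightly positive at g2 = μ1, because the distribution of g2 is right-skewed and its median lies below its mean.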

Omnibus K² statistic

Statistics Z1 and Z2 can be combined to produce an omnibus test, able to detect deviations from normality due to either skewness or kurtosis (D’Agostino, Belanger & D’Agostino 1990):


    K^2 = Z_1(g_1)^2 + Z_2(g_2)^2\,

If the null hypothesis of normality is true, then K² is approximately χ²-distributed with 2 degrees of freedom.
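
As a sketch of how the statistic would be used, the approximate p-value follows from the χ² distribution with 2 degrees of freedom, whose upper-tail probability at x is exp(-x/2). The function name `k_squared` is hypothetical, and Z1 and Z2 are assumed to have been computed already.

```python
import math

def k_squared(z1, z2):
    """Return (K^2, approximate p-value) from the transformed statistics."""
    k2 = z1 * z1 + z2 * z2
    # Upper-tail probability of a chi-squared variable with 2 degrees of freedom
    p_value = math.exp(-k2 / 2.0)
    return k2, p_value

k2, p = k_squared(1.0, 2.0)  # k2 is 5.0, p is exp(-2.5), about 0.082
```

In practice a library routine would normally be used instead; SciPy's `scipy.stats.normaltest`, for instance, is documented as being based on D'Agostino and Pearson's omnibus test.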

Note that the statistics g1 and g2 are not independent, only uncorrelated. Their transforms Z1 and Z2 are therefore also dependent (Shenton & Bowman 1977), which makes the validity of the χ² approximation questionable. Simulations show that under the null hypothesis the K² test statistic is characterized by

                      expected value   standard deviation   95% quantile
n = 20                1.971            2.339                6.373
n = 50                2.017            2.308                6.339
n = 100               2.026            2.267                6.271
n = 250               2.012            2.174                6.129
n = 500               2.009            2.113                6.063
n = 1000              2.000            2.062                6.038
χ²(2) distribution    2.000            2.000                5.991
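
The χ²(2) row can be reproduced in closed form, since a χ² variable with 2 degrees of freedom is exponential with mean 2. The sketch below (hypothetical function names) also evaluates the χ²(2) tail probability at each simulated 95% quantile from the table. Values below 0.05 indicate that the simulated upper tail is heavier than the χ²(2) tail, so a test using the cutoff 5.991 would reject somewhat more often than 5% under the null hypothesis.

```python
import math

def chi2_2df_quantile(p):
    """Quantile of the chi-squared distribution with 2 degrees of freedom."""
    return -2.0 * math.log(1.0 - p)

def chi2_2df_tail(x):
    """Upper-tail probability P(X > x) for chi-squared with 2 degrees of freedom."""
    return math.exp(-x / 2.0)

# Simulated 95% quantiles of K^2, copied from the table above
simulated_q95 = {20: 6.373, 50: 6.339, 100: 6.271,
                 250: 6.129, 500: 6.063, 1000: 6.038}

print(round(chi2_2df_quantile(0.95), 3))  # prints: 5.991
for n, q in simulated_q95.items():
    print(n, round(chi2_2df_tail(q), 4))
```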

Wikimedia Foundation. 2010.
