D'Agostino's K-squared test

From Wikipedia, the free encyclopedia

In statistics, D'Agostino's K^2 test is a goodness-of-fit measure of departure from normality, based on transformations of the sample kurtosis and skewness. The test statistic K^2 is obtained as follows:

In the following derivation, n is the number of observations (or degrees of freedom in general), \sqrt{b_1} is the sample skewness, and b_2 is the sample kurtosis, defined as


\sqrt{ b_1 } = \frac{ \mu_3 }{ \sigma^3 } = \frac{ \mu_3 }{ \left( \sigma^2 \right)^{3/2} } = \frac{ \frac{1}{n} \sum_{i=1}^n \left( x_i - \bar{x} \right)^3}{ \left( \frac{1}{n} \sum_{i=1}^n \left( x_i - \bar{x} \right)^2 \right)^{3/2}}

b_2 = \frac{ \mu_4 }{ \sigma^4 } = \frac{ \mu_4 }{ \left( \sigma^2 \right)^{2} } = \frac{\frac{1}{n} \sum_{i=1}^n \left( x_i - \bar{x} \right)^4}{\left( \frac{1}{n} \sum_{i=1}^n \left( x_i - \bar{x} \right)^2 \right)^2}

where \mu_3 and \mu_4 are the third and fourth sample central moments, respectively, \bar{x} is the sample mean, and \sigma^2 is the second sample central moment, the variance (computed with divisor n, as in the sums above).
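
As an illustration of the two definitions above, the following is a minimal sketch in plain Python. The function name sample_skewness_kurtosis is hypothetical and not part of any library; it assumes a non-constant sequence of numbers, so that the second central moment is nonzero.

```python
def sample_skewness_kurtosis(xs):
    """Return (sqrt_b1, b2) from biased (divisor n) central moments.

    Hypothetical helper for illustration; assumes xs is not constant.
    """
    n = len(xs)
    mean = sum(xs) / n
    # Second, third and fourth central moments, each with divisor n.
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    sqrt_b1 = m3 / m2 ** 1.5  # sample skewness
    b2 = m4 / m2 ** 2         # sample kurtosis (not excess kurtosis)
    return sqrt_b1, b2


sqrt_b1, b2 = sample_skewness_kurtosis([1, 2, 3, 4, 5])
```

For the symmetric sample 1, 2, 3, 4, 5 the central moments are 2, 0 and 6.8, so the skewness is 0 and the kurtosis is 6.8 / 4 = 1.7.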


Transformed Skewness

First, calculate Z\left(\sqrt{b_1}\right), a transformation of the skewness \sqrt{b_1} that is approximately normally distributed under the null hypothesis that the data are normally distributed.


Y = \sqrt{b_1} \cdot \sqrt{\frac{(n+1)(n+3)}{6(n-2)}}

\beta_2\left(\sqrt{b_1}\right) = \frac{3(n^2+27n-70)(n+1)(n+3)}{(n-2)(n+5)(n+7)(n+9)}

W^2 = -1 + \sqrt{2 (\beta_2\left(\sqrt{b_1}\right) - 1)}

\delta = 1/\sqrt{\ln W}

\alpha = \sqrt{\frac{2}{W^2-1}}

Z\left(\sqrt{b_1}\right) = \delta \ln\left(Y/\alpha + \sqrt{(Y/\alpha)^2 + 1}\right)
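
The skewness transformation above can be sketched in plain Python as follows. The function name z_skewness is hypothetical, and the sketch assumes n is large enough that W^2 > 1 (otherwise the logarithm and \alpha are undefined); no input validation is attempted. It uses the identity \ln(u + \sqrt{u^2 + 1}) = \operatorname{asinh}(u).

```python
import math


def z_skewness(sqrt_b1, n):
    """Transform the sample skewness sqrt_b1 (from n observations) to Z(sqrt_b1).

    Hypothetical helper for illustration; assumes W^2 > 1.
    """
    y = sqrt_b1 * math.sqrt((n + 1) * (n + 3) / (6.0 * (n - 2)))
    beta2 = (3.0 * (n ** 2 + 27 * n - 70) * (n + 1) * (n + 3)
             / ((n - 2) * (n + 5) * (n + 7) * (n + 9)))
    w2 = -1.0 + math.sqrt(2.0 * (beta2 - 1.0))  # this is W^2
    # ln(W) = 0.5 * ln(W^2)
    delta = 1.0 / math.sqrt(0.5 * math.log(w2))
    alpha = math.sqrt(2.0 / (w2 - 1.0))
    # asinh(u) equals ln(u + sqrt(u^2 + 1))
    return delta * math.asinh(y / alpha)
```

Because asinh is an odd function, a skewness of zero maps to Z = 0, and skewness values of equal magnitude and opposite sign map to Z values of equal magnitude and opposite sign.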

Transformed Kurtosis

Next, calculate Z\left(b_2\right), a transformation of the kurtosis b_2 that is approximately normally distributed under the null hypothesis that the data are normally distributed.


E\left(b_2\right) = \frac{3(n-1)}{n+1}

\sigma^2_{b_2} = \frac{24n(n-2)(n-3)}{(n+1)^2(n+3)(n+5)}

x = \frac{b_2 - E\left(b_2\right)}{\sigma_{b_2}}

Next, compute the skewness of the sampling distribution of the kurtosis b_2 (its third standardized moment):


\sqrt{\beta_1\left(b_2\right)} = \frac{6(n^2-5n+2)}{(n+7)(n+9)} \sqrt{\frac{6(n+3)(n+5)}{n(n-2)(n-3)}}

A = 6 + \frac{8}{\sqrt{\beta_1\left(b_2\right)}} \left[ \frac{2}{\sqrt{\beta_1\left(b_2\right)}} + \sqrt{1+\frac{4}{\beta_1\left(b_2\right)}}\right]

Z\left(b_2\right) = \left(\left(1 - \frac{2}{9A}\right) - \sqrt[3]{\frac{1-2/A}{1+x\sqrt{2/(A-4)}}}\right)\sqrt{\frac{9A}{2}}
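
The kurtosis transformation above can be sketched in plain Python as follows. The function name z_kurtosis is hypothetical. The sketch takes a real (signed) cube root so that a negative ratio does not produce a complex number, and it does not handle the degenerate case where the denominator 1 + x\sqrt{2/(A-4)} is exactly zero; both choices are assumptions of this example rather than part of the derivation.

```python
import math


def z_kurtosis(b2, n):
    """Transform the sample kurtosis b2 (from n observations) to Z(b2).

    Hypothetical helper for illustration; a zero denominator is not handled.
    """
    mean_b2 = 3.0 * (n - 1) / (n + 1)
    var_b2 = (24.0 * n * (n - 2) * (n - 3)
              / ((n + 1) ** 2 * (n + 3) * (n + 5)))
    x = (b2 - mean_b2) / math.sqrt(var_b2)  # standardized kurtosis
    sqrt_beta1 = (6.0 * (n ** 2 - 5 * n + 2) / ((n + 7) * (n + 9))
                  * math.sqrt(6.0 * (n + 3) * (n + 5)
                              / (n * (n - 2) * (n - 3))))
    a = 6.0 + (8.0 / sqrt_beta1) * (
        2.0 / sqrt_beta1 + math.sqrt(1.0 + 4.0 / sqrt_beta1 ** 2))
    ratio = (1.0 - 2.0 / a) / (1.0 + x * math.sqrt(2.0 / (a - 4.0)))
    # Real cube root that keeps the sign of ratio.
    cube_root = math.copysign(abs(ratio) ** (1.0 / 3.0), ratio)
    return ((1.0 - 2.0 / (9.0 * a)) - cube_root) * math.sqrt(9.0 * a / 2.0)
```

Larger values of b_2 give larger values of Z(b_2), since the denominator inside the cube root grows with x.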

Omnibus K^2 statistic

Now, we can combine Z\left(\sqrt{b_1}\right) and Z\left(b_2\right) to define D'Agostino's omnibus K^2 test for normality.


K^2 = \left(Z\left(\sqrt{b_1}\right)\right)^2 + \left(Z\left(b_2\right)\right)^2

Under the null hypothesis of normality, K^2 is approximately distributed as \chi^2 with 2 degrees of freedom.
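
As a minimal sketch of the final step, the two transformed statistics can be combined and referred to the \chi^2 distribution with 2 degrees of freedom. The function name k_squared is hypothetical, and its two arguments are assumed to be already computed values of Z(\sqrt{b_1}) and Z(b_2). For 2 degrees of freedom the upper tail probability of \chi^2 has the closed form \exp(-K^2/2), so no statistics library is needed.

```python
import math


def k_squared(z_skew, z_kurt):
    """Return (K^2, p_value) from the two transformed statistics.

    Hypothetical helper for illustration. The p-value is the upper tail
    of a chi-squared distribution with 2 degrees of freedom, exp(-K^2 / 2).
    """
    k2 = z_skew ** 2 + z_kurt ** 2
    p_value = math.exp(-k2 / 2.0)
    return k2, p_value


k2, p = k_squared(1.0, 2.0)  # k2 is 5.0, p is exp(-2.5)
```

A small p-value indicates that the observed skewness and kurtosis together are unlikely under normality.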

References

  • D'Agostino, Ralph B., Albert Belanger, and Ralph B. D'Agostino, Jr. "A Suggestion for Using Powerful and Informative Tests of Normality", The American Statistician, Vol. 44, No. 4. (Nov., 1990), pp. 316-321.