Monday, March 31, 2008

How useful are normality tests?



source: graphpad.com


Many data analysis methods (t test, ANOVA, regression) depend on the assumption that data were sampled from a Gaussian distribution. The best way to evaluate how far your data are from Gaussian is to look at a graph and see if the distribution deviates grossly from a bell-shaped normal distribution.
How useful are statistical methods to test for normality? Less useful than you’d guess. Consider these potential problems:
  • Small samples almost always pass a normality test. Normality tests have little power to tell whether or not a small sample of data comes from a Gaussian distribution.
  • With large samples, minor deviations from normality may be flagged as statistically significant, even though small deviations from a normal distribution won’t affect the results of a t test or ANOVA.
  • Decisions about when to use parametric vs. nonparametric tests should usually be made to cover an entire series of analyses. It is rarely appropriate to make the decision based on a normality test of one data set.
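Both of the first two problems are easy to see in a quick simulation. The sketch below (my own illustration, not from the article) uses SciPy's Shapiro-Wilk test; the sample sizes and distributions are arbitrary choices made for the demonstration.

```python
# Illustrative simulation of two weaknesses of normality tests.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# 1) Small samples: clearly non-Gaussian (exponential) data
#    usually "passes" (p > 0.05) when n is only 10.
small_pass = sum(
    stats.shapiro(rng.exponential(size=10)).pvalue > 0.05
    for _ in range(1000)
)
print(f"small exponential samples passing: {small_pass / 1000:.0%}")

# 2) Large samples: a minor deviation from normality (t distribution
#    with 10 df, which is nearly Gaussian) is usually flagged as
#    statistically significant when n is 2000.
large_reject = sum(
    stats.shapiro(rng.standard_t(10, size=2000)).pvalue < 0.05
    for _ in range(100)
)
print(f"large near-Gaussian samples rejected: {large_reject / 100:.0%}")
```

The point is not the exact percentages, which depend on the distributions chosen, but the asymmetry: the small samples mostly pass despite being grossly non-Gaussian, while the large samples mostly fail despite a deviation too small to matter for a t test.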

I think it is usually a mistake to test every data set for normality, and use the result to decide between parametric and nonparametric statistical tests. But normality tests can help you understand your data, especially when you get similar results in many experiments.


Which normality test is best?

When we added a normality test to Prism several years ago, we selected the one that was best known, the Kolmogorov-Smirnov test. This test compares the cumulative distribution of the data with the expected cumulative Gaussian distribution, and bases its P value on the largest discrepancy.
The Kolmogorov-Smirnov test was designed to compare two experimentally-determined distributions. When testing for normality, you compare an experimental distribution against a hypothetical ideal, so it's necessary to apply the Dallal-Wilkinson-Lilliefors correction. Prism versions 4.02 and 4.0b do this, but earlier versions did not.
The Kolmogorov-Smirnov test is based on a simple way to quantify the discrepancy between the observed and expected distributions. It turns out, however, that it is too simple, and doesn't do a good job of discriminating whether or not your data were sampled from a Gaussian distribution. An expert on normality tests, R.B. D’Agostino, makes a very strong statement: “The Kolmogorov-Smirnov test is only a historical curiosity. It should never be used.” (“Tests for Normal Distribution” in Goodness-of-fit Techniques, Marcel Dekker, 1986).
In Prism 4.02 and 4.0b, we added two more normality tests to Prism’s column statistics analysis: the Shapiro-Wilk normality test and the D’Agostino-Pearson omnibus test.
All three procedures test the same null hypothesis – that the data are sampled from a Gaussian distribution. The P value answers this question: If the null hypothesis were true, what is the chance of randomly sampling data that deviate as much (or more) from Gaussian as the data we actually collected? The three tests differ in how they quantify the deviation of the actual distribution from a Gaussian distribution.
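As a sketch (my own, not from the article), all three tests are available in SciPy and can be run on the same data set. Note that SciPy's plain `kstest` does not apply the Dallal-Wilkinson-Lilliefors correction discussed above, so its P value is unreliable when the Gaussian's mean and SD are estimated from the data, as they are here.

```python
# Running the three normality tests on one sample with SciPy.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
data = rng.normal(loc=50, scale=10, size=100)

# Kolmogorov-Smirnov, comparing against a Gaussian whose mean and SD
# are estimated from the data (no Lilliefors correction applied).
ks = stats.kstest(data, "norm", args=(data.mean(), data.std(ddof=1)))

# Shapiro-Wilk.
sw = stats.shapiro(data)

# D'Agostino-Pearson omnibus (combines skewness and kurtosis).
dp = stats.normaltest(data)

for name, res in [("Kolmogorov-Smirnov", ks),
                  ("Shapiro-Wilk", sw),
                  ("D'Agostino-Pearson", dp)]:
    print(f"{name}: p = {res.pvalue:.3f}")
```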

The Shapiro-Wilk normality test is difficult for nonmathematicians to understand, and it doesn't work well when several values in your data set are the same. In contrast, the D’Agostino-Pearson omnibus test is easy to understand. It first analyzes your data to determine skewness (to quantify the asymmetry of the distribution) and kurtosis (to quantify the shape of the distribution relative to a Gaussian). It then calculates how far each of these values differs from the value expected with a Gaussian distribution, and computes a single P value from the sum of the squares of these discrepancies. Unlike the Shapiro-Wilk test, this test is not affected if the data contain identical values.
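The "sum of the squares of these discrepancies" can be verified directly in SciPy, whose `normaltest` implements the D'Agostino-Pearson omnibus test. This sketch (my own illustration) shows that the omnibus statistic is the sum of the squared z-scores of the separate skewness and kurtosis tests, referred to a chi-squared distribution with 2 degrees of freedom.

```python
# Decomposing the D'Agostino-Pearson omnibus statistic.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
data = rng.normal(size=200)

z_skew, _ = stats.skewtest(data)       # z-score for skewness
z_kurt, _ = stats.kurtosistest(data)   # z-score for kurtosis
k2, p = stats.normaltest(data)         # omnibus statistic and P value

# The omnibus statistic is the sum of squares of the two z-scores...
assert np.isclose(k2, z_skew**2 + z_kurt**2)
# ...and its P value comes from a chi-squared distribution with 2 df.
assert np.isclose(p, stats.chi2.sf(k2, df=2))
```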

If you decide to use normality tests, I recommend that you stop using the Kolmogorov-Smirnov test, and switch instead to the D’Agostino-Pearson omnibus test. But first rethink whether the normality tests are providing you with useful information.
