# Normal Distribution - How do I test for normality? All the data or just the averages?

#### DJN

Having been advised that the data in a control chart should be normally distributed, I now ask: how do I test for normality? And should I be testing all the data or just the averages?

David

#### M Greenaway

I believe there may be some mathematical way to test for normality; however, it is far easier to plot a histogram of the results and see whether the distribution looks like a normal distribution curve.

#### DJN

OK easily done, but should the histogram be based on all the data or just the averages?

David

#### Atul Khandekar

M Greenaway said:

> I believe there may be some mathematical way to test for normality, however it is far easier to plot a histogram of results and just look to see if the distribution appears to look like a normal distribution curve.

Yes, that's the visual or "eyeball" test for normality: does the distribution appear bell-shaped, with a single peak and tapering off equally on both sides?

Mathematically, there are several tests: Chi-Square, Anderson-Darling, Kolmogorov-Smirnov, etc.

Refer: http://www.itl.nist.gov/div898/handbook/eda/section3/eda35.htm
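To make one of the tests named above concrete, here is a minimal sketch of the Anderson-Darling test for normality using only the Python standard library. The function name `anderson_darling_normal`, the simulated samples, and the 5% critical value of 0.752 (the commonly tabulated value when the mean and standard deviation are both estimated from the data) are assumptions for illustration, not something from this thread.

```python
import math
import random
from statistics import NormalDist, fmean, stdev

# Assumed 5% critical value for the adjusted statistic when both the
# mean and the standard deviation are estimated from the sample.
AD_CRITICAL_5PCT = 0.752

def anderson_darling_normal(data):
    """Return the small-sample-adjusted Anderson-Darling statistic for normality."""
    x = sorted(data)
    n = len(x)
    fitted = NormalDist(fmean(x), stdev(x))
    eps = 1e-12  # keep the CDF away from 0 and 1 so log() stays defined
    cdf = [min(max(fitted.cdf(v), eps), 1.0 - eps) for v in x]
    # A^2 = -n - (1/n) * sum_{i=1..n} (2i - 1) * [ln F(x_i) + ln(1 - F(x_{n+1-i}))]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)  # i is 0-based here, so 2i + 1 plays the role of 2i - 1
    )
    a2 = -n - total / n
    # Small-sample adjustment for estimated parameters
    return a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)

random.seed(1)
normal_sample = [random.gauss(10.0, 2.0) for _ in range(200)]   # simulated, roughly normal
skewed_sample = [random.expovariate(1.0) for _ in range(200)]   # simulated, strongly skewed

a2_normal = anderson_darling_normal(normal_sample)
a2_skewed = anderson_darling_normal(skewed_sample)

for label, a2 in (("normal", a2_normal), ("skewed", a2_skewed)):
    verdict = "reject normality" if a2 > AD_CRITICAL_5PCT else "no evidence against normality"
    print(f"{label}: A*^2 = {a2:.3f} -> {verdict} at the 5% level")
```

In practice a statistics package would normally do this for you; for example, SciPy provides `scipy.stats.anderson`.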

#### Atul Khandekar

DJN said:

> OK easily done, but should the histogram be based on all the data or just the averages?
>
> David

The Central Limit Theorem states that as the sample size becomes large, the distribution of means approaches a normal distribution regardless of the distribution of the original values. This distribution of means is centered at the mean of the raw data values. However, the standard deviation of the means is s/sqrt(N), where s = standard deviation of the raw data and N = sample (subgroup) size.
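The two claims above (same center, spread reduced by sqrt(N)) can be checked with a small simulation. This is only a sketch under assumed conditions: the "raw data" are simulated exponential values and the subgroup size of 5 is chosen for illustration.

```python
import math
import random
from statistics import fmean, stdev

random.seed(42)

# Hypothetical, deliberately non-normal raw data: exponential with mean 1
raw = [random.expovariate(1.0) for _ in range(50_000)]

n = 5  # assumed subgroup size
subgroups = [raw[i:i + n] for i in range(0, len(raw), n)]
xbars = [fmean(group) for group in subgroups]

raw_mean, raw_sd = fmean(raw), stdev(raw)
xbar_mean, xbar_sd = fmean(xbars), stdev(xbars)
predicted_sd = raw_sd / math.sqrt(n)  # s / sqrt(N) from the post above

print(f"mean of raw data      : {raw_mean:.4f}")
print(f"mean of subgroup means: {xbar_mean:.4f}")
print(f"SD of subgroup means  : {xbar_sd:.4f}")
print(f"s / sqrt(N)           : {predicted_sd:.4f}")
```

The observed standard deviation of the averages should land close to s/sqrt(N), even though the raw values are far from normal.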

Another test for normality is to do a Normal Probability Plot.
Here's a link from the good old NIST handbook again:
http://www.itl.nist.gov/div898/handbook/eda/section3/histogr1.htm
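As a rough illustration of the idea behind a normal probability plot, the sketch below pairs the sorted data with theoretical normal quantiles and reports how close the pairs are to a straight line. The plotting positions (i - 0.375)/(n + 0.25), the helper names, and the simulated data are assumptions for this example; other plotting-position conventions are also in use.

```python
import math
import random
from statistics import NormalDist, fmean

def normal_probability_points(data):
    """Pair each sorted observation with a theoretical standard-normal quantile."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()
    # Assumed plotting positions (one common convention among several)
    theoretical = [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    return theoretical, ordered

def correlation(a, b):
    """Pearson correlation; values near 1 mean the points lie close to a straight line."""
    ma, mb = fmean(a), fmean(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

random.seed(3)
normal_sample = [random.gauss(50.0, 4.0) for _ in range(300)]   # simulated data
skewed_sample = [random.expovariate(0.5) for _ in range(300)]   # simulated data

r_normal = correlation(*normal_probability_points(normal_sample))
r_skewed = correlation(*normal_probability_points(skewed_sample))

print(f"straight-line correlation, normal sample: {r_normal:.4f}")
print(f"straight-line correlation, skewed sample: {r_skewed:.4f}")
```

Plotting `theoretical` against `ordered` gives the picture itself; roughly normal data fall near a straight line, while skewed data bend away from it. SciPy's `scipy.stats.probplot` produces the same kind of points along with a fitted line.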

#### Dave Strouse

Normal?

> Having been advised that the data in a control chart should be a normal distribution, I now ask the question; how do I test for normality? And, should I be testing all the data or just the averages?

Just curious, who advised you that the data "needs" to be normally distributed?

#### DJN

Thanks to all for the help. Things are a little clearer now! Dave, the question of normality arose from a question I posed on Cp and Cpk values, where I believe the data has to be normal. Or have I got it wrong?

David

#### Rick Goodson

DJN,

Well, this should get some interesting discussion going...

As Atul said, the Central Limit Theorem states that regardless of the shape of the parent population (square, triangular, bimodal, etc.), as the sample size becomes large (a minimum of 30 is the usual rule of thumb) the distribution of the sample averages approaches normal. The average of the sample averages is centered at the average of the raw data, and the standard deviation of the sample averages equals the standard deviation of the raw data divided by the square root of the sample size. So for a subgroup size of 5, the standard deviation of the x-bars is about 0.45 times the standard deviation of the parent. At a subgroup size of 4, it is 0.50 times the standard deviation of the parent. So... the parent population does not have to be normally distributed, yet the distribution of the subgroup averages will tend toward normal. If you test for normality and the subgroup averages are not normally distributed, there is something wrong with the data or the data collection method.
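For reference, the 0.45 and 0.50 factors come straight from the square-root rule. Writing $\sigma$ for the standard deviation of the parent population and $n$ for the subgroup size (symbols chosen here for illustration):

$$
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}, \qquad
n = 5:\ \frac{1}{\sqrt{5}} \approx 0.447, \qquad
n = 4:\ \frac{1}{\sqrt{4}} = 0.500
$$

By the same rule, at the often-quoted sample size of 30 the factor would be $1/\sqrt{30} \approx 0.183$.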

While I agree with Atul on the eyeball method, I always confirm that with a graphical or calculated test.

#### CristiánC

DJN:

Take a look at this site:

http://www.ms.uky.edu/~lancastr/java/cltexp.html

It has a simple applet that shows how the distribution of averages from a very skewed distribution (an exponential one) becomes more and more normal as the sample size increases.
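If the applet is not available, a similar experiment can be sketched in a few lines. This is an illustrative stand-in, not the applet's own code: it draws averages from an exponential distribution at a few assumed sample sizes (1, 5, 30) and measures skewness, which is 0 for a perfectly symmetric distribution such as the normal.

```python
import random
from statistics import fmean

def skewness(values):
    """Moment-based sample skewness (0 for a perfectly symmetric sample)."""
    m = fmean(values)
    m2 = fmean((v - m) ** 2 for v in values)
    m3 = fmean((v - m) ** 3 for v in values)
    return m3 / m2 ** 1.5

random.seed(7)
skew_by_n = {}
for n in (1, 5, 30):  # assumed sample sizes, chosen for illustration
    # 20,000 simulated averages, each computed from n exponential values
    means = [fmean(random.expovariate(1.0) for _ in range(n)) for _ in range(20_000)]
    skew_by_n[n] = skewness(means)
    print(f"sample size {n:>2}: skewness of the averages = {skew_by_n[n]:.3f}")
```

The skewness of the averages shrinks toward 0 as the sample size grows, which is the same behavior the applet shows graphically.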

By the way: forget about normality if you are using Xbar/R charts. The I & MR (individuals and moving range) charts are more sensitive to departures from normality, so this assumption must be checked if you are using those.

Hope this helps.