# Non-Normal Data - Transforming the Data to Normal

#### Jon O

Normalizing Data

Hello All,

If a group of data is non-normal and we want to try to transform the data to normal, please explain some of the tools being used for the transformation.

In working with some of our Six Sigma customers, one method they use regularly relies on the central limit theorem: they group 2 to 3 data points, take the average, and then calculate normality and capability on the averaged data.

What are the thoughts of other Cove members on this method?

Any input would be appreciated.

Regards,

Jon

#### Rick Goodson

Jon O,

It is difficult to comment on your question without more information. How did you ascertain that the data is non-normal? Did you use a standard test for normality or a graphical approach? What type of process is the data taken from? A little more information on the process would help. Nevertheless, here is a note on the transformation based on the Central Limit Theorem.

The Central Limit Theorem states that, irrespective of the shape of the distribution of a universe, the distribution of the average values (X-bars) of subgroups of size n drawn from that universe tends toward a normal distribution as the subgroup size n grows without bound. The value of n does not have to be very large before the normal distribution may be applied. This is very useful in analyzing probabilities, but it does not form the basis for control charts with +/- 3 sigma limits (see Statistical Quality Control by Grant and Leavenworth, seventh edition).
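A minimal simulation sketch of the behavior described above, assuming hypothetical exponentially distributed individual values (not data from this thread) and SciPy's `stats.skew` and `stats.anderson`. The helper `subgroup_means` is an illustrative name. It averages consecutive subgroups of size n and reports skewness and the Anderson-Darling statistic for each n:

```python
import numpy as np
from scipy import stats


def subgroup_means(x, n):
    """Average consecutive, non-overlapping subgroups of size n (remainder dropped)."""
    x = np.asarray(x, dtype=float)
    k = len(x) // n
    return x[: k * n].reshape(k, n).mean(axis=1)


# Hypothetical, strongly right-skewed individual values (assumption for illustration).
rng = np.random.default_rng(0)
individuals = rng.exponential(scale=1.0, size=3000)

skewness_by_n = {}
for n in (1, 2, 3, 5, 30):
    means = subgroup_means(individuals, n)
    skewness_by_n[n] = stats.skew(means)
    ad = stats.anderson(means, dist="norm")  # Anderson-Darling test against a normal
    print(f"n={n:2d}  points={len(means):4d}  "
          f"skewness={skewness_by_n[n]:5.2f}  A-D statistic={ad.statistic:7.2f}")
```

Note that larger subgroups leave fewer averaged points, so part of any drop in the Anderson-Darling statistic reflects the smaller sample rather than better normality. With only 2 or 3 points per subgroup, a strongly skewed distribution is still visibly skewed after averaging.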

There are a number of data transformation methods available. You might also consider fitting a Weibull distribution, as it is applicable to a wide variety of variation patterns, including departures from both the normal and exponential distributions.

#### Jon O

Rick,

Thanks for the quick response. The non-normality was ascertained by performing an Anderson-Darling test for normality and reviewing the P-value. The data being analyzed is from an insertion/force test.

Nevertheless, do you feel it is OK to calculate capability on a data set where the CLT has been applied?

Is there any point where you don't go any further with the CLT? Averaging 2, 3, 4 data points... where do you stop?

Should the CLT be considered a first-round tool to normalize data, or are we using statistics to fudge a data set that is truly messed up?

Thanks,

Jon
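One way to explore the question above is a small simulation. The sketch below assumes hypothetical, already normal individual values and hypothetical spec limits (none of these numbers come from the thread), and uses an illustrative `capability_index` helper based on the overall sample standard deviation. Because the standard deviation of an average of n independent values is sigma/sqrt(n), while the spec limits still apply to individual parts, the index computed on averages is expected to inflate as n grows:

```python
import numpy as np


def capability_index(x, lsl, usl):
    """Cpk-style index using the overall sample standard deviation."""
    x = np.asarray(x, dtype=float)
    mu, sigma = x.mean(), x.std(ddof=1)
    return min(usl - mu, mu - lsl) / (3.0 * sigma)


def subgroup_means(x, n):
    """Average consecutive, non-overlapping subgroups of size n (remainder dropped)."""
    x = np.asarray(x, dtype=float)
    k = len(x) // n
    return x[: k * n].reshape(k, n).mean(axis=1)


# Hypothetical individual measurements and spec limits (assumptions for illustration).
rng = np.random.default_rng(1)
individuals = rng.normal(loc=10.0, scale=1.0, size=3000)
lsl, usl = 7.0, 13.0

results = {}
for n in (1, 2, 3, 4):
    results[n] = capability_index(subgroup_means(individuals, n), lsl, usl)
    print(f"subgroup size {n}: index on averages = {results[n]:.2f}")
```

Under these assumptions the index grows roughly with the square root of n even though the parts themselves have not changed, so a capability number computed on averages does not describe the capability of individual parts against their specs.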

#### jasshe

Minitab R14 helps us transform data to follow a normal distribution in at least the following two ways:

1. Box-Cox transformation
2. Johnson transformation
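Outside Minitab, the first of these can be sketched with SciPy. The example below is a rough illustration, assuming hypothetical lognormal data and a hypothetical upper spec limit; it uses `scipy.stats.boxcox` to estimate lambda and `scipy.special.boxcox` to apply the same lambda to the spec limit. The Johnson transformation is not shown here.

```python
import numpy as np
from scipy import special, stats

# Hypothetical strictly positive, right-skewed data and upper spec limit
# (assumptions for illustration; Box-Cox requires positive values).
rng = np.random.default_rng(2)
data = rng.lognormal(mean=1.0, sigma=0.5, size=500)
usl = 10.0

# Estimate lambda by maximum likelihood and transform the data.
transformed, lam = stats.boxcox(data)

# Apply the same lambda to the spec limit. Box-Cox is monotonically
# increasing, so an upper limit stays an upper limit after transformation.
usl_t = special.boxcox(usl, lam)

# One-sided Ppk-style index on the transformed scale.
ppu = (usl_t - transformed.mean()) / (3.0 * transformed.std(ddof=1))

print(f"lambda = {lam:.3f}")
print(f"skewness before = {stats.skew(data):.2f}, after = {stats.skew(transformed):.2f}")
print(f"upper index on transformed scale = {ppu:.2f}")
```

The index is computed entirely on the transformed scale, so the transformed spec limit and the transformed data must use the same lambda.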

#### Darius

As I said, my own favorite is Box-Cox. I even use it as a regression tool: along with the hyperbolic model, it is the first regression model I try when I see a well-behaved nonlinear curve or when I need to find an equation for a curve found in a book. For capability, however, I stick to nonparametric methods (median and percentiles).

I get worried when I try to understand the meaning of capability when transformations are involved. Of course the specs can be transformed too, but... I feel something goes missing.
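The post does not give a formula for the median-and-percentile approach. One common percentile formulation, sketched here as an assumption rather than as the poster's exact method, replaces the mean with the median and the +/- 3 sigma spread with the 0.135th and 99.865th percentiles. The function name, data, and spec limits are hypothetical:

```python
import numpy as np


def percentile_capability(x, lsl, usl):
    """Percentile-based Pp and Ppk analogues (no distribution assumed)."""
    x = np.asarray(x, dtype=float)
    # Empirical equivalents of -3 sigma, center, and +3 sigma for a normal.
    p_lo, med, p_hi = np.percentile(x, [0.135, 50.0, 99.865])
    pp = (usl - lsl) / (p_hi - p_lo)
    ppu = (usl - med) / (p_hi - med)
    ppl = (med - lsl) / (med - p_lo)
    return pp, min(ppu, ppl)


# Hypothetical right-skewed data and spec limits (assumptions for illustration).
rng = np.random.default_rng(3)
data = rng.gamma(shape=2.0, scale=1.0, size=20000)
pp, ppk = percentile_capability(data, lsl=0.0, usl=12.0)
print(f"percentile Pp = {pp:.2f}, percentile Ppk = {ppk:.2f}")
```

Empirical percentiles this far into the tails are only stable with large samples (thousands of points); with small data sets the tail percentiles are usually estimated from a fitted distribution instead.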

#### jasshe

Darius said:
As I said, my own favorite is Box-Cox. I even use it as a regression tool: along with the hyperbolic model, it is the first regression model I try when I see a well-behaved nonlinear curve or when I need to find an equation for a curve found in a book. For capability, however, I stick to nonparametric methods (median and percentiles).

I get worried when I try to understand the meaning of capability when transformations are involved. Of course the specs can be transformed too, but... I feel something goes missing.

Happy to discuss Box-Cox with all of you, but I do not know what you mean when you say you feel that something is missing. Did you not get enough information about your process capability from these methods?

#### Darius

As I said, the best capability index is obtained with Box-Cox, as shown in the Quality and Reliability Engineering International article "Computing Process Capability Indices for Non-Normal Data: A Review and Comparative Study" by Loon Ching Tang and Su Ee Than.

It is the best comparison of capability indices I have ever seen.

But, as Wheeler wrote in Advanced Topics in Statistical Process Control:

"If the users have not already developed the ability to handle the mathematical abstraction of thinking about transforming a given set of data in different ways, the introduction of transformations as part of a statistical analysis will tend to confuse the results and confound the user"

".. and the treatment effects must be transformed back. This inverse transformation will present difficulties of interpretation that will overwhelm many"

"The best analysis is that analysis which provides the greatest insight with the simplest technique."

If the process capability is going to be presented to somebody else, the use of transformations, as D. Wheeler pointed out, could make the results difficult to interpret.

#### bobdoering


Before attempting a transformation, I would ponder whether it would be more meaningful to use the native distribution. In some cases transformations mask meaningful data, which is wasted effort.
