Johnson's Transformation - Minitab 14 - How can I find out Ppk - Non-normal Distribu?

Deanmi

Hello all,

I used the Johnson transformation (Minitab 14) to make non-normal data normal so that I could get Ppk values for the data. However, for a few data sets Minitab does not display any Ppk value after the transformation (it shows an asterisk instead), even though the transformed data are normal (high p-value).

I haven't been able to find the reason for this in the Minitab help menu. Can anyone help me with this? How can I find Ppk for my non-normal data sets? Thanks.
 

Miner

Forum Moderator
Leader
Admin
Have you tried Minitab's Capability Analysis for Nonnormal distributions?
 
Deanmi

Yes. I used Minitab 14 to do capability analysis for non-normal data. For a few of my data sets I was successful in getting Ppk values, but for others I got an asterisk!

Thanks for following up!
 

Miner

Forum Moderator
Leader
Admin
Can you attach a problem data set in Excel with the specifications? I have Minitab and will review the analysis.
 
Deanmi

Thanks!

I have attached my data and would like to hear your opinion. But I know what the issue is now. However, I don't have a solution for it.

The data points are all positive numbers; however, the limits are -0.2 and 0.2. The transformation algorithm selected a Johnson transformation of the k2 type, i.e. SB (bounded). Hence, during the Ppk calculation, Minitab attempts to take the log of a negative number!

Now, is there a way to force the Johnson transformation to use the k1 type (SU, unbounded)?

P.S. For some reason I get an error when uploading the Excel file, so I am typing in the data points. There are only 16 data points. The lower and upper limits are -0.2 and 0.2.
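To illustrate the failure described above, here is a minimal Python sketch of the standard Johnson SB and SU formulas. The parameter values (GAMMA, ETA, EPSILON, LAM) are hypothetical and chosen only so that the bounded interval covers the posted data; they are not the values Minitab fitted, and nothing here shows how to force Minitab to pick a particular family.

```python
import math

def johnson_sb(x, gamma, eta, epsilon, lam):
    """Johnson SB (bounded): z = gamma + eta * ln((x - epsilon) / (lam + epsilon - x)).

    Only defined for epsilon < x < epsilon + lam; returns None outside that
    interval, where the logarithm's argument would be negative or undefined.
    """
    if not (epsilon < x < epsilon + lam):
        return None
    return gamma + eta * math.log((x - epsilon) / (lam + epsilon - x))

def johnson_su(x, gamma, eta, epsilon, lam):
    """Johnson SU (unbounded): z = gamma + eta * asinh((x - epsilon) / lam)."""
    return gamma + eta * math.asinh((x - epsilon) / lam)

# Hypothetical parameters (assumption): bounded interval is (0.035, 0.095).
GAMMA, ETA, EPSILON, LAM = 0.0, 1.0, 0.035, 0.06
LSL, USL = -0.2, 0.2  # specification limits from the post

for label, value in (("data point", 0.04799), ("LSL", LSL), ("USL", USL)):
    sb = johnson_sb(value, GAMMA, ETA, EPSILON, LAM)
    su = johnson_su(value, GAMMA, ETA, EPSILON, LAM)
    print(f"{label}: SB={sb}, SU={su}")
```

With these assumed parameters the data point transforms under SB, but both specification limits fall outside the bounded interval, so no transformed limit (and therefore no Ppk) can be computed. The SU form is defined for any real value, which is why the question about the unbounded family arises.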

0.04799
0.05253
0.0863
0.04157
0.04107
0.0443
0.0611
0.04784
0.05961
0.03917
0.04512
0.05762
0.05059
0.04037
0.0421
0.05436

Thanks for your effort!
 

Miner

Forum Moderator
Leader
Admin
I did some analysis with this data, but I would hesitate to draw a conclusion with only 16 data points. Since many non-normal distributions cannot have values less than zero, and your tolerances indicate that it is very possible to have negative values, it would help to have a larger data set that includes a few of these negative values (naturally occurring).

When I review the histogram, I am very reluctant to accept the results of the test for normality because small shifts in a few of the extreme values result in a conclusion for Normality. If you double the number of data points, you may reach a totally different conclusion.
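As a rough illustration of how much a single extreme value can drive the apparent shape of 16 points, the sketch below compares the sample skewness of the posted data with and without its largest value. This is not the normality test Minitab runs; skewness is used here only as a simple stand-in, and the helper name is my own.

```python
import statistics

# The 16 data points posted earlier in the thread.
DATA = [0.04799, 0.05253, 0.0863, 0.04157, 0.04107, 0.0443, 0.0611, 0.04784,
        0.05961, 0.03917, 0.04512, 0.05762, 0.05059, 0.04037, 0.0421, 0.05436]

def skewness(values):
    """Population skewness: third central moment divided by variance^1.5."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

skew_all = skewness(DATA)

# Drop only the single largest observation and recompute.
trimmed = sorted(DATA)[:-1]
skew_without_max = skewness(trimmed)

print(f"skewness, all 16 points:       {skew_all:.2f}")
print(f"skewness, largest one removed: {skew_without_max:.2f}")
```

If the skewness drops sharply when one point is removed, the shape conclusion rests largely on that point, which is the concern with small samples.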
 
Deanmi

Thanks for your replies and analysis.

I have also been reviewing the data myself. And I am questioning the stability of the process.

I have read your reply a few times, and a sentence in the reply is not very clear to me. You wrote, "Since many non-normal distributions cannot have values less than zero, and ..." Are you saying if a distribution is naturally unbounded, and the target is the middle of the tolerance zone, a normal distribution should be expected otherwise the validity of data points is in question? Or the word "many" in the sentence allows for non-normal distributions in this situation for very rare and special cases?
 

Miner

Forum Moderator
Leader
Admin
Deanmi said:
I have read your reply a few times, and a sentence in the reply is not very clear to me. You wrote, "Since many non-normal distributions cannot have values less than zero, and ..." Are you saying if a distribution is naturally unbounded, and the target is the middle of the tolerance zone, a normal distribution should be expected otherwise the validity of data points is in question? Or the word "many" in the sentence allows for non-normal distributions in this situation for very rare and special cases?

No. Many of the nonnormal distributions available to try to fit the data have a lower limit. This is the same issue that you encountered trying to take the log of a negative number.

Since your process does not have a lower limit, it does not make sense to fit the data with a distribution that does have a lower limit. For example, I fit the data very well using an exponential distribution. However, this distribution showed that the data had a lower limit that could not be passed. The reality of the process says that this cannot be true. Therefore, using this distribution, no matter how good the fit, is the wrong thing to do. Always do a reality check.

I believe that the situation that we are seeing is that the small sample size does not accurately reflect the parent population. A larger sample size would better reflect this population and may actually be a normal distribution.
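For reference, if the data were treated as normal (an assumption that this thread questions for only 16 points), Ppk could be computed directly from the overall mean and standard deviation. A minimal sketch using the posted data and limits:

```python
import statistics

# The 16 data points and specification limits posted earlier in the thread.
DATA = [0.04799, 0.05253, 0.0863, 0.04157, 0.04107, 0.0443, 0.0611, 0.04784,
        0.05961, 0.03917, 0.04512, 0.05762, 0.05059, 0.04037, 0.0421, 0.05436]
LSL, USL = -0.2, 0.2

def ppk_normal(values, lsl, usl):
    """Ppk under a normality assumption.

    Ppk = min(USL - mean, mean - LSL) / (3 * s), where s is the overall
    (sample) standard deviation of all observations.
    """
    mean = statistics.fmean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1)
    return min(usl - mean, mean - lsl) / (3 * s)

ppk = ppk_normal(DATA, LSL, USL)
print(f"Ppk assuming normality: {ppk:.2f}")
```

The number is only as trustworthy as the normality assumption and the sample size behind it.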

Nonnormal distributions "typically" have an underlying cause that becomes easy to understand in hindsight. For example, machining operations that run to a hard stop or on cams will almost always be skewed. There will be a high mode near the hard stop with a tail extending in the direction before the hard stop is reached. Form tolerances such as flatness may be normal when they are far from zero, but as flatness is reduced, the physical limit/boundary of zero will impose a more and more skewed distribution as the mode is shifted closer to zero. Physical test results may often be skewed with the tail to the high side. You always seem to get a few test results where everything just seems to work out perfectly and the results are extraordinarily good.
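The flatness example can be sketched with a small simulation. Here flatness is modeled, purely as an assumption for illustration, as the absolute value of a normally distributed deviation (a folded normal); the function names, sample size, and seed are arbitrary.

```python
import random
import statistics

def skewness(values):
    """Population skewness: third central moment divided by variance^1.5."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def simulated_flatness(mean_offset, sigma=1.0, n=20000, seed=1):
    """Simulate a quantity bounded at zero as |Normal(mean_offset, sigma)|."""
    rng = random.Random(seed)
    return [abs(rng.gauss(mean_offset, sigma)) for _ in range(n)]

# Far from the zero boundary the values look symmetric; as the centre moves
# toward zero, the boundary folds the lower tail back and skews the result.
for offset in (5.0, 2.0, 1.0, 0.0):
    print(f"centre at {offset} sigma: skewness = "
          f"{skewness(simulated_flatness(offset)):.2f}")
```

The skewness should grow as the centre approaches zero, matching the description of the boundary imposing a more and more skewed distribution.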
 

Statistical Steven

Statistician
Leader
Super Moderator
I did a 1/X transformation, got a very nice normal distribution. The new limits are -5 to 5.

The results are not very pretty (not very capable). This brings up a very interesting issue about "blindly" transforming data. If your process gives you data that is not normal and you do not know the true distribution of the data, you can find a transformation that will make that data set normal, but the same transformation might not make another 16 values normal. Be cautious with any transformation.
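A minimal sketch of the 1/X transformation mentioned above, applied to the posted data and limits. It only reproduces the arithmetic; it does not test normality. Note that the reciprocal is not monotonic across zero, so mapping limits of opposite sign this way is taken from the post as-is rather than endorsed here.

```python
# The 16 data points and specification limits posted earlier in the thread.
DATA = [0.04799, 0.05253, 0.0863, 0.04157, 0.04107, 0.0443, 0.0611, 0.04784,
        0.05961, 0.03917, 0.04512, 0.05762, 0.05059, 0.04037, 0.0421, 0.05436]
LSL, USL = -0.2, 0.2

# Reciprocal (1/X) transformation of every observation and of both limits.
reciprocal_data = [1.0 / x for x in DATA]
reciprocal_lsl = 1.0 / LSL
reciprocal_usl = 1.0 / USL

print(f"transformed limits: {reciprocal_lsl:.1f} to {reciprocal_usl:.1f}")
print(f"transformed data range: {min(reciprocal_data):.2f} "
      f"to {max(reciprocal_data):.2f}")

# Count how many transformed points fall between the transformed limits.
inside = sum(reciprocal_lsl <= v <= reciprocal_usl for v in reciprocal_data)
print(f"points inside transformed limits: {inside} of {len(reciprocal_data)}")
```

Comparing the transformed data range with the transformed limits gives a quick sense of why the capability result on this scale looks poor.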



 
Deanmi

Thanks to both of you for your comments!

I am always cautious (and skeptical at times) of using transformation! But I have many sets of non-normal data that I am not sure of the reason for them being non-normal! I understand having more data points may make the data normal, but for now, I have to live with 16 point data sets! Thanks again for your comments.
 