# Capability analysis for Geometric tolerances?

#### alek333

##### Registered
Hello,

I have been searching throughout the internet for an answer to my questions but I can't quite find what I'm looking for. Looking through previous threads on this forum, I'm having a hard time understanding... I'm only a novice in statistics so forgive me.

I'm trying to assess the capability of various GD&T features such as true position, flatness, perpendicularity etc.
I'm supposed to calculate the Cpk and Ppk of the collected data from these features.

But does this make sense??
I thought the AIAG manual says that we have to assume normality and a two-sided specification in order to calculate Cpk and Ppk?
From my understanding, data from true position, flatness, etc. will probably not follow a normal distribution, and it is not a two-sided specification. You can't go below 0, and ideally you want to be as close to 0 as possible.

So with that in mind, how are Cpk and Ppk calculations done with these GD&T features?

Using the latest version of Minitab, I can set 0 as the lower spec limit and select it as a hard boundary, and I guess that can be used to calculate Cpk and Ppk in these situations?

I'm assuming I have to transform the data to a normal distribution or select a different distribution model beforehand, though. Will this give me something meaningful? I don't know whether Minitab accounts for that when you select the lower boundary option.

Also, does Ppk need a normally distributed data set in order to be accurate, or does that apply only to Cpk?

So many questions, I'm sorry.


#### Miner

##### Forum Moderator
This is how Minitab treats Boundaries.

I do not recommend transforming the data, because you lose important information about the process. The specification must also be transformed, which makes the result both difficult to understand and difficult to explain. In addition, the Box-Cox transform does not allow zero values, so you have to fudge the lower boundary with a very small positive number. The following analyses were performed using random lognormal data to simulate the right-skewed distribution common to GD&T data.
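As a small illustration of the zero-value problem mentioned above, here is a sketch (not from the thread; the flatness readings are invented) showing that SciPy's Box-Cox transform rejects data containing zeros, which is why a zero reading has to be fudged to a tiny positive number first:

```python
import numpy as np
from scipy import stats

# Hypothetical flatness readings; a perfect part measures exactly 0
flatness = np.array([0.000, 0.012, 0.005, 0.021, 0.008, 0.015])

try:
    stats.boxcox(flatness)  # fails: Box-Cox requires strictly positive data
except ValueError as e:
    print("Box-Cox failed:", e)

# Common workaround: replace exact zeros with a very small positive value
fudged = np.where(flatness == 0, 1e-6, flatness)
transformed, lmbda = stats.boxcox(fudged)
print("estimated lambda:", round(lmbda, 3))
```

Note that the estimated lambda can be quite sensitive to how small the fudge value is, which is one more reason transformation is hard to defend to a customer.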

I always use a nonnormal capability analysis when warranted. This way you can see the data as it actually is relative to the spec. The one downside is that you only get Pp/Ppk, no Cp/Cpk.
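For readers without Minitab, the nonnormal analysis described above can be sketched in Python. This is an assumption-laden illustration: the spec limit of 0.05 and the lognormal parameters are invented, and the percentile (ISO-style) Ppk formula shown is one common definition for nonnormal data, not necessarily identical to Minitab's internals:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Simulated flatness data: right-skewed, bounded below by 0
data = rng.lognormal(mean=-4.0, sigma=0.5, size=200)

# Fit a lognormal distribution, fixing location at the physical boundary (0)
shape, loc, scale = stats.lognorm.fit(data, floc=0)
dist = stats.lognorm(shape, loc, scale)

usl = 0.05                    # hypothetical upper spec; 0 is a boundary, not a spec
x_median = dist.ppf(0.5)
x_high = dist.ppf(0.99865)    # 99.865th percentile replaces mean + 3*sigma

# One-sided: only the upper index (Ppu) is meaningful with a zero boundary
ppk = (usl - x_median) / (x_high - x_median)
print(f"Ppk (percentile method) = {ppk:.2f}")
```

The key idea is that the usual "3 sigma" distances are replaced by distances between the fitted distribution's median and its 0.135%/99.865% percentiles, so no normality assumption is needed.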

#### alek333

##### Registered
Thank you for the response! Let me summarize your points to make sure I'm getting it right.

So for capability studies involving geometric tolerancing, as long as you select the boundary option in Minitab, you can calculate meaningful capability indices for a one-sided specification? This sort of goes against the PPAP manual as I understand it, but I suppose I shouldn't treat my reading of it as the whole truth.

In order to still calculate Cpk, you have to assume the process is stable and the data normal. If the data are not normal, you should not transform them; instead, find a distribution model that best fits your data. Clicking the boundary button will still not work well if the data is not normal and stable. A lot of my data does not match a normal distribution, since it's machining flatness and the like, so that is very useful to know. For geometric tolerance data, I still need to assume stability and normality (or otherwise pick a different distribution model).

Therefore I can't calculate Cpk for a lot of my data, since so much of it is nonnormal, and some isn't even stable. Looks like I'm stuck calculating Ppk for most of it. It doesn't require stability, although I still have to pick a different distribution model if the data is not normal.

Although now that I think about it, if the data is statistically stable, the Cpk and the Ppk should be close to each other.

Regarding the difference between Ppk and Cpk, I have never found a cogent explanation of the difference between the two in terms of long-term versus short-term capability. Even on this forum there doesn't seem to be a consensus. But that's a discussion for another day...

Makes sense. Thank you for your input! You've really helped me understand how Minitab runs capability studies on this kind of data.

#### Miner

##### Forum Moderator
Clicking the boundary button will still not work well if the data is not normal and stable.
These are separate issues.
• Any capability analysis will not work well if the process is not stable.
• Whether the data are normal or not simply drives you to choose between a normal or nonnormal analysis.
• Use of the boundary is driven by whether a physical limit exists such as zero flatness, not by the distribution. In fact, a physical limit tends to force skewness in the data.
Regarding use of boundaries and the PPAP manual, can you cite the page and section number? I don't recall it mentioning boundaries, but it has been a while since I have looked.

Regarding short/long term variation: Short term variation is based entirely upon the variation WITHIN subgroups. Long term variation includes the variation BETWEEN subgroups. When a process is stable, the two should be very close to each other.
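The within/between distinction above can be made concrete with a short sketch (simulated values, not the poster's data). It computes sigma from the pooled within-subgroup variation versus from all readings combined; a real Minitab calculation also applies small-sample unbiasing constants (c4), which this sketch omits:

```python
import numpy as np

rng = np.random.default_rng(7)
# 25 subgroups of 5 from a stable process: mean 10.0, sigma 0.1
subgroups = rng.normal(10.0, 0.1, size=(25, 5))

usl, lsl = 10.5, 9.5            # hypothetical two-sided spec
grand_mean = subgroups.mean()

# Short-term sigma: pooled standard deviation WITHIN subgroups
sigma_within = np.sqrt(subgroups.var(axis=1, ddof=1).mean())
# Long-term sigma: standard deviation of all 125 readings together
sigma_overall = subgroups.std(ddof=1)

cpk = min(usl - grand_mean, grand_mean - lsl) / (3 * sigma_within)
ppk = min(usl - grand_mean, grand_mean - lsl) / (3 * sigma_overall)
print(f"Cpk = {cpk:.2f}, Ppk = {ppk:.2f}")  # nearly equal for a stable process
```

If you add a shift between subgroups (an unstable process), sigma_overall grows while sigma_within does not, and Ppk falls below Cpk.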

Otherwise, you seem to understand the rest.


#### alek333

##### Registered

To summarize, the boundary option is for when a physical limit exists and it's not possible for the data to go past that limit, such as flatness or other geometric tolerances. However, that is a separate issue from the normality or stability of the data; the data still has to meet those requirements.

Regarding stability of data, I thought Ppk could still be a valid calculation according to the AIAG manual.
I can't paste my snippet for some reason but see page 8 of the AIAG manual at the Initial Process Studies paragraph.

As for the one-sided versus two-sided specification, see page 9 (2.2.11.5). My interpretation is that Cpk is not a reliable indicator for data with a one-sided specification.

#### Bev D

##### Heretical Statistician
Super Moderator
Can you quote the mentioned sections for those of us who don’t have access to the documents?

Memorize the following:
• Data that has a boundary condition cannot be expected to have a Normal distribution, especially if the boundary is the ideal target (flatness).
• Normality has nothing to do with stability. Many stable processes are not Normal at all.
• Normality has nothing to do with capability - except in the choice of distributional formula to ‘predict defect rate’ (a very flawed application that is popular with some companies…).
• In general theory for capability indices (Cpk, Ppk, etc.) there can be no reliable prediction without stability.
• The only difference between Ppk and Cpk (as Miner said) is that Cpk uses the within-subgroup variation (sometimes called short term) and Ppk uses the SD of all of the data (within and between; aka long term, except in pre-production, where the Ppk formula is used as a short-term estimate because there may not be enough production to have a useful subgrouping scheme).

#### alek333

##### Registered
I thought you could only calculate capability indices (Cpk and Ppk) with a normal distribution? You can transform the data or use another distribution model to calculate Cpk/Ppk, such as in Minitab, but that would not give you results that are as reliable. Especially when transforming data so that it becomes normal - that reduces the usefulness of the capability indices.

Regarding one sided specifications:
Per the AIAG manual:
"2.2.11.5 Processes With One-Sided Specifications or Non-Normal Distributions
The organization shall determine with the authorized customer representative alternative acceptance criteria for processes with one-sided specifications or non-normal distributions.
NOTE: The above mentioned acceptance criteria (2.2.11.3) assume normality and a two-sided specification (target in the center). When this is not true, using this analysis may result in unreliable information. These alternate acceptance criteria could require a different type of index or some method of transformation of the data. The focus should be on understanding the reasons for the non-normality (e.g., is it stable over time?) and managing variation. Refer to the Statistical Process Control reference manual for further guidance."

Regarding stability, I thought that Ppk can still be calculated for an unstable process, as long as the instability is due to predictable causes? See the AIAG manual:
"Initial Process Studies. The purpose of the initial process study is to understand the process variation, not just to achieve a specific index value. When historical data are available or enough initial data exist to plot a control chart (at least 100 individual samples), Cpk can be calculated when the process is stable. Otherwise, for processes with known and predictable special causes and output meeting specifications, Ppk should be used. When not enough data are available (< 100 samples) or there are sources of variation, contact the authorized customer representative to develop a suitable plan."

I might be interpreting the manual incorrectly and drawing the wrong conclusions.

#### Miner

##### Forum Moderator
Regarding 2.2.11.5, note the copious use of the words "could" and "should." They are giving you flexibility in how you approach this issue. That means you have the option of transforming the data (which I do not recommend), using a nonnormal capability study, or using an alternate method such as a nonparametric capability analysis.
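A nonparametric capability analysis of the kind mentioned above can be sketched very simply: take percentiles straight from the data with no fitted distribution at all. The spec limit of 0.05 and the simulated data are invented for illustration, and with only a few hundred points the extreme empirical percentile is a rough estimate:

```python
import numpy as np

rng = np.random.default_rng(3)
# Skewed, flatness-like data bounded below by 0
data = rng.lognormal(-4.0, 0.5, size=500)

usl = 0.05  # hypothetical upper spec
p50, p99865 = np.percentile(data, [50, 99.865])

# One-sided percentile index: no distributional assumption at all
ppk_np = (usl - p50) / (p99865 - p50)
print(f"nonparametric Ppk ~= {ppk_np:.2f}")
```

The trade-off is sample size: estimating the 99.865th percentile empirically really needs thousands of readings to be trustworthy, which is why a fitted nonnormal distribution is often preferred.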

Regarding use of Ppk with an unstable process with "predictable" causes, predictable causes would mean something that varies cyclically like tool wear (sawtooth pattern) or the effects of using an automatic controller that cycles between fixed limits.

#### Bev D

##### Heretical Statistician
Super Moderator
So first you must understand that there are two different worlds here. The one you are directly dealing with is the AIAG interpretation of Quality Engineering. A lot of it is not exactly true but must be done if you are in that industry. The second is the actual real world of quality engineering.

The calculation of the Cpk/Ppk index itself is always useful and true. The ratio of the process spread to the tolerance range does tell you something about the process capability - nowhere near as much as a simple time-series run chart will, though. It is the prediction of the defect rate that will come from such a process that is stupid. Even with a ‘Normal’ distribution, any defect rate calculation is overstated and useless. (This is a topic for a different thread.)

As for the statement about the ‘instability’ of a process with predictable causes - this is a misuse of the word stability. AIAG is using stability instead of homogeneity. Processes that are not homogeneous can be stable (predictable) and can be capable. A process is homogeneous when the largest and dominant component of variation is sequential piece-to-piece variation. Other components of variation that are frequently dominant are lot to lot (set-up to set-up), material lot to lot, operator to operator, cavity to cavity, machine to machine, time to time, etc. This is resolved statistically by rational subgrouping.
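The rational-subgrouping idea above amounts to splitting the variation into components. Here is a sketch on simulated data (the lot and piece sigmas are invented) using a one-way variance-components estimate to separate lot-to-lot from piece-to-piece variation:

```python
import numpy as np

rng = np.random.default_rng(5)
n_lots, n_per = 20, 5
# Lot-to-lot component (sigma 0.30) dominates piece-to-piece (sigma 0.10)
lot_means = rng.normal(0.0, 0.30, size=n_lots)
data = lot_means[:, None] + rng.normal(0.0, 0.10, size=(n_lots, n_per))

# One-way ANOVA mean squares, with lots as the subgroups
ms_within = data.var(axis=1, ddof=1).mean()          # within-lot mean square
ms_between = n_per * data.mean(axis=1).var(ddof=1)   # between-lot mean square

var_within = ms_within                                # piece-to-piece variance
var_between = max((ms_between - ms_within) / n_per, 0.0)  # lot-to-lot variance
print(f"piece-to-piece var ~= {var_within:.3f}, lot-to-lot var ~= {var_between:.3f}")
```

When the between-lot component dominates like this, a capability index based only on within-subgroup sigma (Cpk) flatters the process, which is exactly why the subgrouping scheme has to match the real structure of the variation.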