# Capability calculation of surface profile

#### Jim Wynne

There is nothing inherently wrong with using only a few measurements to calculate a Cpk value.
There is something inherently wrong with calculating Cpk from only a few measurements; see my earlier post in this thread. Even when the standard rules are followed, Cpk is not a very useful statistic, and if you throw out the rules, its usefulness drops below zero.

Personally, I get the impression that you and your customer did not properly specify the expected outcome of the study. Thus, I expect that your customer is not firm in statistics...
I suspect that anyone who thinks that doing a capability study of 30 pieces is OK is "...not firm in statistics."


#### Semoi

##### Involved In Discussions
My statistics prof used to say that a statistic without an uncertainty is as useful as a measurement result without a unit (and yes, I know that dimensionless physical quantities exist, but that was not what he was referring to).
My argument was along this line of thought. Thus, I don't believe it is fair to quote only the first part of my sentence and omit the second half. In addition, if you have a mathematical argument for why the minimum number of parts should be N=??? that is not related to an uncertainty, coverage interval, or confidence interval, I would be very interested.
My second request: could you provide a reference or link to "the standard rules" for Cpk calculations? I was always under the impression that the rules differ between industries (automotive, medical devices, semiconductor, ...). I would love to study up and learn. After all, this is why I am on this website.

#### Jim Wynne

Have a look at this, from the NIST/SEMATECH e-Handbook of Statistical Methods.

#### Miner

##### Forum Moderator
There are some complexities in this scenario that are easily misunderstood, and I will try to clarify them. There are several sources of variation here, and the problem is to identify which sources should be included in the capability study and which should be separated out.

Sources of variation:
• Sw - Within part variation in thickness (P1- P5)
• Sb1 - Part to part variation (within one cavity of mold)
• Sb2 - Part to part variation (between cavity to cavity of mold)
If you were to collect 100 parts (rational subgroups in time sequence in statistical control) and measure them in random locations, you would include all of these sources in your study.

Sobserved^2 = Sw^2 + Sb1^2 + Sb2^2

The question to be answered is whether to include all of these sources of variation or whether to separate them out. My advice is to start by including all of them. If you can demonstrate capability with all sources included, you can stop. If you cannot demonstrate capability, you then perform special studies to isolate the sources of variation so you can address them. This would best be done by a DOE with each of these three sources of variation as factors in the experiment.

NOTE: Measurement system variation is yet another source of variation, but should have been addressed prior to your capability study.
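As a rough illustration of separating the sources above, here is a minimal Python sketch for a simplified two-level case only (within part and part to part, a single cavity). It uses the classical balanced one-way ANOVA estimates; the function name `variance_components` and all numeric values are hypothetical and chosen only for the demonstration. Cavity-to-cavity variation would need an additional nesting level that is not shown here.

```python
import numpy as np

def variance_components(x):
    """Estimate within-part and part-to-part variance components.

    x: 2-D array, one row per part, one column per measurement location.
    Uses balanced one-way ANOVA (method of moments) estimates.
    """
    x = np.asarray(x, dtype=float)
    n_parts, n_locations = x.shape
    part_means = x.mean(axis=1)
    # Pooled within-part variance (mean square within)
    ms_within = x.var(axis=1, ddof=1).mean()
    # Mean square between parts
    ms_between = n_locations * part_means.var(ddof=1)
    var_within = ms_within
    # Negative estimates are truncated to zero
    var_between = max((ms_between - ms_within) / n_locations, 0.0)
    return var_within, var_between

# Simulated data with hypothetical standard deviations (assumed values)
rng = np.random.default_rng(1)
true_sw, true_sb = 0.02, 0.05
part_offsets = rng.normal(0.0, true_sb, size=(200, 1))
data = 1.00 + part_offsets + rng.normal(0.0, true_sw, size=(200, 5))

var_w, var_b = variance_components(data)
s_observed = np.sqrt(var_w + var_b)  # variances add, standard deviations do not
print(f"Sw ~ {np.sqrt(var_w):.4f}, Sb ~ {np.sqrt(var_b):.4f}, Sobserved ~ {s_observed:.4f}")
```

With real data, a designed experiment or a nested ANOVA in a statistics package would give the same kind of breakdown with confidence intervals.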

#### Tahirawan77

##### Involved In Discussions
Hi Miner,

This is a very good suggestion, and the approach is more practical and easier to discuss with the customer.
In my case there is only one cavity / mold, so the major sources of variation are:

• Sw - Within part variation in thickness (P1- P5)
• Sb1 - Part to part variation (within one cavity of mold)
I guess I first need to compare the within-part variation to the part-to-part variation. Ideally the within-part standard deviation should be less than the part-to-part standard deviation (grand std. deviation). If it is not, does that indicate that the special/assignable cause lies within the individual parts and not in the process itself?

Then I can also plot the average of the within-part thickness values (P1-P5) for each part on an I-MR chart to see whether the process is stable over time, and plot the standard deviation of the within-part thickness values (P1-P5) on an R chart to check for process deviation over time?

Any comments on the above approach?

#### Miner

##### Forum Moderator
That is definitely a valid approach. See Dr. Wheeler's article on the Three-way chart for guidance.
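For readers who want to try the idea, here is a minimal Python sketch of the limits for a three-way chart, assuming each part is a subgroup of five thickness readings (P1-P5). The part averages go on an individuals chart with limits from their moving ranges, and the within-part ranges go on a separate range chart. The function name `three_way_limits` is hypothetical; the constants are the usual tabulated control chart factors.

```python
import numpy as np

# Tabulated control chart constants
D4_N5 = 2.114  # upper limit factor for ranges, subgroup size 5 (D3 = 0)
E2_MR = 2.660  # 3 / d2 with d2 = 1.128 for moving ranges of two values
D4_MR = 3.267  # upper limit factor for moving ranges of two values

def three_way_limits(x):
    """Return (lower, center, upper) limits for the three charts.

    x: 2-D array, one row per part in time order, five columns (P1-P5).
    """
    x = np.asarray(x, dtype=float)
    if x.ndim != 2 or x.shape[1] != 5:
        raise ValueError("expected one row per part with five readings")
    part_means = x.mean(axis=1)
    part_ranges = x.max(axis=1) - x.min(axis=1)
    moving_ranges = np.abs(np.diff(part_means))
    center = part_means.mean()
    mr_bar = moving_ranges.mean()
    r_bar = part_ranges.mean()
    return {
        # Part averages treated as individual values (part-to-part variation)
        "x_chart": (center - E2_MR * mr_bar, center, center + E2_MR * mr_bar),
        # Moving ranges of the part averages
        "mr_chart": (0.0, mr_bar, D4_MR * mr_bar),
        # Within-part ranges (thickness uniformity across P1-P5)
        "r_chart": (0.0, r_bar, D4_N5 * r_bar),
    }

# Tiny made-up data set, for illustration only
example = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [1, 2, 3, 4, 5]]
limits = three_way_limits(example)
for chart, (lcl, cl, ucl) in limits.items():
    print(f"{chart}: LCL={lcl:.3f}  CL={cl:.3f}  UCL={ucl:.3f}")
```

The sketch uses ranges rather than standard deviations for the within-part chart; that is a choice made here for simplicity.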

#### Matt Savage

##### Trusted Information Resource
You stated that the thickness should be uniform across the surface profile. Assuming your subgroup size is 5, the range chart will show you if the thickness is uniform or not.

What question are you trying to answer? It appears that you do not need to know how P1 is doing compared to P2 and the others. If you do want to know this, you will want to do a capability study for all P1 values, another for all P2 values, and so on.

If the process is in control, read on; otherwise stop and get the process in control first. You have 30 samples with a subgroup size of 5. This is enough data. If you can get more data, do so, but if the process is in statistical control, the results will not change much.

There is one, and only one way to calculate the variability factor (sigma) that is used with Cpk. That way uses the average range, R-bar, divided by the constant, d2. If you are calculating the variability by some other means, you are not calculating Cpk correctly. (Most likely, you would be calculating Ppk.)
Getting back to the original question, “I want to calculate the capability of my process”: you should have one Cpk value based on the (30 * 5) data values.
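To make the distinction concrete, here is a minimal Python sketch that computes Cpk from R-bar/d2 and Ppk from the overall sample standard deviation for subgroups of size 5. The function name `cpk_and_ppk`, the data, and the specification limits are hypothetical and only serve the illustration; d2 = 2.326 is the tabulated constant for subgroups of five.

```python
import numpy as np

D2_N5 = 2.326  # tabulated d2 constant for subgroup size 5

def cpk_and_ppk(x, lsl, usl):
    """Return (Cpk, Ppk) for subgrouped data.

    x: 2-D array, one row per subgroup, five values per row.
    Cpk uses sigma = R-bar / d2; Ppk uses the overall sample standard deviation.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim != 2 or x.shape[1] != 5:
        raise ValueError("expected subgroups of size 5")
    grand_mean = x.mean()
    r_bar = (x.max(axis=1) - x.min(axis=1)).mean()
    sigma_within = r_bar / D2_N5
    sigma_overall = x.std(ddof=1)  # all values pooled into one sample
    nearest = min(usl - grand_mean, grand_mean - lsl)
    return nearest / (3.0 * sigma_within), nearest / (3.0 * sigma_overall)

# Made-up data: 30 subgroups of 5, hypothetical spec limits 0.90 to 1.10
rng = np.random.default_rng(7)
thickness = rng.normal(1.00, 0.02, size=(30, 5))
cpk, ppk = cpk_and_ppk(thickness, lsl=0.90, usl=1.10)
print(f"Cpk = {cpk:.2f}, Ppk = {ppk:.2f}")
```

For a stable process the two indices should be close; a Ppk noticeably below Cpk suggests variation between subgroups that the within-subgroup ranges do not capture.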

#### Semoi

##### Involved In Discussions
There is one, and only one way to calculate the variability factor (sigma) that is used with Cpk. That way uses the average range, R-bar, divided by the constant, d2. If you are calculating the variability by some other means, you are not calculating Cpk correctly. (Most likely, you would be calculating Ppk.)
I know that such clear statements are comforting. However, I wonder how you are able to be so confident. I expect that you are referring either to the manual of a statistics program (Minitab, SPS, etc.) or to a document for a specific industry (e.g. AIAG). However, I have never read a scientific paper in which such a statement was backed up. In contrast, the papers I have read most often conclude that the R/d2 result is nearly as good as the s/c4 result if the sample size is small and the data follow a normal distribution. However, once the sample size increases or the data deviate from the normal distribution, the R/d2 result picks up a bias quicker than the s/c4 calculation.

In order to check this, I used R and performed a quick cross check:
* sample size: 10 and 20 (these are two simulations)
* number of loops: 2000

Both times the s/c4 is a better estimate for the standard deviation of the population.
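The original check was done in R and its code is not shown. The following is an independent Python sketch of the same kind of simulation, assuming standard normal data and a single sample per loop. The function names and the seed are hypothetical; the d2 values are the tabulated constants for n = 10 and n = 20, and c4 is computed from its gamma-function formula.

```python
import math
import numpy as np

D2 = {10: 3.078, 20: 3.735}  # tabulated d2 constants

def c4(n):
    """Bias correction constant for the sample standard deviation."""
    return math.sqrt(2.0 / (n - 1)) * math.exp(
        math.lgamma(n / 2.0) - math.lgamma((n - 1) / 2.0)
    )

def compare_estimators(n, n_loops=2000, sigma=1.0, seed=0):
    """Simulate R/d2 and s/c4 on normal samples of size n."""
    rng = np.random.default_rng(seed)
    samples = rng.normal(0.0, sigma, size=(n_loops, n))
    est_range = (samples.max(axis=1) - samples.min(axis=1)) / D2[n]
    est_s = samples.std(axis=1, ddof=1) / c4(n)
    return {
        "mean_R_d2": est_range.mean(),
        "var_R_d2": est_range.var(ddof=1),
        "mean_s_c4": est_s.mean(),
        "var_s_c4": est_s.var(ddof=1),
    }

for n in (10, 20):
    result = compare_estimators(n)
    print(n, {key: round(value, 4) for key, value in result.items()})
```

Under these assumptions both estimators should average close to the true sigma, and the spread of R/d2 should come out larger than that of s/c4, more so at n = 20 than at n = 10.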

#### Semoi

##### Involved In Discussions
the R/d2 result picks up a bias quicker than the s/c4 calculation
This is not true, because the bias is a concept for n --> infinity. What I meant is that the standard deviation of the estimator R/d2 is larger than for the estimator s/c4. As a formula: Var[R/d2] > Var[s/c4].

#### Matt Savage

##### Trusted Information Resource
Before the 1980s, when personal computers were not as prevalent, the only practical way to calculate the dispersion statistic (sigma) was to use the subgroup variability, defined as R-bar (or MR-bar for a subgroup size of 1) divided by d2. It is indisputable that calculating the dispersion statistic using other methods was not practical without the aid of computers or calculators. Although hand-held calculators were available, they were not widely used then.

The precursor to the AIAG’s (Automotive Industry Action Group) Statistical Process Control reference manual was Ford’s “Continuing Process Control and Process Capability Improvement” manual. As of the December 1987 printing, Cpk, Cp, and Cr were the only indices in Ford’s booklet. These capability statistics were calculated using the “Process Standard Deviation (sigma hat) = R-bar/d2.” (p. 24-26a.)

AIAG Second Edition (1992) shows Ppk, Pp, and Pr, which use the measure of variation that they refer to as “Total Process Variation”, calculated using SQRT[SUM(Xi - X-bar)^2 / (n - 1)]. AIAG is also consistent with Ford’s manual for calculating Cpk, Cp, and Cr.

In my opinion, AIAG became the standard to follow for Process Capability and Process Performance. Many reputable textbooks show capability calculations (Cpk, Cp, and Cr) that use a “process standard deviation” calculated using R-bar/d2 or S-bar/c4.

Some references:
• Understanding Statistical Process Control - Wheeler
• Tools and Methods for the Improvement of Quality - Gitlow, Gitlow, Oppenheim, and Oppenheim
• Quality Control Handbook, Third Edition - Juran (9-19)