Cp/Cpk vs Pp/Ppk (Short term using population sigma) - Formulas to use?
LoudRed 7th January 2002, 04:32 PM OK, now I'm totally confused about Cp/Cpk and Pp/Ppk. :confused: What I'm messed up on is which formulas to use, and which symbol goes with which method.
I know there are the two regular ways to calculate standard deviation. That is, using the "n" denominator for population SD, and "n-1" for sample SD. I also know there is a sigma estimated as Rbar/d2, and a sigma-hat. Does either of these correspond to the two regular ways, for calculating either Cp/Cpk or Pp/Ppk? If not, then what are the formulas?
Also of interest to me concerns individual data and moving range.
Most of the data we've collected are individual data and not multiple readings for sub-group data. Below I've listed individual data for 30 pieces. At the point these data were taken, it was a population. I've attached an EXCEL file with all the data if you want to look at it along with my formulas. Please note that if you do, I only used the "mean - LSL" for my Cpk/Ppk formulas as all the data shows it to be in the bottom half of the spec.
Piece L L Rng (moving range)
1 79.4
2 79.9 0.5
3 80.7 0.8
4 81.2 0.5
5 81.2 0.0
6 82.5 1.3
7 81.2 1.3
8 80.5 0.7
9 81.5 1.0
10 80.7 0.8
11 80 0.7
12 81 1.0
13 80.3 0.7
14 80.7 0.4
15 80.9 0.2
16 80.7 0.2
17 81 0.3
18 81.7 0.7
19 82.2 0.5
20 81 1.2
21 81 0.0
22 80.8 0.2
23 81 0.2
24 79 2.0
25 83.5 4.5
26 80.2 3.3
27 80.4 0.2
28 79.8 0.6
29 80.2 0.4
30 78 2.2
Min 78.00 0
Max 83.50 4.5
Mean 80.74 0.91
Median 80.75
USL 100
LSL 70
Std. Dev (population) 1.013
Std. Dev (sample) 1.030
Std. Dev. Rbar/d2 0.455
Cp (Rbar/d2) 10.985
Cp (pop) 4.937
Pp 4.854
Cpk (Rbar/d2) 7.865
Cpk (pop) 3.535
Ppk 3.475
First of all, did I calculate using the correct formulas? If so, I don't understand why the SD (sample) is used for Pp/Ppk when the data are taken from a population. I've also grouped the data into 15 subsets of 2 and 10 subsets of 3 and gotten the following results. Are these correct also?
If I've used all the correct formulas, then what I got for Cp/Cpk should use the Rbar/d2 formula, while the Pp/Ppk uses the population SD (i.e., "n" in the denominator). Please advise either way. Thanks for any assistance.
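For anyone who wants to re-run the table above without the spreadsheet, here is a minimal Python sketch. It assumes the 30 readings are entered in the order listed, takes d2 = 1.128 (the usual control chart constant for a moving range of two consecutive points), and, like the original, uses only the lower side (mean - LSL) for the "k" indices. The function name and dictionary keys are made up for illustration. One thing it suggests: the posted 0.455 matches the average moving range divided by 2, whereas dividing by d2 = 1.128 gives roughly 0.807, which would put the Rbar/d2 based Cp and Cpk near 6.2 and 4.4 rather than 10.985 and 7.865.

```python
import statistics

# The 30 individual readings from the post, in the order listed.
readings = [79.4, 79.9, 80.7, 81.2, 81.2, 82.5, 81.2, 80.5, 81.5, 80.7,
            80.0, 81.0, 80.3, 80.7, 80.9, 80.7, 81.0, 81.7, 82.2, 81.0,
            81.0, 80.8, 81.0, 79.0, 83.5, 80.2, 80.4, 79.8, 80.2, 78.0]
USL, LSL = 100.0, 70.0
D2_N2 = 1.128  # d2 constant for ranges of size 2 (assumed for the moving range)


def capability_summary(x, usl, lsl, d2=D2_N2):
    """Illustrative helper: sigma estimates and lower-side indices for individuals data."""
    mean = statistics.fmean(x)
    moving_ranges = [abs(b - a) for a, b in zip(x, x[1:])]
    mr_bar = statistics.fmean(moving_ranges)
    sigma_within = mr_bar / d2          # MRbar / d2
    sigma_n = statistics.pstdev(x)      # "population" formula, n in the denominator
    sigma_n1 = statistics.stdev(x)      # "sample" formula, n-1 in the denominator
    spread = usl - lsl
    lower = mean - lsl                  # only the lower side, as in the post
    return {
        "mean": mean,
        "mr_bar": mr_bar,
        "sigma_within": sigma_within,
        "sigma_n": sigma_n,
        "sigma_n1": sigma_n1,
        "Cp": spread / (6 * sigma_within),
        "Cpk": lower / (3 * sigma_within),
        "Pp": spread / (6 * sigma_n1),
        "Ppk": lower / (3 * sigma_n1),
    }


summary = capability_summary(readings, USL, LSL)
for name, value in summary.items():
    print(f"{name:13s} {value:8.3f}")
```

The two "regular" standard deviations and the Pp/Ppk figures reproduce the posted values; only the Rbar/d2 rows differ, and only because of the divisor.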
Jim Biz 8th January 2002, 10:10 AM Bump-up for our stat gurus:
My memory is really fuzzy here - I do have a software program that figures Ppk for me but our customers still rely on Cpk for information so I seldom use it.
Can anyone else give a clear explanation??
Atul Khandekar 8th January 2002, 11:14 AM Cp/Cpk use sigma calculated by the RBar/d2 formula.
For Ppk, sigma is calculated using the formula:
http://www.symphonytech.com/articles/images/1100_2.gif
The difference is the way sigma is calculated.
- Ppk attempts to answer the question "does my current production sample meet specification ?" Process performance indices should only be used when statistical control cannot be evaluated.
- On the other hand, Cpk attempts to answer the question "does my process in the long run meet specification?" Process capability evaluation can only be done after the process is brought into statistical control. The reason is simple: Cpk is a prediction, and one can only predict something that is stable.
You can get more details in the article 'Measuring Your Process Capability' at http://www.symphonytech.com/articles/processcapability.htm
-Atul.
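In case the linked image does not display, the usual textbook forms of the two sigma estimates and the four indices can be written out as below. This is a sketch of the standard definitions, assumed rather than copied from the image; the notation is mine, with x-bar standing for the overall average and R-bar for the average subgroup (or moving) range.

```latex
\hat{\sigma}_{\bar{R}/d_2} = \frac{\bar{R}}{d_2},
\qquad
\hat{\sigma}_{s} = s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}

C_p = \frac{USL - LSL}{6\,\hat{\sigma}_{\bar{R}/d_2}},
\qquad
C_{pk} = \min\!\left( \frac{USL - \bar{x}}{3\,\hat{\sigma}_{\bar{R}/d_2}},\;
                      \frac{\bar{x} - LSL}{3\,\hat{\sigma}_{\bar{R}/d_2}} \right)

P_p = \frac{USL - LSL}{6\,\hat{\sigma}_{s}},
\qquad
P_{pk} = \min\!\left( \frac{USL - \bar{x}}{3\,\hat{\sigma}_{s}},\;
                      \frac{\bar{x} - LSL}{3\,\hat{\sigma}_{s}} \right)
```

The only difference between the C and P families in this form is which sigma estimate sits in the denominator.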
Al Dyer 8th January 2002, 11:32 AM Cpk: Long term using Rbar/d2
Ppk: Short term using population sigma
Dave Strouse 8th January 2002, 11:47 AM This raises some of the most troubling questions around.
I'll take a first cut at some of them.
First of all, I'd like to find every author who ever published anything about a "population" and "sample" standard deviation, sit them in an electric chair and gleefully pull the handle. They really don't have a whole lot to do with populations and samples. These are merely statistics used to describe the data.
The reason there are two main expressions is that one is a "biased estimator" and the other is not. If you want to know what that means, be prepared to study at least one or two semesters of mathematical statistics and even then you may be confused.
DON'T GO THERE!:frust:
Let's get real in the application sense: "How much real practical difference is there between using a standard deviation with n in the denominator versus one with (n-1)?" With thirty points the two disagree by under 2%. Will you send a rocket off course and miss the moon by this "error" either way? NO, and in fact your daily production won't be affected in any way. My advice in industrial situations is to always use the (n-1) formula.
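The "under 2%" figure can be checked directly, because the two formulas differ only by the factor sqrt(n/(n-1)) regardless of the data. A small Python sketch (the function name is just for illustration):

```python
import math


def n_vs_n_minus_1_gap(n):
    """Relative amount by which the (n-1) standard deviation exceeds the n version."""
    return math.sqrt(n / (n - 1)) - 1.0


# Print the gap for a few sample sizes; it shrinks as n grows.
for n in (5, 10, 30, 100):
    print(f"n = {n:3d}: (n-1) formula is larger by {n_vs_n_minus_1_gap(n):.2%}")
```

For n = 30 the gap comes out at about 1.7%, which agrees with the two standard deviations in the original post (1.013 versus 1.030).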
Another point raised is Cpk versus Ppk. The intent is to show long term "capability" versus short term.
For short term an average dispersion statistic of the "within subgroups" data needs to be used. For individual data "groups" use the standard deviation estimate given by average moving range adjusted by the control chart factors. When you have actual subgroups use standard deviation estimate given by range adjusted by the control chart factors.
For long term a dispersion statistic of all data needs to be used to capture the variability "between subgroups". For individual data "groups" or actual groups use the standard deviation estimate given by the "sample" standard deviation.
Davis Bothe has published an 800 plus page book on process capability and the whole concept is confusing. There are as many capability "indices" out there as there are processes it seems.
You should probably ask yourself real hard questions like "What do I intend to do with these indices?"
IMHO they are often used to satisfy artificial reporting requirements. In this sense they allow the clueless to keep the blind informed. :biglaugh:
A proper use of Cpk etc. in my opinion would be to measure process improvements as they are instituted. Using these measures in a comparative sense is perhaps useful.
One place I worked took Cpk measurements of 5 to 10 different assembly lines in each plant and averaged the Cpks. Trust me that this is mathematically incorrect. Yet, when I tried to make management aware of this, I was told that they knew it was wrong, but upper management wanted a single number to evaluate! :bonk:
BTW, there is a way to combine different processes Cpk into an overall Cpk, but it is not by the simple average.
As far as grouping the data together, what rational basis have you to do this? Again, I think you want to know what you are doing with the answers before you just crunch numbers.
Finally, I don't know what your process is, but I doubt if thirty points is anywhere near enough to estimate long term variability. I'm moving my office to another building so don't have any references available, but I think the common practice is to have a minimum of 100 points to talk about long term capability. Of course this number will depend on if you feel you have captured the variability long term of the process or not.
MarkR 10th January 2002, 01:47 AM If your data comes from a control chart and represents a reasonable amount of production time (I agree that about 100 data points is good) then you always use Cp/Cpk. The Rbar/d2 estimator factors out the long term variation by using Rbar. Rbar is based on differences within subgroups, i.e., short term variability.
If your data comes from a short term study, where you went into the shop and gathered consecutive parts from the process, then always use Pp/Ppk. Consecutive parts represent only short term variation.
In summary:
Data from control charts uses Cp/Cpk
Data from short term studies uses Pp/Ppk
AJLenarz 10th January 2002, 12:57 PM I have always enjoyed the "QS" spin on the definitions. To me it just made so much sense and makes this whole thing cut and dried.
Cpk – The capability index for a stable process. The estimate of sigma is based on within subgroup variation. Cpk can only be calculated when the process is stable.
Ppk – The performance index. The estimate of sigma is based on total variation. Ppk is to be calculated if less than 100 samples or when the process is chronically unstable but meeting the specifications and in a predictable pattern.
:bonk:
LoudRed 11th January 2002, 12:35 PM My thanks to all of you who answered. I appreciate the help.
I sometimes feel that I could really relate to the one statement about the clueless leading the blind.
Again, thanks for the help everybody.
:cool:
KenK 23rd January 2002, 10:42 AM I would tend to see AJLenarz's description of Cpk and Ppk as being the most "usable". In my own words:
Ppk is the actual PERFORMANCE of your process, incorporating all observed variation.
Cpk is CAPABILITY of your process IF all instability was removed (or ignored).
Cp is the best your current process could do IF all instability was removed AND it was centered.
AJ mentioned that Cpk can only be used if the process is stable. I'm not sure what that means. If it means that your control charts shouldn't be "alarming" then I agree, but there will always be some amount of apparent instability in the process - variation between subgroups.
When I say "apparent" I mean that what looks like instability may really be just random between subgroup variation. That's what the control chart is supposed to help separate.
In general I recommend people calculate and report both Cpk AND Ppk since they mean two different things and both provide information.
I also strongly recommend against reporting only Cpk to customers (unless they specifically ask for just Cpk). In my mind, reporting only Cpk is sort of cheating - making the customer think the process is more capable than it really is by ignoring between-subgroup variation.
By the way, I've never felt comfortable about the terms "short-term variation" and "long-term variation" since it seems they can easily be misunderstood. I prefer the terms "within subgroup variation" and "overall variation".
MINITAB users may have noticed that release 12 used short-term & long-term when referring to the different variance forms, but release 13 switched to using within and overall instead. I applaud that change.
Rick Goodson 23rd January 2002, 12:39 PM Interesting 'discussion' so far. Let me see if I can stir up the pot.
Why are we interested in process capability? Under the Deming philosophy of never-ending quality improvement it would be to seek methods to continually reduce '6 sigma'. In the automotive arena (read AIAG) they have the same basic philosophy. From the AIAG SPC manual page 1 "First, gathering data and using statistical methods to interpret them are not ends in themselves. The overall aim should be to increase understanding of the reader's processes. It is very easy to become technical experts without realizing any improvements. Increased knowledge should become a basis for action". Now with that said (and I am sure there will be some divergent opinions on the veracity of that statement) the reason for using Cpk or Ppk can be discussed.
The difference between the two indices lies in the denominator, 6 sigma hat sub R-bar/d2 versus 6 sigma hat sub s (reference AIAG SPC page 80). Please note that the term 'hat' is used in both formulas. Statistically speaking, hat means an estimate; therefore it is based not on the whole population but only on a sample. As Dave Strouse pointed out, this population/sample thing is just a red herring that confuses people. It all has to do with how 'sigma' is calculated. Even AIAG tries to confuse people with the terminology. Nevertheless...
PROCESS CAPABILITY is defined as the 6 sigma range of a process's inherent variation, where sigma is usually estimated by R-bar/d2 and where inherent variation is defined as that portion of process variation due to common causes only (reference page 79 & 80). PROCESS PERFORMANCE is defined as the 6 sigma range of a process's total variation where sigma is usually estimated by s, the sample standard deviation (reference page 79 & 80). So...
Process capability is Cpk, Process performance is Ppk.
Process capability is an idealistic state assuming that all variation is due to common cause only and the process is centered. It is measured by taking variation measurements over TIME from a process that is statistically stable (only common cause variation present).
Process performance is the actual state of the process at some moment in time. In essence a snap shot of the process now. In a minute, hour, day, or week later it probably will be different.
Cpk is an historical record of the processes used as a predictor of the future. Ppk is how the process is actually performing at the time you made the measurements.
Regards,
Rick
Atul Khandekar 23rd January 2002, 12:56 PM Rick Goodson wrote: "Process capability is an idealistic state assuming that all variation is due to common cause only and the process is centered. It is measured by taking variation measurements over TIME from a process that is statistically stable (only common cause variation present)."
Agree with you completely, Rick. However, we do have two different process capability indices, Cp and Cpk. Cp just compares spec width with process spread but does not provide any indication of centering. So Cp alone is not enough. With Cpk we have some indication of center shift.
-Atul.
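The centering point can be seen with a few lines of Python. This is only a sketch: the spec limits are borrowed from the original post, while the sigma of 2.0 and the three process means are hypothetical numbers picked to make the arithmetic obvious.

```python
def cp(usl, lsl, sigma):
    """Spec width over process spread; the process mean never enters."""
    return (usl - lsl) / (6.0 * sigma)


def cpk(usl, lsl, mean, sigma):
    """Distance from the mean to the nearer spec limit, in units of 3 sigma."""
    return min(usl - mean, mean - lsl) / (3.0 * sigma)


USL, LSL = 100.0, 70.0   # spec limits from the original post
SIGMA = 2.0              # hypothetical sigma estimate

# Same spread, three different centerings (hypothetical means).
for mean in (85.0, 80.0, 75.0):
    print(f"mean = {mean:5.1f}   Cp = {cp(USL, LSL, SIGMA):.3f}   "
          f"Cpk = {cpk(USL, LSL, mean, SIGMA):.3f}")
```

Cp stays at 2.5 in all three cases, while Cpk equals Cp only when the mean sits at mid-spec (85) and drops as the mean moves toward the lower limit.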
Rick Goodson 23rd January 2002, 01:02 PM Atul,
Good point. Thanks for the clarification. By the way, I looked at the website you referenced. Excellent article by Dr. Mehernosh Kapadia. Thanks for the tip.
Regards,
Rick
Brian Dowsett 2nd April 2002, 06:44 AM Here's a thing.....
Back in my Ford days, Ppk meant Preliminary Process Capability, and was an initial assessment prior to getting the SPC up and running for Cpk.
However, current thinking seems to say that Ppk means "Performance index" and is used to assess actual process performance OVER TIME.
This is confirmed in the QS9000 SPC manual (page 80 in my old copy) and also in Minitab software (which calls Ppk long term) and also in my black belt training.
Formulas are as they always were.
Big difference is that I used to expect Ppk to be bigger than Cpk, now it's the opposite.
Funny old world......
Sam 2nd April 2002, 10:23 AM Brian,
Same here.
I pulled out my old copy of "Continuing Process Control and Process capability improvement" by Ford and found nothing, in the form of an equation, relating to Cp, Pp, or Ppk.
Atul Khandekar 2nd April 2002, 10:38 AM I am still under the impression that Cpk (estimated with Rbar/d2) is long term and Ppk (...n-1) the short term capability. Minitab's interpretation is the other way. I must say this is confusing.
Is it:
1. Ppk as in preliminary capability: you may take n data points consecutively. Use the (n-1) formula.
2. Ppk (Minitab version): n data points come from an SPC study, over time. Use the (n-1) formula.
3. Cpk: n data points come from an SPC study, over time. Use subgrouping and estimate with Rbar/d2.
:confused:
Help!!
KenK 2nd April 2002, 11:35 PM Forget the "long term" vs "short term" thing. I don't know where it originated (M. Harry?), but using the definitions of Cpk and Ppk from the AIAG SPC Reference Manual, they just don't apply.
Even MINITAB no longer uses long-term & short-term. As of Release 13 they changed their terminology so it now associates Cpk with the descriptor "Potential (Within)" and Ppk with the descriptor "Overall".
The user can calculate Cpk and Ppk at any time, assuming subgroups are involved. Even if subgroups aren't involved, MINITAB calculates the Cpk using a standard deviation estimate based upon a moving range of size two (see the Estimate option).
Think of Ppk as being the actual performance of the process since its estimate of variation includes ALL observed variation - it has absolutely nothing to do with being "preliminary".
Think of the Cpk as the potential performance, or the potential capability. That is, it is what we'd get for the Ppk IF we removed all instability between subgroups (that is, the control chart line is completely flat).
Think of the Cp as the best potential performance. Not only does it assume the process is completely stable, but it also assumes it is perfectly centered.
Jim Jakosh 4th April 2002, 03:51 PM You are not alone in your confusion. I have to explain that very thing to our folks because I created electronic X bar and R charts and capability study charts and they use different calculations.
Here is the scoop I've found.
Cpk for short term capability, using the square root of the sum of the squared differences over (n-1), is the sample deviation when YOU DON'T KNOW THE STABILITY of the process from which you got the samples. They could have common and special causes at work. This is referred to as process performance (Ppk) by the automotive group.
The same formula over (n) is the capability of the population, when you measure every part in the population (not just a sample) and you don't know its STABILITY.
Calculating the Cpk using Rbar/d2 as the std. dev. assumes you have a STABLE population. It is generally used with an X bar and R chart where you can see the pattern and throw out any unstable points and do the calculation.
We distinguish between them as short term and ongoing capability. If the population from which you drew the short term samples is stable, the two values of Cpk (with (n-1) and Rbar/d2) should be very close after you measure 100 or more samples.
Thanks, Jim Jakosh
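The last claim, that the two sigma estimates land close together for a stable process once you have 100 or more readings, can be tried out on simulated data. The sketch below is an illustration under assumptions, not anyone's production method: it draws 25 subgroups of 5 from a single normal distribution (mean 50, sigma 1, both made up), uses the standard d2 constants for subgroup sizes 2 to 5, and compares Rbar/d2 with the (n-1) standard deviation of all 125 points.

```python
import random
import statistics

# Standard control chart d2 constants by subgroup size.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}


def sigma_within(subgroups):
    """Rbar/d2: average subgroup range scaled by d2 (equal subgroup sizes assumed)."""
    size = len(subgroups[0])
    r_bar = statistics.fmean([max(g) - min(g) for g in subgroups])
    return r_bar / D2[size]


def sigma_overall(subgroups):
    """(n-1) standard deviation of all readings pooled together."""
    return statistics.stdev([x for g in subgroups for x in g])


random.seed(1)  # fixed seed so the run is repeatable
# Hypothetical stable process: every subgroup comes from the same distribution.
stable = [[random.gauss(50.0, 1.0) for _ in range(5)] for _ in range(25)]

within = sigma_within(stable)
overall = sigma_overall(stable)
print(f"Rbar/d2 = {within:.3f}, s (n-1) = {overall:.3f}, ratio = {within / overall:.3f}")
```

With no shifts between subgroups, the ratio should sit near 1; it moves away from 1 as soon as the subgroup averages start to wander.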
Chemlab 27th April 2002, 04:52 AM Not exactly the same topic, but sort of. Have a few questions about setting tolerances and control limits.
1. Our customers often put TBD on the print for physical properties of the material until after 25 batches. For each batch, 5 samples are taken, and after 25 batches we are expected to recommend a specification based on Cpk = 1.33. I have a problem with this that I think is legitimate, and I am trying to make the customer understand it. First of all, the samples are not 25 groups of 5 from the same batch of material over time. If 100 samples are taken from the same batch, a nice normal curve is seen. However, calculating Cpk based on Rbar/d2 does not take into account subgroup-to-subgroup variation, the subgroups here being different batches. A lot of the variation is a mean shift from batch to batch, and the statistics are being done on different populations: different batches of material, run through several independent processes that could affect the measurements. Not only are different batches (from each of which only 5 samples are taken) being lumped into the equation, but also several different setups over time. The result is usually a non-normal distribution. The problem is this: given a specification, if I were to sample lots of parts from a particular batch, the capability is fine because 'sigma' is tight. But if the data from the 25 batches are treated as if they came from one batch (population), then on paper it looks terrible if Ppk is calculated, yet so wonderful for Cpk that the specifications are made too tight. I understand that batch-to-batch variation needs to be eliminated; I am just saying that I don't think this is exactly right. We had Chrysler come in some time back, and they argued with our Tier 1 customer that statistical specifications could not be set for this application and that only go/no-go limits could be used.
Anyone know how to properly handle the statistics or do I have it all wrong?
2. Much like #1 above, but with regard to calculating control limits. We have a grinder that, once set up, is VERY capable, with the variation taking up very little of the tolerance. I could sample parts from now to eternity and Rbar would be very small. On the next setup, after running a different part, the same capability would be seen. Using Rbar/d2 from the various setups, the calculated control limits are so tight that the operator has a very hard time just setting up between these limits, because the formula does not take into account different setups, which are different populations. However, we can't make new control charts for each new setup because of the short time involved. I've briefly looked at short-term SPC. Is this what I need, something else, or am I totally misunderstanding something?
Thanks.
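The situation in question 1 (tight variation inside each batch, with the batch average shifting from batch to batch) is easy to reproduce with made-up numbers. The Python sketch below is only an illustration: the five batches, the spec limits of 5 and 15, and the helper names are all hypothetical, and d2 = 2.326 is the standard constant for subgroups of five.

```python
import statistics

D2_N5 = 2.326  # standard d2 constant for subgroups of size 5

# Hypothetical data: each batch is tight (range about 0.2) but the batch means differ.
batches = [
    [10.0, 10.1, 9.9, 10.0, 10.1],
    [12.0, 12.1, 11.9, 12.0, 12.1],
    [8.0, 8.1, 7.9, 8.0, 8.1],
    [11.0, 11.1, 10.9, 11.0, 11.1],
    [9.0, 9.1, 8.9, 9.0, 9.1],
]
USL, LSL = 15.0, 5.0  # hypothetical spec limits

all_readings = [x for batch in batches for x in batch]
grand_mean = statistics.fmean(all_readings)

# Within-batch sigma (Rbar/d2) sees only the spread inside each batch.
r_bar = statistics.fmean([max(b) - min(b) for b in batches])
s_within = r_bar / D2_N5

# Overall sigma (n-1 formula on all readings) also sees the batch-to-batch shifts.
s_overall = statistics.stdev(all_readings)


def k_index(mean, sigma, usl, lsl):
    """Nearer spec limit distance over 3 sigma; Cpk or Ppk depending on sigma used."""
    return min(usl - mean, mean - lsl) / (3.0 * sigma)


cpk = k_index(grand_mean, s_within, USL, LSL)
ppk = k_index(grand_mean, s_overall, USL, LSL)
print(f"within sigma = {s_within:.3f}, overall sigma = {s_overall:.3f}")
print(f"Cpk = {cpk:.2f}, Ppk = {ppk:.2f}")
```

With these numbers Cpk comes out far above 1.33 while Ppk falls below it, which is the gap described above: a specification derived from the within-batch sigma alone would be much tighter than the batch-to-batch behaviour can hold.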
sxbalasu 4th November 2002, 06:10 PM As far as my understanding goes, Cpk is the process capability index we get from the data we collect from our control charts on a daily basis. We use Rbar/d2 to get the sigma value. Ppk, basically, is calculated when we do a PPAP. That is when we run 300 components with production tooling and setup, check 100 random parts from that run, and calculate sigma with n-1; this shows the preliminary process capability index. What I did when I did PPAP was get the Pp and Ppk values for the critical characteristics (variable), and if they were below 2.00 then I used an Xbar-R control chart and monitored Cp and Cpk on a monthly basis from the data I got from these charts. This can also be used to show continuous improvement in the process capabilities.
Hope this clears somebody's doubt.
Tim K 6th November 2002, 02:29 PM :bigwave:
I agree with the above discussions clarifying Ppk and Cpk.
In the original post that started this, the author stated that the data was for individual readings. Since this is the case, Cpk should not have been calculated because it represents data collected as a sub-group.
I am attaching a Capability Study Form I developed. Cpk is calculated from the sigma using R-bar/d2. Ppk is calculated from the sigma using (n-1). Cpm is calculated from the sigma using (n-1), and it is based off a Target Value instead of the mean of the tolerance.
The attached Capability Study Form may be used for:
- Bilateral or Unilateral tolerances
- Data from Sub-groups or Individuals
- Machine Capability using a specified Target Value
The above may be done in various combinations. The form includes several built-in demonstrations.
Since I already posted this at another site, I guess you need to follow this link:
http://Elsmar.com/Forums/showthread.php?s=&threadid=1994&pagenumber=3
Hope this is helpful :)
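For readers who cannot open the attachment, the Cpm mentioned above is commonly written with the (n-1) standard deviation inflated by the distance between the process mean and the target. The Python sketch below uses that commonly quoted textbook form; it is an assumption about the general formula, not a copy of what the form does, and the function name and sample numbers are made up.

```python
import math
import statistics


def cpm(data, usl, lsl, target):
    """Textbook-style Cpm: spec width over 6 * sqrt(s^2 + (mean - target)^2)."""
    s = statistics.stdev(data)       # (n-1) standard deviation
    mean = statistics.fmean(data)
    tau = math.sqrt(s ** 2 + (mean - target) ** 2)
    return (usl - lsl) / (6.0 * tau)


# Hypothetical readings with mean 10 and s = 1.
sample = [9.0, 10.0, 11.0]
print(cpm(sample, usl=13.0, lsl=7.0, target=10.0))  # on target
print(cpm(sample, usl=13.0, lsl=7.0, target=11.0))  # one sigma off target
```

When the mean sits on the target the result matches Pp; moving the target away from the mean lowers Cpm even though the spread has not changed.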
raknaja 23rd April 2006, 11:17 AM Hello everybody ^_^
Let me introduce myself. I am Yutty (I come from Thailand).
I really want to learn from the knowledge that all of you have shared in this forum. I would like to thank all of you.
New Learner :o
Yutty
Miner 24th April 2006, 09:13 AM Welcome to the cove Yutty. Did you have a specific question in this thread, or simply introducing yourself?