@Miner: The posted dataset is fake data. I generated it to prove the point that if the within-subgroup data is anti-correlated, SD_within is expected to be larger than SD_global -- assuming that the subgroups are independent. The fact that we get many subgroup averages within +/-1 Sigma is a direct consequence of this anti-correlation. However, if we see this as a shortcoming, it is easy to adapt the generation process to obtain the following dataset:
The (overall) within-subgroup standard deviation is 1.60, and the global standard deviation is 1.41. Also, 11 of the 20 subgroup SDs exceed the global SD.
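A minimal sketch of how data with this property can be generated, for anyone who wants to experiment. It is an illustration only, not the procedure behind the dataset above: the function name `simulate`, the mixing weight `a`, the seed, and the spec limits are assumptions made for the example. Each subgroup starts as n independent standard-normal values; subtracting a fraction `a` of the subgroup mean from every value leaves the deviations from the subgroup mean unchanged but shrinks the subgroup means, which makes the values within a subgroup negatively correlated (about -0.18 for a = 0.5 and n = 5). SD_within is estimated here as the pooled subgroup SD; control-chart software often uses R-bar/d2 instead.

```python
import random
import statistics


def simulate(k=20, n=5, a=0.5, seed=1):
    """Return (sd_within, sd_global) for k anti-correlated subgroups of size n."""
    rng = random.Random(seed)
    subgroups = []
    for _ in range(k):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        zbar = sum(z) / n
        # Removing part of the subgroup mean induces negative correlation
        # between the values of the same subgroup.
        subgroups.append([zi - a * zbar for zi in z])
    # Pooled within-subgroup SD (equal subgroup sizes): root of the mean variance.
    sd_within = (sum(statistics.variance(g) for g in subgroups) / k) ** 0.5
    # Global SD: all values treated as one sample.
    sd_global = statistics.stdev([x for g in subgroups for x in g])
    return sd_within, sd_global


sd_within, sd_global = simulate()
print(f"SD_within = {sd_within:.3f}, SD_global = {sd_global:.3f}")

# Hypothetical spec limits, used only to turn the two SDs into indices.
LSL, USL = -4.0, 4.0
cp = (USL - LSL) / (6 * sd_within)
pp = (USL - LSL) / (6 * sd_global)
print(f"Cp = {cp:.2f}, Pp = {pp:.2f}")
```

Setting `a=0` gives independent values, and the two SDs then agree up to sampling noise, so the gap between Cp and Pp in this sketch comes from the anti-correlation alone.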
Pp/Ppk is always less than Cp/Cpk. Just look at the damn formula. The only way that Ppk is greater than Cpk is a rare case of irrational subgrouping and sampling frequency.
Saying that the performance indices can never exceed the capability indices because of their mathematical formulas, and then saying that there are "rare cases" of exceptions, is a rather ... let's say weird ... statement. Mathematics does not work this way. Furthermore, expressions such as "Var[total] = Var[within] + Var[between]" do not apply to the case I described, because they assume that the values within a subgroup are independent.
Bev, this is not the first time you have chosen a rather harsh statement. However, I am happy to repeat myself: I am here to learn. So please, post the formulas and the assumptions -- this would certainly be helpful. However, what is not (!) helpful is to post a reference (probably to a Donald Wheeler paper) stating that the subject is well explained there, and that you don't want to repeat it here. Please make your point clear.
Bev, I would agree with you if you said that in 99% of all cases the problem of Ppk > Cpk is due to irrational subgrouping and sampling frequency.
However, anti-correlated subgroup data points are rare in industry, and if the anti-correlation is mild enough, we won't be able to detect it. E.g. in my second dataset the correlation is -16.5%, but the standard correlation test yields p = 10.1%. Hence, it is not significant at the usual 5% level and would go undetected.
Also re-reading the original post I think the OP is saying that they are calculating Cpk for every subgroup?
Excellent point. I missed that. So let's go back to the original post:
Intuitively this does not make sense to me but wondering if this is possible, at least mathematically?
I would like to understand how Ppk could be lower than the lowest Cpk.
I gave you one "reason" in my first post: If the within-subgroup data points are anti-correlated, we expect SD_within to be inflated relative to SD_global. Hence, Ppk is expected to exceed Cpk -- if the between-subgroup variance is "small". Furthermore, if it is possible for the overall Ppk to exceed the overall Cpk, then it is also possible for the overall Ppk to exceed each (individual subgroup) Cpk. I even generated a dataset. However, the comparison between the "individual subgroup Cpk" values and the overall Ppk hardly makes sense. So, while there is nothing mathematical that prevents it from happening, Ppk_overall > Cpk_each is very unusual. Thus, you should investigate such a process.