I will try to explain what I am ‘seeing’ in the data for all who are reading this thread and following.

(But first, an aside: I was trained by one of the 13 founders of ASQ, Art Bender. He knew Deming and Juran. Young engineers would take their data to him and ask for help in solving quality problems. He always said, “Those are nice numbers, but there is nothing I can do. Come back when you can show me the data.” Art knew that if you could ‘show’ the data, i.e. graph it, you could ‘picture’ what was going on.

So, to the people in this world who are trying to understand a process, please turn off the calculation crank until you have an idea of what you are looking for. Only then let the math assist you in your quest.

Art did this thousands of times for the engineers and practitioners at Delco Remy. I have also done this thousands of times. You are definitely not alone, you have many good fellow quality associates with similar issues. I am always learning from this forum and other sources.)

See my attached worksheet concerning my responses.

Missing data makes the calculations slightly harder in Excel, but that can be overcome with some futzing.

You attached a ‘Process Capability’ or normal distribution chart with your second posting. It displayed (showed) data near the LSL and the USL, yet it calculated Cpk’s of 1.5 and 2.0. That just CANNOT happen in reality. I have even seen people argue that they had a Cpk of 1.33 even though they had data points outside the spec limits. That is what the math allows. So, either the math is incorrect, which is the question you originally asked, or the data is not what would be expected for the math. I probably did not answer clearly enough that the math seemed to be fine, with the exception of the missing data in the Excel calculations. To avoid this issue I also looked at the subsets that had only complete data, as well as all of the subsets.
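To make the arithmetic concrete, here is a minimal Python sketch of the usual Cpk formula, Cpk = min(USL - mean, mean - LSL) / (3 * sigma). The numbers are invented for illustration (they are NOT the data from this thread); they just show that data hugging both spec limits cannot produce a Cpk anywhere near 1.5:

```python
import statistics

def cpk(values, lsl, usl):
    """Cpk = min(USL - mean, mean - LSL) / (3 * sigma).

    Note: for simplicity this uses the overall sample standard
    deviation; a proper Cpk uses the within-subgroup sigma estimate.
    """
    mean = statistics.fmean(values)
    sigma = statistics.stdev(values)
    return min(usl - mean, mean - lsl) / (3 * sigma)

# Invented data spread almost wall-to-wall between LSL = 0.0 and USL = 1.0:
data = [0.1, 0.3, 0.5, 0.5, 0.7, 0.9]
print(round(cpk(data, 0.0, 1.0), 2))  # about 0.59, nowhere near 1.5
```

If a spreadsheet reports 1.5 or 2.0 for data like that, look for a broken range reference or missing cells in the sigma calculation, not a capable process.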

On the second tab, the “Pseudo-Run Chart” worksheet, I rearranged the data to create a ‘pseudo’ run chart. I call it ‘pseudo’ because I do not know if the samples selected, 1-5, were taken sequentially, every other part, or in some other sequence. I also do not know the number of parts between the subgroups. The data is acceptable for a control chart but may distort a run chart. When you view the run chart you can see a pattern in the data. I accentuated this pattern in the second chart with lines. It reminds me of boring processes where the tool is reset and wears quickly. Since there appear to be several ‘resets’ and then a steady decline in the measured value, I would state that there are (at least) two processes being mixed into the final outcome. In both graphs I ignored the out-of-control point. In any case, the process is out of control, or not stable. ‘Look’ at the data that you have already collected. Further math cranking will be incorrect: garbage in, garbage out.
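For anyone who wants to automate the eyeball test for that saw-tooth pattern, here is a small hypothetical Python sketch. The series and the jump threshold are invented for illustration, not taken from the attached worksheet; it simply flags the points where the value jumps back up after a decline, i.e. candidate ‘resets’:

```python
def find_resets(values, jump=0.5):
    """Return the indices where the value jumps up by more than
    `jump` relative to the previous point -- candidate tool resets
    in a reset-then-wear (saw-tooth) pattern."""
    return [i for i in range(1, len(values))
            if values[i] - values[i - 1] > jump]

# Two invented "saw-tooth" cycles: reset high, then drift steadily down.
series = [10.0, 9.8, 9.5, 9.1, 10.1, 9.9, 9.6, 9.2]
print(find_resets(series))  # index 4 is the reset point
```

A flag like this is no substitute for looking at the chart, but it can help confirm what your eyes suspect once you have already pictured the data.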

On the third tab, the “X-bar-R_Charts” worksheet, I plotted an X-bar chart and an R chart for the subgroups which contained 5 samples. I did this to keep the D4 value constant. First, you can see that X-bar values below the X-barbar line tend to hug the X-barbar line. (This may just be influenced viewing.) But on the Range chart there are definitely three peaks, and all of the other range values appear to be getting smaller. That seems to be a pattern and, therefore, another indicator of instability, with (at least) two processes being mixed into the final outcome. Why does the range within the subgroups tend to get smaller the longer the process runs?
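For reference, the control limits on those charts can be sketched in Python using the standard Shewhart constants for subgroups of n = 5 (A2 = 0.577, D3 = 0, D4 = 2.114). The subgroup numbers below are invented for illustration, not the thread’s data:

```python
# Standard Shewhart control chart constants for subgroup size n = 5.
A2, D3, D4 = 0.577, 0.0, 2.114

def xbar_r_limits(subgroups):
    """Return (LCL, center, UCL) for the X-bar chart and the R chart."""
    xbars = [sum(g) / len(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    xbarbar = sum(xbars) / len(xbars)   # grand average
    rbar = sum(ranges) / len(ranges)    # average range
    return {
        "xbar": (xbarbar - A2 * rbar, xbarbar, xbarbar + A2 * rbar),
        "range": (D3 * rbar, rbar, D4 * rbar),
    }

groups = [[9.9, 10.1, 10.0, 10.2, 9.8],
          [10.0, 10.1, 9.9, 10.0, 10.1],
          [10.2, 9.9, 10.0, 10.1, 9.8]]
limits = xbar_r_limits(groups)
print(round(limits["range"][2], 3))  # UCL on the Range chart = D4 * R-bar
```

Dropping the subgroups that were not size 5, as I did, is what keeps a single D4 (and A2) valid across the whole chart.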

The fourth tab, the “Normal Probability Chart” worksheet, is my favorite view of the data. Normal probability charts plot the measured values against the Z-score (or sigma) values. The ‘Process Capability’ or normal distribution chart you attached is at the top of the worksheet. It plots the measured values (in cell groupings) against the cumulative percentage of the data (cumulative percentages above 50% are plotted as the cumulative percentage minus 50%). This is a Gaussian curve. Data should look like the Gaussian or ‘bell’ curve if it is normally distributed. Unfortunately, it is difficult for our eyes to discern this condition on that chart. Another way is to plot the measured values against the Z-score transform of the cumulative percentage value. On that chart, normal data will appear as a straight line that one’s eyes can easily discern.
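If you want to build this chart yourself, the transform is simple. Here is a Python sketch using invented data and the common (i + 0.5)/n plotting position (other plotting-position formulas exist, and Excel’s NORM.S.INV does the same job as the inverse CDF call below):

```python
from statistics import NormalDist

def probability_plot_points(values):
    """Return (sorted value, Z-score) pairs for a normal probability plot.

    Each sorted point gets a cumulative plotting position (i + 0.5) / n,
    which is then converted to a Z-score with the inverse normal CDF.
    """
    n = len(values)
    z = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(sorted(values), z))

# Invented data; plot value (x) against Z (y).
pts = probability_plot_points([9.8, 10.0, 10.1, 9.9, 10.2])
# Normally distributed data falls (roughly) on a straight line;
# two distinct straight segments suggest two mixed processes.
```

The Z-scores come out symmetric about zero, so a single straight line through the points is exactly the ‘one normal process’ picture, and a kink or a second line is the red flag.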

On the third chart, where the outlier is ignored, you can see two distinct straight lines. This implies that at least two processes are involved. One has significantly improved capability and is tied somehow to the larger parts being produced (the ‘reset’ process?).

Hope this helps.

Joe