Standard Deviation vs. Estimated Sigma


Gary E MacLean
25th June 2007, 09:48 PM
How can I tell when I should use the Standard Deviation formula method (sum of squares) for capability determination and when I should use the estimated sigma (Rbar / d2) method? I have used both of them for a sampling of 125 readings and I get different answers, not by much but indeed different. Is there a rule of thumb to tell you when to sort data into subgroups of five samples each and when to review it as a full population?

Stijloor
25th June 2007, 11:31 PM
Hello Gary,

Welcome to The Cove!

I'm sure that other Fellow Covers will respond to your post.
In the meantime, you may want to do a search on Cpk and Ppk.
The explanation associated with these two capability indices covers the differences between the two "sigmas." At least, it's a start.

Darius
26th June 2007, 10:58 AM
What's the purpose of such an estimate? :confused: Stijloor supposed that you want to evaluate capability/performance indicators, but I am not sure.

As Don Wheeler said in one of his books (more or less), each variation estimate has a purpose for which it is useful. But remember that any estimate is only as good as its usefulness.

Gary E MacLean
26th June 2007, 12:55 PM
I am trying to evaluate the capability of a particular dimension on a small stamping for my customer's PPAP submission. The rate of the press is about 2500 per hour. We measure the part on-line at five pieces every two hours. Following the press it goes through a washer, then a stress relief and finally a shot peen process.

I pull 125 samples from a lot size of 3000 parts that have just gone through the shot peen process. Then I measure them on a comparator, collect all the data and run statistics on it. The problem I have is that one individual in this company uses Excel, which uses sum of squares, and I use five-piece sub-groups with Rbar over d2 as the foundation of my calculations. Our final determinations as far as Cpk goes are different, and they would be.

My question is: which method is considered most appropriate for PPAP capability on an existing product which already has interim PPAP approval and is shipping at the rate of about 50,000 per week? Anyone's input would be greatly appreciated.

Tim Folkerts
26th June 2007, 03:03 PM
Gary,

As I understand the situation, after finishing the whole process on the 3000 parts, you pull 125 at random from that batch. Presumably the parts are well mixed at this point, so that you have no idea when any given part was made.

If this is the case, then the idea of "rational subgroups" is lost. A control chart ideally should group parts that belong together, and plot them in time sequence. With bulk parts and bulk processing, that can often be a problem. Once you are in production, you might just pull 5 parts or so from such a 3000 piece batch and that would be one point on a control chart.

For this initial testing, you want more data from this one batch. The 125 pieces should give a pretty good indication of the capability, but I think both of your methods for calculation will in effect give Ppk.

A calculation based on the STDEV function in Excel (or similar calculations in other software) officially gives Ppk. When you have rational subgroups, then using R-bar/d2 (or the equivalent calculation for S-bar) will estimate the standard deviation within the subgroups. By randomly selecting parts from the 125 pieces, you are effectively eliminating any systematic variation between subgroups, so Rbar/d2 should be the same as the overall st dev. This means you should get (approximately) the same value for Cpk and Ppk. If they are more than a few % different (assuming I understand the situation), then something would seem to be wrong in the calculations somewhere.
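The two estimates can be sketched in a few lines of Python. This is only an illustration, assuming subgroups of five (for which the standard control-chart constant is d2 = 2.326) and readings listed in production order. The function names and the sample readings are made up for the example and do not come from the thread.

```python
import statistics

D2_N5 = 2.326  # standard control-chart constant d2 for subgroups of five


def overall_sigma(readings):
    """Sample standard deviation of all readings (the sum-of-squares method)."""
    return statistics.stdev(readings)


def within_sigma(readings, subgroup_size=5, d2=D2_N5):
    """Estimate sigma as Rbar/d2 from consecutive subgroups."""
    if not readings or len(readings) % subgroup_size != 0:
        raise ValueError("readings must split evenly into subgroups")
    subgroups = [readings[i:i + subgroup_size]
                 for i in range(0, len(readings), subgroup_size)]
    ranges = [max(g) - min(g) for g in subgroups]
    return (sum(ranges) / len(ranges)) / d2


# Hypothetical readings: two subgroups with the same spread but a shifted mean.
shifted = [10.00, 10.10, 9.90, 10.00, 10.05,
           11.00, 11.10, 10.90, 11.00, 11.05]

print("Rbar/d2 (within):", within_sigma(shifted))
print("overall st dev:  ", overall_sigma(shifted))
```

With a shift between subgroups, as in the made-up data, the overall standard deviation comes out larger than Rbar/d2. When the subgroups carry no systematic difference, the two numbers should land close together.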


I can't really answer the final question about which is more appropriate. I approach these questions mostly from an academic perspective and don't know common practice for specific industries very well.

Tim F

howste
26th June 2007, 04:40 PM
My initial thought is this: R-bar/d2 bases its estimate on only 2 of the 5 samples in each subgroup (the high and low). Because of this, the estimate will be less descriptive of the population variation than if you use the sample standard deviation.
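One way to get a feel for how much information the range gives up is a small simulation. The sketch below assumes a stable, normally distributed process with a true sigma of 1.0, draws 125 readings many times, and compares how much each estimate scatters from run to run. The function names, seed and trial count are arbitrary choices for illustration.

```python
import random
import statistics

D2_N5 = 2.326  # standard control-chart constant d2 for subgroups of five


def rbar_over_d2(readings, subgroup_size=5, d2=D2_N5):
    """Average subgroup range divided by d2."""
    ranges = [max(readings[i:i + subgroup_size]) - min(readings[i:i + subgroup_size])
              for i in range(0, len(readings), subgroup_size)]
    return statistics.mean(ranges) / d2


def compare_estimators(trials=2000, n=125, sigma=1.0, seed=1):
    """Repeat the 125-piece study many times on simulated in-control data."""
    rng = random.Random(seed)
    range_based, stdev_based = [], []
    for _ in range(trials):
        data = [rng.gauss(0.0, sigma) for _ in range(n)]
        range_based.append(rbar_over_d2(data))
        stdev_based.append(statistics.stdev(data))
    return {
        "range_mean": statistics.mean(range_based),
        "range_spread": statistics.stdev(range_based),
        "stdev_mean": statistics.mean(stdev_based),
        "stdev_spread": statistics.stdev(stdev_based),
    }


result = compare_estimators()
print(result)
```

Under these assumptions both estimates centre near the true sigma, and the range-based one scatters somewhat more from run to run, which is consistent with the point above. For subgroups of five the penalty is modest rather than dramatic.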

Gary E MacLean
26th June 2007, 04:51 PM
Sorry, I did neglect to explain that I take the parts from the exit chute on the stamping machine and mark them sequentially, with a black marker, in order of production. Then I scribe the numbers on them so the numbers do not get lost in the shot peen process. When I remove my parts from the 3000 shot peen samples I do know their order of production from the press.

The parts go immediately to a washing operation (lose order of sequence) then a stress relief (another bulk process with no sequential control) then to shot peen. However, they carry the numbers I scribed into them all the way through the process.

Let me throw one more variable into the mix. I do this all by hand. By that I mean a four function calculator and a piece of paper. Now does that little fact sway any of you toward 25 sub-groups of five samples each?

Also, the PPAP manual defines the capability study in terms of 25 sub-groups. This is perhaps the single most influential fact in my decision. I just researched a little deeper into PPAP and I may have found my answer anyway. It says something to the effect of:

1) Use Cpk (with Rbar/d2) when there is enough historical or current data and the process is in control.
2) Use Ppk (with root mean square equation) if the process is chronically unstable yet meeting specification.
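Written out, the two indices differ only in which sigma estimate goes in the denominator. The notation below is a sketch under stated assumptions: USL and LSL stand for the upper and lower specification limits, N is the total number of readings, and the "root mean square equation" is taken here to mean the ordinary sample standard deviation.

```latex
\hat{\sigma}_{\text{within}} = \frac{\bar{R}}{d_2},
\qquad
\hat{\sigma}_{\text{overall}} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}}

C_{pk} = \frac{\min\left(\mathrm{USL} - \bar{x},\; \bar{x} - \mathrm{LSL}\right)}{3\,\hat{\sigma}_{\text{within}}},
\qquad
P_{pk} = \frac{\min\left(\mathrm{USL} - \bar{x},\; \bar{x} - \mathrm{LSL}\right)}{3\,\hat{\sigma}_{\text{overall}}}
```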

Thanks again everyone.




Thank you all for the very useful input.

Gary M

howste
26th June 2007, 05:03 PM
Do you want to borrow my slide rule? :lol:

Is there a reason you're not using software or a spreadsheet to calculate? I've done the calculations by hand many times, and not only does it take a huge amount of time to calculate, but it's also much more likely to have errors in results.

Gary E MacLean
26th June 2007, 05:12 PM
Thanks for the offer, howste, but my four function is probably a bit faster. Several reasons I do it by hand: small company, slight resources, no SPC software, don't like the rigidity of Excel, I like the hands-on feel, keeps me busy :lol: yeah, right!

Tim Folkerts
27th June 2007, 10:14 AM
...they carry the numbers I scribed into them all the way through the process.

Then I take back some of what I said - at least as it applied to you.

Using subgroups DOES make sense. Cpk MIGHT be better than Ppk (if there is variation during the run).

I do this all by hand. By that I mean a four function calculator and a piece of paper.

If that is the case, then using subgroups and ranges makes sense, since it is easier. I would strongly recommend switching to a computer. Presumably you record each data point anyway, so why not use a computer? Setting up equations in Excel to find the ranges and then to calculate R-bar/d2 is simple. This avoids the possibility of calculation errors along the way.
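The same bookkeeping is also short in a script. The sketch below is in Python rather than Excel, purely as an illustration: it splits the readings into subgroups in production order, finds the ranges, and reports both Cpk (from R-bar/d2) and Ppk (from the overall standard deviation). The function name, the specification limits and the readings are hypothetical; the d2 values are the standard control-chart constants.

```python
import statistics

# Standard d2 constants, keyed by subgroup size.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534}


def capability(readings, lsl, usl, subgroup_size=5):
    """Return the mean, both sigma estimates, and Cpk / Ppk for the readings."""
    if not readings or len(readings) % subgroup_size != 0:
        raise ValueError("readings must split evenly into subgroups")
    subgroups = [readings[i:i + subgroup_size]
                 for i in range(0, len(readings), subgroup_size)]
    r_bar = statistics.mean([max(g) - min(g) for g in subgroups])
    x_bar = statistics.mean(readings)
    sigma_within = r_bar / D2[subgroup_size]    # R-bar/d2
    sigma_overall = statistics.stdev(readings)  # sum-of-squares method
    nearest = min(usl - x_bar, x_bar - lsl)     # distance to the closer limit
    return {
        "x_bar": x_bar,
        "sigma_within": sigma_within,
        "sigma_overall": sigma_overall,
        "cpk": nearest / (3 * sigma_within),
        "ppk": nearest / (3 * sigma_overall),
    }


# Hypothetical data: two subgroups of five, with made-up limits of 9.0 and 12.0.
readings = [10.00, 10.10, 9.90, 10.00, 10.05,
            11.00, 11.10, 10.90, 11.00, 11.05]
res = capability(readings, lsl=9.0, usl=12.0)
for name, value in res.items():
    print(f"{name}: {value:.4f}")
```

For a real study the list would hold all 125 readings in the order they came off the press, giving 25 subgroups of five.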


Tim F