AIAG SAMPLE DATA SET FOR CAPABILITY
Appendix F of the AIAG SPC book contains a sample data set and calculations for capability. AIAG generates a series of conclusions from this data set after some statistical analysis. Unfortunately, there is some missing information, as well as unsubstantiated conclusions, that renders the resulting conclusions invalid under certain conditions. One such condition is precision machining.
AIAG starts off by listing some of the assumptions they are going to test. First, they want to ensure: “The process from which the data come is statistically stable, that is, the normally accepted SPC rules must not be violated.” Second, “The individual measurements from the process data form an approximate normal distribution.” Approximate is the key word here. It is certainly open for interpretation. Third: “A sufficient number of parts must be evaluated in order to capture the variation that is inherent in the process.” They recommend 125 individual parts. Finally: “The specifications are based on customer requirements.”
In order to adequately analyze the data, more assumptions must be stated. The first thing AIAG did was collect and report some data. We can assume that the data was taken in order, and that no adjustments, etc., took place during the collection. We also know that the data is for a diameter. What we do not know (and must, in order for the conclusions to be correct) is what the process was, what the measurement error contribution was, and how the measurement was taken. For this analysis we will assume the process was precision machining (e.g. grinding or CNC turning). We must also assume this diameter was a full diameter, and the measurement was taken at random locations about the diameter. As such, for each part they collected one of an infinite number of diameters, ignoring error from roundness. It is important to note that the measurement error (variation from ignoring roundness by only taking one point) and gage error (contribution of variation from gage R&R) can mask any underlying process distribution with their predominantly normal distributions.
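The masking effect described above can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the wear span, the error sigma, and the function names are hypothetical and are not taken from the AIAG data. It compares the excess kurtosis (about -1.2 for a uniform distribution, 0 for a normal) of the true sizes against the same sizes with normal measurement error added.

```python
import random
import statistics


def excess_kurtosis(values):
    """Population excess kurtosis: 0 for a normal, about -1.2 for a uniform."""
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / len(values)
    m4 = sum((v - mean) ** 4 for v in values) / len(values)
    return m4 / m2 ** 2 - 3.0


def simulate(n=20000, wear_span=1.0, error_sigma=0.5, seed=1):
    """Return (true_sizes, measured_sizes) for a hypothetical process.

    true_sizes follow a continuous uniform distribution over wear_span.
    measured_sizes add normally distributed measurement/gage error.
    All parameter values are invented for illustration.
    """
    rng = random.Random(seed)
    true_sizes = [rng.uniform(0.0, wear_span) for _ in range(n)]
    measured_sizes = [t + rng.gauss(0.0, error_sigma) for t in true_sizes]
    return true_sizes, measured_sizes


true_sizes, measured_sizes = simulate()
print("true sizes    :", round(excess_kurtosis(true_sizes), 2))
print("measured sizes:", round(excess_kurtosis(measured_sizes), 2))
```

With error that is large relative to the wear span, the measured values look close to normal even though the underlying process is flat, which is the risk the paragraph above points out.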
DETERMINING THE DISTRIBUTION
AIAG plots a histogram of the data and a normal probability plot, and uses these visual tools as verification of the “near normality” of the data set (Figure 1). (1)
Before ever plotting histograms, one would plot a run chart to look for any evidence of dependency, autocorrelation, etc. However, unless the data is taken correctly (i.e., both the highest and lowest diameter of the feature recorded), the measurement error will mask such issues. Therefore, a run chart is not of much value with this data set. The next recommended step would be to run a curve-fitting analysis with a tool such as Distribution Analyzer. The results are below (Figure 2).
From this analysis we can see that the best-fit distribution is similar to the normal distribution, but it fits the data far better than the normal curve does, indicating that there is some limitation or risk in using the normal curve as a model for this data.
AIAG then prepares an X bar/R chart from the data (Figure 3). (2) They assume a subgroup size of 5, even though we have no evidence that 5 data points hold any statistical significance. Their results show all points in control, satisfying their assumptions.
However, if you treat the points as individual points and use an X-MR chart, you will find points out of control (Figure 4).
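To see why the two charts can disagree, it helps to compute both sets of limits from the same series. The sketch below uses the standard Shewhart constants for a subgroup size of 5 and a moving range of 2. The function names and the short demo series are hypothetical, and the demo values are not the AIAG data.

```python
import statistics

# Standard control chart constants.
A2, D3, D4 = 0.577, 0.0, 2.114   # X bar/R chart, subgroup size 5
E2, D4_MR = 2.660, 3.267         # X-MR chart, moving range of 2


def xbar_r_limits(data, n=5):
    """Return (LCL, center, UCL) for the X bar and R charts (n must be 5)."""
    subgroups = [data[i:i + n] for i in range(0, len(data) - n + 1, n)]
    means = [statistics.fmean(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    xbarbar = statistics.fmean(means)
    rbar = statistics.fmean(ranges)
    return {
        "xbar": (xbarbar - A2 * rbar, xbarbar, xbarbar + A2 * rbar),
        "range": (D3 * rbar, rbar, D4 * rbar),
    }


def xmr_limits(data):
    """Return (LCL, center, UCL) for the individuals and moving range charts."""
    moving_ranges = [abs(b - a) for a, b in zip(data, data[1:])]
    xbar = statistics.fmean(data)
    mrbar = statistics.fmean(moving_ranges)
    return {
        "x": (xbar - E2 * mrbar, xbar, xbar + E2 * mrbar),
        "mr": (0.0, mrbar, D4_MR * mrbar),
    }


def points_outside(values, lcl, ucl):
    """Indices of values falling outside the given limits."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


# Made-up demo series, only to show the calls.
demo = [1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0, 4.0, 5.0]
print(xbar_r_limits(demo))
print(xmr_limits(demo))
```

The X bar chart judges subgroup averages against the within-subgroup range, while the X-MR chart judges each reading against the average point-to-point change, so the same data can pass one and fail the other.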
At this point I am not going to address the difference between using a chart with arbitrary sample size versus treating the points as the independent samples that they are, but rather consider the impact of measurement error on the resulting data. The data supplies one diameter from each part – one out of an infinite number. There is no attempt to determine or contain the variation that the roundness can create and therefore its impact on the conclusions. We will take 5 consecutive data points and determine their highest and lowest diameter values. In this case, the use of 5 specimens is adequate, since the error from roundness should readily be apparent in 5 consecutive samples. In fact, when precision machining, it is likely to be the only variation detected in 5 consecutive parts, if the gage R&R is adequate. From that, we will plot the X hi/lo-R chart (3) to see what information it can provide (Figure 5).
What we see from this chart is that there is a positive slope in the data, indicating that the data may very well not be independent, and that it could possibly be a continuous uniform distribution from tool wear. For a positive slope, we would expect the dimension to be an OD, or the process would be deemed out of control. The tool wear rate appears to fall between .001 and .005 per part. The roundness can be estimated to be about .42, or 21% of the tolerance. That is significant, although not apparent from the original AIAG analysis. Again, ideally, for a diameter, the high and low data should be taken for each part, not just one diameter. Taking one diameter dramatically reduces the validity of the conclusion by masking the data with measurement error.
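The grouping step described above can be sketched in a few lines. This is not the X hi/lo-R procedure from the cited book, only a rough illustration of the idea: take the highest and lowest reading in each run of 5 consecutive parts, treat the average spread as a stand-in for roundness, and fit a line through the group midpoints to estimate a wear rate per part. The function names, the averaging choice, and the demo series are all assumptions.

```python
def hi_lo_groups(data, n=5):
    """Return (highest, lowest) for each run of n consecutive readings."""
    groups = [data[i:i + n] for i in range(0, len(data) - n + 1, n)]
    return [(max(g), min(g)) for g in groups]


def least_squares_slope(ys):
    """Slope of an ordinary least-squares line through (0, y0), (1, y1), ..."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


def summarize(data, n=5):
    """Return (roundness_estimate, wear_per_part); needs at least two groups."""
    pairs = hi_lo_groups(data, n)
    spreads = [hi - lo for hi, lo in pairs]
    midpoints = [(hi + lo) / 2 for hi, lo in pairs]
    roundness_estimate = sum(spreads) / len(spreads)
    # The fitted slope is per group of n parts, so divide to get a per-part rate.
    wear_per_part = least_squares_slope(midpoints) / n
    return roundness_estimate, wear_per_part


# Made-up series: a repeating out-of-round pattern plus steady drift of .002 per part.
pattern = [0.0, 0.4, 0.1, 0.3, 0.2]
demo = [pattern[i % 5] + 0.002 * i for i in range(20)]
print(summarize(demo))
```

With true hi/lo readings on every part, the grouping step would be unnecessary and the spread would be a direct roundness measurement rather than an estimate.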
CAPABILITY CONCLUSIONS
AIAG chooses to make assumptions based on missing data and weak verification of normality to support their standard Cp and Cpk calculations.
1. They conclude the process is well centered. That conclusion is irrelevant, as the data is not normal and should not be normal if it comes from a machining operation. The only thing that matters is that all of the data falls within 75% of the tolerance.(4) Those control limits are based on the appropriate statistical distribution for the data – the continuous uniform distribution.
2. All indices are relatively high, indicating near-zero nonconformances. That is based on the assumption of normality. For the continuous uniform distribution found in precision machining, as long as the hi/lo values fall within the control limits (75% of the tolerance) there are zero nonconformances. That is because there are no tails on that distribution!
3. Since Cp and Pp are approximately equal, it implies minimal between-subgroup variation. That issue is irrelevant in machining, as we expect between-subgroup variation at a rate that is representative of the tool wear rate.
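For reference, the indices discussed in this list can be computed as follows. This is a minimal sketch using the usual textbook definitions: the within-subgroup sigma is estimated as R bar divided by d2 (2.326 for subgroups of 5) for Cp and Cpk, and the overall sample standard deviation is used for Pp and Ppk. The function name, the specification limits, and the demo series are hypothetical. The demo uses a steadily drifting series to show how drift between subgroups pulls Pp below Cp.

```python
import statistics

SUBGROUP_SIZE = 5
D2 = 2.326  # standard d2 constant for subgroups of 5


def capability_indices(data, lsl, usl):
    """Return Cp, Cpk, Pp and Ppk for data split into consecutive subgroups of 5."""
    n = SUBGROUP_SIZE
    subgroups = [data[i:i + n] for i in range(0, len(data) - n + 1, n)]
    rbar = statistics.fmean([max(g) - min(g) for g in subgroups])
    mean = statistics.fmean(data)
    sigma_within = rbar / D2                # short-term (within-subgroup) estimate
    sigma_total = statistics.stdev(data)    # overall sample standard deviation
    nearest = min(usl - mean, mean - lsl)   # distance to the closer spec limit
    return {
        "Cp": (usl - lsl) / (6 * sigma_within),
        "Cpk": nearest / (3 * sigma_within),
        "Pp": (usl - lsl) / (6 * sigma_total),
        "Ppk": nearest / (3 * sigma_total),
    }


# Made-up drifting series: each part is one unit larger than the last.
drifting = [float(i) for i in range(25)]
print(capability_indices(drifting, lsl=-30.0, usl=54.0))
```

In the drifting example Cp comes out well above Pp, because the within-subgroup range only sees a small slice of the wear that the overall standard deviation captures.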
The real conclusions from the data are:
1. The process is in control, as all hi/lo values fall within 75% of the tolerance.
2. The tool wear rate is approximately .001 to .005 per piece. Worst case number of parts per adjustment: 215 parts (assuming no tool breakage and tool wear rate remains constant). More accurate data could be calculated if each part had hi/lo measurements.
3. It is not a normal distribution, nor would it be expected to be. Normal data for a process with tool wear means the process is out of control, with excessive measurement, gage or machine variation masking the true underlying process variation of tool wear.
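The article does not show how the parts-per-adjustment figure was derived. One plausible reconstruction, offered only as an assumption, is to take the 75% control band, subtract the roundness that is always present, and divide what remains by the wear rate. The function name and the formula are hypothetical. The inputs are the figures quoted above (roundness of .42 at 21% of the tolerance, wear of .001 to .005 per piece).

```python
def parts_per_adjustment(tolerance, roundness, wear_per_part, band_fraction=0.75):
    """Parts that can run before wear uses up the control band.

    Assumed model: usable band = band_fraction * tolerance - roundness,
    consumed at a constant wear_per_part with no tool breakage.
    """
    usable_band = band_fraction * tolerance - roundness
    return usable_band / wear_per_part


roundness = 0.42
tolerance = roundness / 0.21   # roundness is quoted as 21% of the tolerance

worst_case = parts_per_adjustment(tolerance, roundness, wear_per_part=0.005)
best_case = parts_per_adjustment(tolerance, roundness, wear_per_part=0.001)
print(round(worst_case), round(best_case))
```

Under these assumptions the worst case comes out at roughly 216 parts, close to the 215 quoted, which suggests the original calculation used slightly less rounded inputs.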
Now, it is true that some assumptions had to be made to make these conclusions. But, had the data been taken correctly (hi/lo data for each part), the conclusions would have had much stronger validity.
(1) Automotive Industry Action Group (AIAG). Statistical Process Control (SPC). Automotive Industry Action Group (AIAG), 2005, p 187.
(2) Ibid, p 188.
(3) Doering, Robert G. CorrectSPC - When 'normal' is not typical (A practical guide for the statistical control of precision machining processes). LaGrange, OH: Tall Order Services, 2007.
(4) Ibid.