Minitab bug in GR&R ANOVA table with interaction?

Bromus

Hi everyone,

I wanted to validate my spreadsheet for the correctness of the GR&R analysis using two-way ANOVA. To do this, I entered my data into Minitab. It turned out that there were differences. At first I thought I had messed up something in the ANOVA table, but I put the same data into R and got the same results as in Excel.

After a little investigation, I concluded that there was a bug in the formulas in Minitab. When calculating the F statistic for Part and for Operator, Minitab divides by the MS calculated for the interaction (Operator*Part row in Minitab), when it should be dividing by the variance of the residuals (Repeatability row in Minitab).

This does not generally affect the calculation of the GR&R coefficient (the F value for the interaction is calculated correctly, so the decision about the significance of the interaction is correct), but it can lead to incorrect conclusions about the significance of differences between parts and between operators.
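To illustrate why the GR&R figure does not depend on the choice of F denominator, here is a minimal Python sketch of the usual ANOVA-method variance components for a balanced crossed study with the interaction term retained. The function name `variance_components` is made up for this example. The mean squares are the rounded values from the R output further down, so the results are only approximate. The counts (3 operators, 10 parts, 3 replicates) are inferred from the degrees of freedom in that table, not stated in the post.

```python
def variance_components(ms_op, ms_part, ms_int, ms_err, n_ops, n_parts, n_reps):
    """ANOVA-method variance components for a balanced crossed study.

    Negative estimates are clipped to zero, a common convention.
    No F statistic appears anywhere in these formulas.
    """
    def clip(value):
        return max(value, 0.0)

    repeatability = ms_err
    interaction = clip((ms_int - ms_err) / n_reps)
    operator = clip((ms_op - ms_int) / (n_parts * n_reps))
    part = clip((ms_part - ms_int) / (n_ops * n_reps))
    gage_rr = repeatability + operator + interaction
    total = gage_rr + part
    return {
        "repeatability": repeatability,
        "operator": operator,
        "interaction": interaction,
        "part": part,
        "gage_rr": gage_rr,
        "total": total,
        "pct_contribution_grr": 100.0 * gage_rr / total,
    }


# Rounded mean squares copied from the R table in this thread (approximate).
vc = variance_components(
    ms_op=0.0053, ms_part=0.7084, ms_int=0.0006, ms_err=0.0006,
    n_ops=3, n_parts=10, n_reps=3,  # inferred from Df = 2, 9 and 60
)
for name, value in vc.items():
    print(f"{name:>22}: {value:.6f}")
```

Because only the mean squares enter the calculation, the same variance components come out whichever error term a package uses for the Part and Operator F tests.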

Below I provide the calculations in R and in Minitab to show the differences. I will also attach the source file with the data so that you can verify that I am correct:

R results

Code:
              Df Sum Sq Mean Sq  F value   Pr(>F)  
Operator       2  0.011  0.0053    8.856 0.000427 ***
Part           9  6.376  0.7084 1178.819  < 2e-16 ***
Operator*Part 18  0.010  0.0006    0.955 0.520223  
Residuals     60  0.036  0.0006                    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


Minitab Results

[Image: screenshot of the Minitab ANOVA table, 1727099331876.png]

As you can see, the values in the F column for the Part and Operator rows differ between the two results. However, if F is calculated by dividing the MS for the part by the MS for the interaction, the result is the same as in Minitab. The same holds for the operators. This is a bug.
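The two ways of forming the F ratios can be compared side by side. The Python sketch below computes the mean squares of a balanced two-way crossed layout and then divides the Part and Operator mean squares once by the Repeatability MS (what R's default `aov` table reports) and once by the Operator*Part MS (what the Minitab table appears to report). The data are synthetic and seeded, not the data from this thread, and the function names are made up for the example.

```python
import random


def two_way_mean_squares(data):
    """data[o][p] is the list of repeat readings by operator o on part p (balanced)."""
    n_ops, n_parts, n_reps = len(data), len(data[0]), len(data[0][0])
    grand = sum(x for op in data for cell in op for x in cell) / (n_ops * n_parts * n_reps)
    op_means = [sum(x for cell in op for x in cell) / (n_parts * n_reps) for op in data]
    part_means = [sum(x for op in data for x in op[p]) / (n_ops * n_reps)
                  for p in range(n_parts)]
    cell_means = [[sum(cell) / n_reps for cell in op] for op in data]

    ss_op = n_parts * n_reps * sum((m - grand) ** 2 for m in op_means)
    ss_part = n_ops * n_reps * sum((m - grand) ** 2 for m in part_means)
    ss_int = n_reps * sum(
        (cell_means[o][p] - op_means[o] - part_means[p] + grand) ** 2
        for o in range(n_ops) for p in range(n_parts))
    ss_err = sum((x - cell_means[o][p]) ** 2
                 for o in range(n_ops) for p in range(n_parts) for x in data[o][p])

    return {
        "Operator": ss_op / (n_ops - 1),
        "Part": ss_part / (n_parts - 1),
        "Operator*Part": ss_int / ((n_ops - 1) * (n_parts - 1)),
        "Repeatability": ss_err / (n_ops * n_parts * (n_reps - 1)),
    }


def f_ratios(ms):
    """Return F ratios using each candidate denominator for the main effects."""
    err, inter = ms["Repeatability"], ms["Operator*Part"]
    vs_repeatability = {k: ms[k] / err for k in ("Operator", "Part", "Operator*Part")}
    vs_interaction = {
        "Operator": ms["Operator"] / inter,
        "Part": ms["Part"] / inter,
        "Operator*Part": inter / err,  # the interaction is tested against error either way
    }
    return vs_repeatability, vs_interaction


# Synthetic study: 3 operators x 10 parts x 3 repeats (assumed values, for illustration).
random.seed(0)
part_true = [random.gauss(10.0, 1.0) for _ in range(10)]
op_bias = [-0.05, 0.0, 0.05]
data = [[[part_true[p] + op_bias[o] + random.gauss(0.0, 0.03) for _ in range(3)]
         for p in range(10)] for o in range(3)]

ms = two_way_mean_squares(data)
f_rep, f_int = f_ratios(ms)
for row in ("Operator", "Part", "Operator*Part"):
    print(f"{row:>14}: F vs repeatability = {f_rep[row]:10.3f}, "
          f"F vs interaction = {f_int[row]:10.3f}")
```

The interaction row is identical in both columns, and the main-effect rows differ by exactly the ratio of the two candidate denominators.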
 

@Bromus This is not a bug in Minitab. You used a standard 2-way ANOVA, which assumes that the two factors are FIXED factors. In a GRR, the Parts are FIXED, but the Operators are RANDOM. This changes what is used as an error term.

You need to use GLM and specify Operators as RANDOM.

 
@Miner, thank you for the answer. This seems to resolve my confusion. Can you further explain why the parts factor can be treated as FIXED while the operators are RANDOM?
 
The Operators are treated as RANDOM because in most cases the three operators are selected at random as representative of all the operators that might use the instrument. (If those three are the only operators that will ever use the instrument, they could then be treated as FIXED.)

As for the Parts being treated as FIXED, the parts are fixed to isolate the measurement system variability. While you could say that they were randomly selected as representative of the process, in this case they are being treated as fixed artifacts that are being measured.
 
You will also want to think about when the operators can be ‘fixed’: this occurs when there is a systemic and definitive difference between their measurements. You cannot really know this until you perform the test and analysis. The operator to operator difference will show up in the graph and in the analysis of the difference. If this difference is big enough you will need to correct it. Then of course the operator difference will be small and ‘random’. I use ANOM (Analysis of Means) to determine this…

I have seen this occur with simple measuring devices such as calipers and complex devices like single use diagnostic tests…

I once got into a large disagreement with a USDA statistician and my own regulatory statistician over this. We were trying to get a one time use diagnostic test approved by the USDA. Unfortunately the QC inspectors and the person at the USDA had a built in real ‘bias’ or systemic difference in their approach (I can’t get into what the cause was although we eventually corrected it). The argument came down to whether or not the operator should be treated as Random (as explained by Miner) or Fixed. In theory the operator would be treated as random for the first analysis, but when the analysis was completed it was clear that they had a fixed difference among them. The disagreement was eventually resolved when the statisticians realized that the assumption used for their analysis was wrong. The moral of the story is that assumptions are not facts - they are requirements. If an assumption is not correct you get a wrong answer. This is one of the things that many analysts ‘step over’. For example, in a hypothesis test of the difference between two means, a p value < 0.05 could mean there is a real difference OR it could mean that one of the many assumptions has been violated; the most commonly violated assumption is that the underlying process is homogeneous…

Hope this doesn’t complicate things for your thought process…
 

Indeed, it is becoming increasingly complicated. What if, for example, a company has only two or three operators who operate a given measurement system? Will they then be a FIXED factor, since we are testing all of them and there is a strictly defined number of them?

What if I want to perform a GR&R analysis replacing the operators with measurement sockets? And let's say my machine always has 4 of them? Again, the question is, will these sockets be a FIXED or RANDOM factor? :-)
 
Good question!

The answer is no; they are still random. The answer lies not in the selection but in the nature of the variation. There will always be a mathematical difference between operators and equipment…so the question is: is the difference just random variation, or is it specifically due to a real difference in the operators or equipment? When you have a homogeneous distribution, the variation of the sample means (the average of each of the repeated samples from the distribution) will be related to the total population standard deviation divided by the square root of the sample size, n (aka the standard error of the mean). The means will be ‘randomly’ distributed about the grand average (this is the fundamental basis of all tests of means, including ANOVA, and of SPC). But if the process is not homogeneous, the means will not be distributed per the formula for the standard error of the mean. (Again, this is the basis for declaring a result out of control in SPC or declaring that the means are different in tests of means.) In other words, the spread of the means is too large to be due to simple random sampling ‘error’.
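The standard-error argument can be checked with a small simulation. The Python sketch below draws repeated samples from a single homogeneous normal distribution and compares the spread of the sample means with sigma divided by the square root of n. It then repeats the exercise with an extra per-sample shift, standing in for a real operator or equipment difference. All numbers (sigma, n, the size of the shift) are assumptions chosen for illustration.

```python
import random
import statistics

random.seed(1)

SIGMA = 2.0        # assumed population standard deviation
N = 9              # assumed sample size per mean
N_SAMPLES = 20000  # number of repeated samples

# Homogeneous case: every sample comes from the same distribution.
homogeneous_means = [
    statistics.fmean(random.gauss(10.0, SIGMA) for _ in range(N))
    for _ in range(N_SAMPLES)
]

# Non-homogeneous case: each sample carries its own real shift
# (for example, a systematic operator or equipment difference).
shifted_means = []
for _ in range(N_SAMPLES):
    shift = random.gauss(0.0, 1.0)  # assumed size of the real differences
    shifted_means.append(
        statistics.fmean(random.gauss(10.0 + shift, SIGMA) for _ in range(N))
    )

expected_se = SIGMA / N ** 0.5
observed_homogeneous = statistics.stdev(homogeneous_means)
observed_shifted = statistics.stdev(shifted_means)

print(f"sigma / sqrt(n)              : {expected_se:.4f}")
print(f"SD of means, homogeneous     : {observed_homogeneous:.4f}")
print(f"SD of means, with real shifts: {observed_shifted:.4f}")
```

In the homogeneous case the spread of the means should land close to the standard error. With real shifts added, the spread is clearly larger than the formula allows, which is the signal that the differences are not just sampling variation.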

The operators and equipment are random until proven not random.

Happy thinking
 