A Balanced View of Statistical Tests

John Predmore

Involved In Discussions
Many years ago, I discovered what I consider the Grand Unification Theorem of Statistical Tests, which demystified statistics for me. Most statistical tests, including GR&R, SPC, DOE, and the t-test, weigh the variation between subgroups against the variation you would expect from within the subgroups. I use the mental image of a balance scale, with sources of within-subgroup variation on one side and between-subgroup variation on the other. The only way to conclude there is a difference is if the between-group side is significantly heavier than the within-group side. ANOVA and regression are like a balance with multiple pans, but that is harder to visualize.
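As a rough illustration of the balance, here is a minimal Python sketch of a one-way ANOVA F ratio, which divides the between-subgroup mean square by the within-subgroup mean square. The function name `balance_ratio` and the operator readings are made up for illustration; they are not from any real study.

```python
from statistics import mean

def balance_ratio(groups):
    """Return (ms_between, ms_within, F) for a list of subgroups.

    Assumes at least two subgroups and some within-subgroup variation
    (otherwise the within-side mean square is zero and F is undefined).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total

    # Between-subgroup pan: how far each subgroup mean sits from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Within-subgroup pan: scatter of readings around their own subgroup mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between, ms_within, ms_between / ms_within

# Hypothetical readings of the same part by two operators.
operator_a = [10.1, 9.9, 10.0, 10.2]
operator_b = [10.6, 10.4, 10.5, 10.7]

ms_between, ms_within, f = balance_ratio([operator_a, operator_b])
print(f"between={ms_between:.4f}  within={ms_within:.4f}  F={f:.1f}")
```

With these made-up numbers F comes out near 30, so the between-group pan is much heavier. Whether that counts as "significantly heavier" still depends on comparing F against the appropriate F distribution for the degrees of freedom involved.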

I will use Gage R&R as an example. A GR&R is a designed experiment, and what it tells you depends on how you set it up. The mathematics is the same either way; what matters is which sources of variation are lumped on the right side of the scale and which on the left. If you structure your GR&R with subgroups of Operator A versus Operator B, then reproducibility measures inter-operator variability. In an automated operation, you could compare equipment type A versus equipment type B, or method A versus method B; in that sense, reproducibility compares A versus B even if there are no human operators. If you mark the parts and place them in the device in exactly the same orientation every time, then variation due to orientation is not captured in the analysis. If you allow part placement orientation to vary at random, then within-part variation becomes part of within-group variation. If you deliberately control the placement orientations, then within-part variation can be isolated in reproducibility. The mental image of a scale helps me visualize how to organize a statistical experiment.
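To make the point about "which pan a source of variation lands on" concrete, here is a small Python sketch. It takes one set of readings and splits it two ways: by operator and by placement orientation. The data, the 0/90 degree orientations, and the helper names `f_ratio` and `split_by` are all hypothetical, and the calculation is a plain one-way F ratio, not a full Gage R&R variance-component analysis.

```python
from statistics import mean

def f_ratio(groups):
    """One-way F ratio: between-subgroup mean square over within-subgroup mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

def split_by(records, key_index):
    """Group the measured values (field 2) by the chosen field of each record."""
    groups = {}
    for rec in records:
        groups.setdefault(rec[key_index], []).append(rec[2])
    return list(groups.values())

# Hypothetical readings: (operator, orientation in degrees, measured value).
# The made-up data has an orientation effect but no operator effect.
readings = [
    ("A", 0, 10.0), ("A", 0, 10.1), ("A", 90, 10.4), ("A", 90, 10.5),
    ("B", 0, 10.1), ("B", 0, 10.0), ("B", 90, 10.5), ("B", 90, 10.4),
]

# Subgroups = operators: orientation scatter sits on the within-group pan.
f_by_operator = f_ratio(split_by(readings, 0))
# Subgroups = orientations: the same scatter now sits on the between-group pan.
f_by_orientation = f_ratio(split_by(readings, 1))

print(f"F by operator:    {f_by_operator:.2f}")
print(f"F by orientation: {f_by_orientation:.2f}")
```

The numbers are identical in both runs; only the grouping changes. Split by operator, the orientation effect inflates the within-group side and F is essentially zero. Split by orientation, the same effect moves to the between-group side and F becomes large.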

Steve Prevette

Deming Disciple
Staff member
Super Moderator
Curious as to the reason for your post . . . Yes, indeed there are mathematical similarities there. I will, though, repeat what I heard many times from the Deming folks: SPC is NOT a statistical test but an empirical test. Some folks have gone way overboard with the Test of Hypothesis model for SPC and then get wrapped around the axle with "is the data normal" and other statements about levels of confidence (similar to the various capability indexes).

Deming vs. Statistical Hypothesis Testing
Enumerative and Analytic Studies « The W. Edwards Deming Institute Blog

John Predmore

Involved In Discussions
@Steve Prevette, my reason is that I had a novel thought to share. Years ago, this mental image helped me understand how statistical tests operate. Today, I observe that many users of statistics want to blindly plug in the numbers and have the spreadsheet or Minitab make the decision, but the proper analysis depends greatly on the source of the data and how the experiment is structured. In the analogy, that is the difference between consciously placing items on one side of the balance and not paying attention to where they land. You can take the analogy as far as it provides valuable insights, but stop when it ceases to make sense. All models are wrong; some are useful.
