To expand on the last comment I made: every test should actually test the function it is intended to assess. That doesn’t always happen. This step is a critical part of test method validation. It’s not the only one, but it is essential.

A real-life example that I’ve used here before: the inner diameter of a ‘tube’ was being measured by a 3 point ID tool. Even though every tube passed, there were failures in the field. We performed a repeatability MSA using 30 parts measured twice. It failed. We then used an instrument that utilized air pressure. It passed perfectly. We were suspicious. So we measured the parts using a CMM, measuring the diameter multiple times around the ID. The average diameter from the CMM also passed the MSA, as we suspected it would. BUT that isn’t why we used the CMM; we wanted to map the shape of the inner diameter. It showed us that the ID was tri-lobular. This is why the air gage passed and the 3 point device didn’t: the measured ID from the 3 point device depended on the orientation of the 3 points in the ID. The machinists ‘knew’ this, by the way, and would fiddle with the measurement device to get a passing answer. (I am NOT saying this was their fault at all.) The tri-lobular pattern was created because we were using a 3 jaw chuck to hold and machine the parts.

Now this might have been OK depending on the function of the tube. If it was to direct the flow of air or water, the area of the ID would be what matters, and the air gage would have been appropriate. The difference between the smallest ID and largest ID in the tri-lobular pattern may even have been small enough to allow for using the 3 point gage, taking multiple measurements and using the average. However, the ‘tube’ was really an axle that had a press fit bearing installed in it. The minimum ID was the critical feature, not the area or average of the opening.
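To make the geometry concrete, here is a rough Python sketch of an idealized tri-lobular bore. Everything in it is an assumption for illustration: the nominal radius, the lobe amplitude, the cosine-shaped profile, and the function names are all made up, not taken from the actual parts. The 3 point tool is modeled as the circle through three contacts spaced 120 degrees apart, the averaged reading (a crude stand-in for the air gage or the CMM average) as the mean diameter around the bore, and the minimum ID as twice the smallest radius.

```python
import math

# All numbers are hypothetical, chosen only to illustrate the geometry.
NOMINAL_RADIUS = 10.0   # mm
LOBE_AMPLITUDE = 0.02   # mm, size of the three-lobe distortion


def bore_radius(theta):
    """Radius of an idealized tri-lobular bore at angle theta (radians)."""
    return NOMINAL_RADIUS + LOBE_AMPLITUDE * math.cos(3 * theta)


def bore_point(theta):
    """Cartesian point on the bore wall at angle theta."""
    r = bore_radius(theta)
    return (r * math.cos(theta), r * math.sin(theta))


def three_point_reading(orientation):
    """Diameter of the circle through three contacts spaced 120 degrees apart."""
    p1, p2, p3 = (bore_point(orientation + k * 2 * math.pi / 3) for k in range(3))
    a, b, c = math.dist(p2, p3), math.dist(p1, p3), math.dist(p1, p2)
    area = abs((p2[0] - p1[0]) * (p3[1] - p1[1])
               - (p3[0] - p1[0]) * (p2[1] - p1[1])) / 2
    return a * b * c / (2 * area)  # circumscribed-circle diameter


def average_diameter(samples=360):
    """Mean diameter around the bore (crude stand-in for an averaged reading)."""
    angles = [2 * math.pi * k / samples for k in range(samples)]
    return sum(2 * bore_radius(t) for t in angles) / samples


def minimum_diameter(samples=360):
    """Twice the smallest sampled radius: what a press fit bearing has to pass."""
    return 2 * min(bore_radius(2 * math.pi * k / samples) for k in range(samples))


if __name__ == "__main__":
    for degrees in (0, 15, 30, 45, 60):
        reading = three_point_reading(math.radians(degrees))
        print(f"3 point reading at {degrees:2d} deg: {reading:.4f} mm")
    print(f"average diameter: {average_diameter():.4f} mm")
    print(f"minimum diameter: {minimum_diameter():.4f} mm")
```

In this toy model the 3 point reading swings with the orientation of the tool while the averaged diameter stays put, which is the same pattern as the failed and passed MSAs above. Neither number tells you the minimum ID unless you go looking for it.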
There is another aspect of your question that needs to be addressed: what was your study design? Were your two sets of data from two sets of DIFFERENT parts or from the same parts? And what kind of variation does this process experience? Depending on the answers, there may be a different approach to validating the repeatability of the measurement process.
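For the case where the two data sets are repeat measurements of the SAME parts, here is a minimal sketch of one common way to estimate repeatability from the paired differences. It assumes each part was measured exactly twice and did not change between measurements; the function name is hypothetical. If the two sets came from different parts, this calculation does not apply, because the part-to-part variation would be mixed into the differences.

```python
import math


def repeatability_sd(first, second):
    """Within-part standard deviation from duplicate measurements.

    Assumes first[i] and second[i] are two measurements of the same part.
    For duplicates, the pooled within-part variance is sum(d^2) / (2 * n),
    where d is the difference between the two measurements of each part.
    """
    if len(first) != len(second) or not first:
        raise ValueError("need two equal-length, non-empty measurement lists")
    squared_diffs = sum((a - b) ** 2 for a, b in zip(first, second))
    return math.sqrt(squared_diffs / (2 * len(first)))
```

Called as `repeatability_sd(first_readings, second_readings)`, it returns an estimate of the measurement error in the same units as the data. How that number is judged (against the tolerance, against the process variation, or both) depends on the study design, which is why the questions above matter.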