# Type 1 study failed

#### Jayfaas

##### Involved In Discussions
Hello. I have a pitch diameter gauge here with 0.001 mm resolution and a ±0.010 mm tolerance. The reference value for the Type 1 study was 48.002 mm. I got 48.002 six times and 48.003 nineteen times. I have met the resolution requirement (5% of total tolerance) and met the Cg requirement. The %EV was 13.08%, but the Cgk was only 0.95. According to our internal requirements, it needs to be at least 1.00. Any idea why this would fail with only 1 µm of variation?

#### Semoi

##### Involved In Discussions

The Cgk value consists of two parts:
1. the Cg value, which is a measure of precision (repeatability). In your example, Cg = 1.529.
2. the bias contribution |xbar - xRef|/(3*SD). In your example this is 0.581.
The second part is subtracted from the first. Thus, we obtain 1.529 - 0.581 = 0.948.
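The arithmetic above can be reproduced directly from the readings reported in the first post. A minimal sketch (using the common Cg = 0.2·T/(6·SD) convention, which matches the numbers quoted in this thread):

```python
import statistics

# Raw Type 1 study data from the thread: 25 repeat measurements of a
# 48.002 mm reference (values in mm).
readings = [48.002] * 6 + [48.003] * 19
x_ref = 48.002
T = 0.020  # total tolerance, mm (spec is +/- 0.010 mm)

xbar = statistics.mean(readings)
s = statistics.stdev(readings)           # sample SD (n-1 denominator)

cg = (0.2 * T) / (6 * s)                 # precision-only index
bias_term = abs(xbar - x_ref) / (3 * s)  # bias contribution
cgk = cg - bias_term                     # equivalently (0.1*T - |bias|)/(3*s)

print(f"Cg   = {cg:.3f}")         # ~1.529
print(f"bias = {bias_term:.3f}")  # ~0.581
print(f"Cgk  = {cgk:.3f}")        # ~0.948
```

Note that the bias of 0.76 µm comes entirely from the mean sitting 19/25 of a resolution step above the reference, which is why a "1 µm variation" study can still fail on Cgk.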

#### Jayfaas

##### Involved In Discussions
Right, but why with such a small variation? Is it because I hit one micron above the reference more often than the actual reading? I ended up going with 0.0001 mm resolution and the study passed, but we only use the gauge at 0.001 mm resolution on our production floor. It seems like cheating to do the study at the higher resolution to get it to pass, but then use the lower resolution in production.

#### Bev D

##### Heretical Statistician
Super Moderator
Well, because %EV = EV/T is hokey-pokey math. Measurement error, part variation, and observed variation follow the rules of vector math (think of a right triangle: the legs and the hypotenuse). The tolerance range lies on the same vector as the observed variation. So in real life, observed variation = square root of (measurement variation squared + product variation squared). THINK about it: measurement error cannot simply be subtracted from the observed variation to get the amount of real product variation, because you are dealing with standard deviations… EV/T will over-estimate the contribution of measurement error. And it won't make sense, because it is mathematically wrong.
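The right-triangle point can be made concrete with a small sketch (the SD values below are hypothetical, chosen only to illustrate the quadrature relationship):

```python
import math

# Hypothetical standard deviations (same units) illustrating that
# variances add in quadrature, not linearly.
sd_product = 4.0   # true product variation
sd_meas = 1.0      # measurement (gage) variation

# The observed SD is the hypotenuse of the right triangle:
sd_observed = math.sqrt(sd_product**2 + sd_meas**2)

# Ratioing SDs linearly says measurement error "uses up" ~24% of what
# we observe; in variance terms its true share is only ~5.9%.
linear_share = sd_meas / sd_observed
variance_share = sd_meas**2 / sd_observed**2

print(f"observed SD    = {sd_observed:.3f}")
print(f"linear share   = {linear_share:.1%}")
print(f"variance share = {variance_share:.1%}")
```

This is why a ratio of standard deviations (EV/T) overstates how much of the observed spread the measurement system is responsible for.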

#### Jayfaas

##### Involved In Discussions
Is there another method to use? It seems bad to be using broken math for studies, especially if a study can fail with such low variation on a single part. What could be done to increase the chances of this passing? This is one thing about MSAs that I still fail to grasp. I had the same issue with a tape measure, and with a grease scale with 0.1 g resolution: I used the 200 g weight that came with it for the Type 1 study and had zero variation, yet it fails.

#### Semoi

##### Involved In Discussions
It certainly makes sense to check the assumptions of the Type 1 analysis -- namely, that the random error follows a normal distribution. This is certainly not the case if the error is either zero or one resolution step: the residual is discrete, not continuous. Also note that the standard normality check -- the Anderson-Darling hypothesis test -- is sensitive to discreteness. Thus, this test is too conservative, and you should probably use something like the Jarque-Bera test, but increase the critical p-value to 10%-20%. For your dataset the p-value is 4%.
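The 4% p-value can be reproduced from the posted readings. A minimal sketch of the Jarque-Bera statistic in plain Python (the chi-square(2) survival function reduces to exp(-JB/2)):

```python
import math

# The 25 readings from the thread, in mm.
readings = [48.002] * 6 + [48.003] * 19
n = len(readings)

# Central sample moments (n denominator).
mean = sum(readings) / n
m2 = sum((x - mean) ** 2 for x in readings) / n
m3 = sum((x - mean) ** 3 for x in readings) / n
m4 = sum((x - mean) ** 4 for x in readings) / n

skew = m3 / m2 ** 1.5
kurt = m4 / m2 ** 2

# Jarque-Bera statistic, approximately chi-square with 2 df under H0.
jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
p_value = math.exp(-jb / 2)  # chi-square(2) survival function

print(f"JB = {jb:.2f}, p = {p_value:.3f}")  # p ~ 0.04
```

So even Jarque-Bera rejects at 5% here, which is why relaxing the critical p-value for heavily discretized data is suggested above.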

What can you do to ensure a successful qualification?
I like to get a feeling for the numbers I am trying to achieve. To do so, I perform simple simulations. Using your results, i.e. SD = 0.43 µm and bias = 0.7 µm, I run a simulation by drawing random numbers from a normal distribution. The result clearly shows that in order to achieve a successful qualification (with a success probability >= 80%), we need far too many measurements (approx. N > 4000). Usually I would assume a contaminated normal distribution; however, this further increases the number of runs. I also recommend performing a sensitivity analysis -- i.e., how sensitive is the sample size to the input parameters (SD and bias)? In your case it is very sensitive. And if I add your final resolution of 1 µm, the needed sample size exceeds 5000 -- note that I assume a success probability of 80%.
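A minimal sketch of this kind of simulation (my own reconstruction, not the original code; exact pass rates depend on the seed, the trial count, and the assumed parameters). It draws simulated Type 1 studies from a normal distribution with the thread's estimates and shows how sensitive the probability of Cgk >= 1.00 is to the assumed bias:

```python
import random
import statistics

random.seed(1)

SD, T, N = 0.43, 20.0, 25  # micrometres; N = readings per study

def cgk(xs):
    """Cgk of one simulated study (readings expressed as deviations, um)."""
    s = statistics.stdev(xs)
    return (0.1 * T - abs(statistics.mean(xs))) / (3 * s)

def pass_rate(bias, trials=5000):
    """Monte Carlo estimate of P(Cgk >= 1.00) for a given true bias."""
    hits = sum(
        cgk([random.gauss(bias, SD) for _ in range(N)]) >= 1.00
        for _ in range(trials)
    )
    return hits / trials

for bias in (0.4, 0.6, 0.76):
    print(f"bias = {bias:.2f} um -> P(Cgk >= 1.00) ~ {pass_rate(bias):.2f}")
```

A few tenths of a micron of bias swing the pass probability dramatically, which is the sensitivity point made above.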

Are there other concepts?
The concept you currently use for validating a gauge is common in many industries, but other concepts exist. E.g., Donald Wheeler proposed (at least) two: the first focuses on the probability of detecting trends using SPC charts. The second assumes that the manufacturing process is well-centered within the specification limits and then estimates the probability that a part measured to be "out of spec" is actually within specification. I like this idea. Unfortunately, I am unable to use it, because our specification limits are always extremely tight and our manufacturing processes are not stable -- although many people have tried to stabilise them. Thus, we need the measurement result to adjust the manufacturing process.

#### Bev D

##### Heretical Statistician
Super Moderator
Is the part you are measuring a real part or a reference part?

One thing to remember about SD estimates is that when you have very little variation you will get an over-estimate of the SD due to the sparse variation.
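A small illustration of this point (my own example, not from the thread): when the true repeatability is smaller than the gauge resolution, the recorded SD depends heavily on where the true value falls relative to the resolution grid, and can be badly inflated.

```python
import random
import statistics

random.seed(2)

RES = 1.0      # gauge resolution, um
TRUE_SD = 0.2  # true repeatability, um (smaller than the resolution)

def recorded_sd(true_mean, n=10_000):
    # Round each simulated reading to the nearest resolution step,
    # as the gauge display would.
    xs = [RES * round(random.gauss(true_mean, TRUE_SD) / RES) for _ in range(n)]
    return statistics.stdev(xs)

print(recorded_sd(0.0))  # true value on a grid line: recorded SD well below 0.2
print(recorded_sd(0.5))  # true value between grid lines: recorded SD ~0.5, inflated
```

Neither recorded SD is close to the true 0.2 µm, which is the "lots of math, little insight" problem with resolution-limited Type 1 data.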

At this point I would recommend doing a full MSA with about 30 parts and 2X measurements of each part. I'm not a fan of Type 1 studies: there is no part variation, and with a very high resolution gage it can result in what you have, a lot of math and very little insight… and that is the point of any MSA - to understand your measurement system in relation to the thing you are measuring. It's not about passing some 'bright line' with mathematical jabberwocky. Take a look at my resource "Statistical Alchemy"; it might give you some ideas…

#### Jayfaas

##### Involved In Discussions
I do run my own made-up tests occasionally. I am told our requirements for studies are Type 1, Type 2/3, and linearity. Long term can be MSA studies or SPC (which unfortunately is not well implemented yet). I'm just wondering if there are any settings within the study that can be changed, or different methods, to give a reasonable but still compliant result. Obviously I can't do 5000 samples. What's really unfortunate is that I'm finding these issues with the simplest measurements and methods. The scales we use have 0.1 g resolution, but our tolerances are usually +6 g or ±10 g. Doing a Type 1 on a simple 200 g weight is likely not going to show any variation. With the tape measures, our tolerances are usually ±5-10 mm, so doing a Type 1 study using a standard is likely going to yield zero variation, but the tolerance is so big that we don't need a scale with 0.01 g resolution or a tape measure with 0.5 mm resolution. What do you do?

#### Jayfaas

##### Involved In Discussions
We are usually measuring reference standards, such as a 200 g reference weight, a gauge block for the tape measure study, or a setting master ring for a Type 1 study on a snap gauge. We do Type 2 studies too, but our internal documents require both Type 1 and Type 2 studies.