MSA for ATE (Automatic Test Equipment Embedded Software)

Rumbero65

Starting to get Involved
Hello, we manufacture ATE (hardware and test software), mainly for automotive, and would like to embed MSA in our software so it can be used both internally (before shipping testers to customers) and at the customer site for monitoring ATE quality. Before asking I read many posts here, but I need some confirmation (if possible) from you experts about the correct approach.

First of all let me describe the ATEs:
each ATE can be made up of one or more (up to 40) test bays.

I am considering embedding in our SW a (preliminary) Type 1 gage study and then a GR&R. Coming to the point, the typical aspects of an ATE are:
  • No operator
  • Several test bays to be correlated
Taking these two aspects into account, do you have any advice on how to perform MSA?

I've read your posts but I'm a bit confused, so I just need some confirmation. If I have understood correctly, I have two main choices:

1) GR&R replacing operator(s) with test bays:
some experts suggested using ANOVA in Minitab: why ANOVA and not Averages and Ranges?
Does it make sense to have a large number of test bays (let's suppose 30)?

2) We could avoid considering the operator contribution:
in this case we could perform a Type 1 gage study (golden sample tested 50 times) on each test bay and then maybe ANOM (Analysis of Means) to check the correlation between test bays. Does it make sense? Should we also consider Cp/Cpk?

Thank you for reading this post,

Giovanni
 

Welshwizard

Involved In Discussions
Hello Rumbero65,

I am not familiar with hardware and test software, so forgive me if I go down the wrong track here, but I will try to help.

Firstly, I am assuming that you are not being compelled to use any particular type of approach to the test? I ask because some folks start down a route and then may get audited down the line by a prime customer or organisation who expect something different.

My assumption from this point is that you are fairly flexible as to the options for the approach to this problem.

Each test bay checks an output which can be referenced to some reference standard, and these outputs should not be detectably different from each other across bays. As you have mentioned a Type 1 gage study, I assume that for each bay there is only one output of interest.

My approach would be to run a consistency study for one bay. The objective would be to understand whether or not the output is consistent before comparing the different bays to each other. The data gathering process would be the same as for your Type 1 gage study, except that you would enter the data in time-series order into an Individuals and Moving Range (I-MR) chart instead of the standard run chart.

At the basic level, any observations that breach a control limit indicate a lack of consistency, which would need to be addressed before the summary statistics could be used. Achieving consistency is the foundation level and an achievement in itself. Don't forget to record anything that you feel may contribute to the variation you see.

Once you are confident that you can achieve consistency, you can compute the estimated standard deviation = average moving range / 1.128 (the d2 constant for n = 2). You can then compute the consumption of tolerance (min 96% confidence) = 2.7 x estimated standard deviation / tolerance band. If you require a min 99.9% confidence, use 5.4 as the multiplier instead of 2.7.
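
The calculation described above can be sketched in a few lines of Python. This is only an illustration, not the spcforexcel or Minitab implementation: the function name `imr_summary` and its return keys are made up for the example, and the chart constants 2.66 (= 3 / 1.128) and 3.267 are the usual tabulated values for moving ranges of n = 2.

```python
from statistics import mean

def imr_summary(values, tolerance_band, multiplier=2.7):
    """Individuals / moving range summary for readings given in time order.

    Hypothetical helper for illustration: 'multiplier' is 2.7 or 5.4 as
    quoted in the post above, 'tolerance_band' is upper minus lower spec.
    """
    if len(values) < 2:
        raise ValueError("need at least two readings in time order")

    # Moving range = absolute difference between consecutive readings.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    x_bar = mean(values)

    # Sigma estimated from the average moving range (d2 = 1.128 for n = 2).
    sigma_est = mr_bar / 1.128

    x_lcl = x_bar - 2.66 * mr_bar   # natural process limits for individuals
    x_ucl = x_bar + 2.66 * mr_bar
    mr_ucl = 3.267 * mr_bar         # upper limit for the moving ranges

    # Indexes of readings outside the individuals limits, plus the later
    # reading of any pair whose moving range exceeds its upper limit.
    flagged = {i for i, v in enumerate(values) if not x_lcl <= v <= x_ucl}
    flagged |= {i + 1 for i, r in enumerate(moving_ranges) if r > mr_ucl}

    return {
        "x_bar": x_bar,
        "mr_bar": mr_bar,
        "sigma_est": sigma_est,
        "x_limits": (x_lcl, x_ucl),
        "mr_ucl": mr_ucl,
        "out_of_limits": sorted(flagged),
        "consumption_of_tolerance": multiplier * sigma_est / tolerance_band,
    }

# Example with made-up readings from one bay and a tolerance band of 2.0.
readings = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.1, 10.0]
summary = imr_summary(readings, tolerance_band=2.0)
print(summary["out_of_limits"], summary["consumption_of_tolerance"])
```

The summary statistics are only meaningful when `out_of_limits` is empty, which mirrors the point that consistency has to be established first.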

To compare bays, carry out the consistency check for each bay, ensure you have consistency, then carry out an equivalency study across all bays. For more details on equivalency see Dr Don Wheeler's articles in Quality Digest. It's a simple test, and you can check for equivalence of both the standard deviation and the bias, if you wish, using your reference sample.

The procedure I have outlined can be run in its entirety in the spcforexcel software. I'm not aware of its inclusion in other software packages, but that doesn't mean it isn't there.

For more information on measurement consistency please refer to Dr Wheeler's book EMP III; it's an invaluable resource.

Hope this helps!
 

Miner

Forum Moderator
Leader
Admin
For background, I have been working with MSA for over 30 years and have dealt with ATEs for over 20 years. ATEs can fall into four different scenarios depending on the number of stations and whether the results may be influenced by how they are loaded.
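[Attached image 1594039452030.png: the four ATE scenarios A to D, classified by the number of stations and by whether loading can influence the results]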

Your questions address only two of these scenarios (A & B). I have encountered scenarios C & D as well. Even though the test is performed without an operator, that test may be influenced by how a part was loaded into the station. If it was manually loaded, the operator can have an influence on the results.
  • Scenario A may be studied using an R study.
  • Scenario B may be studied using an R&R study where the station replaces operator for reproducibility
  • Scenario C is the standard R&R study
  • Scenario D may be studied using an expanded R&R study
Regarding your questions:
  1. The average/range method is an approximation that was developed BC (Before Computers) to make the calculations easier. ANOVA is more accurate and provides the interaction term. In addition, the expanded R&R study can only be analyzed using ANOVA. The number of test bays is an economic trade-off. The more bays you add, the more difficult it will be to keep them consistent, which adds both up-front and ongoing cost. Offsetting these costs would be an increase in throughput, so that the ATE is not a constraint. Bottom line: add only as many as you absolutely need and no more.
  2. I have really answered this one above, but to confirm: Type 1 study only applies to scenario A. The method you proposed (ANOM) might work, but has not been studied to my knowledge. However, the standard R&R will work, so why use a different approach that you would have to defend to your customers? Cp/Cpk is not appropriate. However, you may want to consider Cg/Cgk.
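
For scenario B, a crossed ANOVA R&R with the test bay in place of the operator can be sketched as below. This is a minimal illustration under stated assumptions, not Minitab's implementation: it assumes a balanced study (every part measured the same number of times on every bay), does not pool a non-significant interaction into repeatability, and simply truncates negative variance estimates to zero. The function name `gage_rr_anova` and the data layout are made up for the example.

```python
from statistics import mean

def gage_rr_anova(data):
    """Crossed two-factor ANOVA R&R with test bays in place of operators.

    data[i][j] is the list of repeat readings of part i on bay j.
    Assumes a balanced design with at least 2 parts, 2 bays and 2 repeats.
    """
    p, b, r = len(data), len(data[0]), len(data[0][0])
    if min(p, b, r) < 2:
        raise ValueError("need at least 2 parts, 2 bays and 2 repeats")

    grand = mean(x for part in data for cell in part for x in cell)
    part_means = [mean(x for cell in part for x in cell) for part in data]
    bay_means = [mean(x for part in data for x in part[j]) for j in range(b)]
    cell_means = [[mean(cell) for cell in part] for part in data]

    # Sums of squares for the balanced crossed design.
    ss_part = b * r * sum((m - grand) ** 2 for m in part_means)
    ss_bay = p * r * sum((m - grand) ** 2 for m in bay_means)
    ss_cells = r * sum((m - grand) ** 2 for row in cell_means for m in row)
    ss_inter = ss_cells - ss_part - ss_bay
    ss_repeat = sum((x - cell_means[i][j]) ** 2
                    for i in range(p) for j in range(b) for x in data[i][j])

    # Mean squares.
    ms_part = ss_part / (p - 1)
    ms_bay = ss_bay / (b - 1)
    ms_inter = ss_inter / ((p - 1) * (b - 1))
    ms_repeat = ss_repeat / (p * b * (r - 1))

    # Variance components; negative estimates are truncated to zero.
    var_repeat = ms_repeat
    var_inter = max(0.0, (ms_inter - ms_repeat) / r)
    var_bay = max(0.0, (ms_bay - ms_inter) / (p * r))
    var_part = max(0.0, (ms_part - ms_inter) / (b * r))
    var_grr = var_repeat + var_bay + var_inter

    return {
        "repeatability": var_repeat,
        "bay": var_bay,
        "part_x_bay": var_inter,
        "part": var_part,
        "gage_rr": var_grr,
        "total": var_grr + var_part,
    }

# Made-up example: 2 parts x 2 bays x 2 repeats.
study = [
    [[9.0, 11.0], [11.0, 13.0]],    # part 0 on bay 0, bay 1
    [[19.0, 21.0], [21.0, 23.0]],   # part 1 on bay 0, bay 1
]
components = gage_rr_anova(study)
print(components["gage_rr"] / components["total"])  # GR&R share of total variance
```

A real study would use more parts and repeats than this toy data set; the tiny example is only there to make the arithmetic easy to follow by hand.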
 

Rumbero65

Starting to get Involved
Thank you Miner for the detailed explanation.

Even though 80% of our business is focused on scenarios A & B, it is great to have an overall picture.
Just two further questions:

Question 1:
Why is it conceptually incorrect to perform a Type 1 study (Cg/Cgk) on scenarios other than A? Can't I perform a Type 1 study and then a GR&R? I thought (correct me if I am wrong) that before a GR&R it would be right to check the machine's repeatability and stability, so I was thinking of organizing my SW in two sections:
- "Preliminary MSA" (Type 1) aimed at verifying repeatability & bias on each single test bay
- "GR&R"

Question 2:
When you suggest an "R study" for Scenario A, do you mean a Type 1 study (Cg, Cgk, testing one golden device 50 times)?

Thanks for your valuable support and for sharing your knowledge,

Regards
 

Rumbero65

Starting to get Involved
Thank you Welshwizard,

Just to be on the same page: to verify the "consistency", I think you mean measuring the same device many times, is that correct?
I ask because I don't understand why you estimate the std. dev. from the average moving range (sorry, I'm quite a newbie; it sounds like a "within" std. dev.). I expected to calculate the "usual" overall std. dev. Can't I calculate Cg/Cgk as Minitab does for a Type 1 analysis?

Regards,
 

Miner

Forum Moderator
Leader
Admin
  1. You can perform a Type 1 study IN ADDITION TO the studies I listed. Since a Type 1 study is a Repeatability study, running it in addition to an R&R (Repeatability and Reproducibility) study is normally redundant and unnecessary. However, since your intent is to verify the bias of each test bay, it can add value.
  2. Type 1 study = R (Repeatability) study
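
As an illustration of what a per-bay Type 1 study computes, here is a minimal sketch of Cg and Cgk. It assumes the commonly used defaults of 20% of the tolerance and a 6-sigma study spread (the defaults Minitab uses); other standards use different percentages and spreads, so treat `k_percent` and `spread` as parameters to agree with your customer. The function name `type1_gage_study` is made up for the example.

```python
from statistics import mean, stdev

def type1_gage_study(readings, reference, tolerance, k_percent=20.0, spread=6.0):
    """Cg / Cgk for repeated readings of one reference (golden) part.

    Assumed conventions: k_percent = 20 (% of tolerance) and spread = 6
    (study variation in standard deviations). 'tolerance' is USL - LSL.
    """
    if len(readings) < 2:
        raise ValueError("need at least two readings")
    x_bar = mean(readings)
    s = stdev(readings)              # overall sample standard deviation
    if s == 0:
        raise ValueError("zero variation: resolution may be too coarse")
    bias = x_bar - reference

    cg = (k_percent / 100.0) * tolerance / (spread * s)
    cgk = ((k_percent / 200.0) * tolerance - abs(bias)) / (spread / 2.0 * s)
    return {"mean": x_bar, "stdev": s, "bias": bias, "cg": cg, "cgk": cgk}

# Made-up example: five readings of a golden sample on one test bay.
result = type1_gage_study([9.9, 10.1, 10.0, 10.1, 9.9],
                          reference=10.0, tolerance=2.0)
print(result["cg"], result["cgk"])
```

Running the same function on each bay with the same golden sample gives one bias and one repeatability figure per bay, which is the "added value" mentioned in point 1.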
 

Rumbero65

Starting to get Involved
Great, now almost everything is clear. If I'm not wrong, we can summarize as follows:

- Without any appraiser contribution to the variance, we can use Type 1 only.
- When more than one test bay is used and we want to verify the bays' contribution to the variance, we can use an ANOVA GR&R, replacing Operator with Test Bay.
- When a GR&R is needed, it is not mandatory to perform a Type 1 study as a preliminary check.
- Since a GR&R can only be performed on an "under control" system, I suppose we can use a control chart (I-MR?) to verify that.

If there is no error here, I have finished bothering you :)

Thanks!!!
 

Miner

Forum Moderator
Leader
Admin
Regarding your last point (using a control chart such as I-MR to verify that the system is under control before the GR&R): this would be a stability study of the measurement device, which does use a control chart. I think you understand the essence of the other items.
 

Welshwizard

Involved In Discussions
Verifying the consistency would involve taking repeated measurements of the characteristic of interest, yes.

The point of verifying the consistency is to ensure that the data is homogeneous, and we can use the ImR chart to indicate whether or not this is the case. If the process is not consistent there is no real point in characterising it. It’s always good practice to use the within-subgroup method of computing the std deviation, i.e. using the average moving range, because it estimates the standard deviation that could be achieved if the process were consistent.

If you calculate and use the global std deviation (in your words, the overall std deviation) you are assuming that the process is consistent, and that is a big assumption to make. If the process is consistent there will be very little difference in the std deviation however you calculate it, but of course you don’t know that unless you have some divine intervention!
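
A small numerical sketch of this point, using made-up readings: for a steady series the two estimates are close, while a step shift halfway through inflates the overall standard deviation far more than the moving-range estimate. The helper names `sigma_within` and `sigma_overall` are hypothetical.

```python
from statistics import mean, stdev

def sigma_within(values):
    """Sigma estimated from the average moving range (d2 = 1.128 for n = 2)."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(moving_ranges) / 1.128

def sigma_overall(values):
    """The 'usual' global sample standard deviation."""
    return stdev(values)

# Made-up readings: a steady series, and the same series with a +0.5 shift
# applied to the last five readings (an inconsistent measurement process).
steady = [10.0, 10.1, 10.0, 9.9, 10.0, 10.1, 10.1, 9.9, 10.0, 9.9]
shifted = steady[:5] + [v + 0.5 for v in steady[5:]]

for name, series in (("steady", steady), ("shifted", shifted)):
    print(name, round(sigma_within(series), 4), round(sigma_overall(series), 4))
```

Only one moving range spans the shift, so the within estimate moves little, whereas every reading contributes to the overall figure.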

As for using Cg/Cgk, I cannot explain why it does or doesn’t make sense any better than Dr Donald Wheeler. I asked him a similar question around three years ago and he wrote a paper on it. You can find the paper in the Reading Room on his website www.spcpress.com: paper number 314 from May 2017, titled “More Capability Confusion”. Page 6 has the detailed explanation.

I hope this doesn’t confuse you. What I have found in over 35 years of using these techniques is that it’s healthy to question and to satisfy yourself as to why. That way you will learn more deeply and be able to stand behind your decisions with confidence.

Good luck
 