ANSI Z1.4 or Z1.9 for non-normal distributions?

AHartman

Hi all! I have had some college courses covering SPC, control charts, statistics, etc., but I haven't ever applied them in the real world. Well, I'm now tasked with developing a reduced inspection plan for one of our products. We currently do 100% inspection, and it's simply too great a drain on our resources to continue. I'm the only person in our company who comes close to being a process or quality engineer, so I've got to find the solution on my own.

We've determined from customer feedback that if we ship product that is better than 90% conforming (or less than 10% non-conforming), they'll be happy. We also make less than 150 parts per day. So, I decided to try the MIL-STD-1916 and apply it to our process. I'd like to start out w/ Attribute sampling, just because it seems quicker to implement, but ultimately I'd like to use Variables sampling to set myself up for process improvements in the future. Two immediate problems came up:

1) Our data is very simple, and non-normal. During inspection, we score our parts for roundness on a discrete scale of 1 to 5, with any part scoring 3 or higher considered round enough to pass. Though this is a step up from our previous pass/fail inspections, it still doesn't give much information. However, this is the inspection we're stuck with for the time being, as it's too difficult to get more detailed data in a quick and cost-effective way. In addition, our data is not normally distributed: maybe 15% of parts score 3, about 30% score 4, and the remaining 55% score 5. So, I'm not sure if it's even appropriate to use the MIL-STD-1916 procedures.

2) In the MIL-STD-1916 document, the Verification Level (VL) is always treated as a specification. The handbook that goes along with the standard gives the guideline that "for critical characteristics, a VL of VII is to be used". While the roundness score of our parts is indeed critical, I suspect that applying VL VII would be overkill for our 90% conforming requirement. I don't want to reject every lot simply because I'm using too strict a procedure. I can't find ANYWHERE a description of how the different VLs are calculated, or what each VL translates to in terms of a percentage of conforming parts. Also, I believe that I need to use an AQL (Acceptable Quality Level) type measurement, but did not find that in the MIL-STD-1916 document. The closest was the "k" value, which is called the acceptability criterion, and no information is presented as to how the "k" values are determined other than in reference to a specified VL. I need to be able to link it to how many conforming parts there are. The handbook provides OC, AFI, and AOQ charts, but these are still just given for each VL, and I'm not sure how to read them to get the % conforming translation.

Given all of this (and thanks to anyone who's still reading this far into my epic post), I did some searching and found that the ANSI Z1.4 and Z1.9 standards are essentially similar to MIL-STD-1916. Also, I was able to find a few excerpted tables that seem to indicate that the ANSI standards have a very clear method of translating a desired percent conforming into an inspection test. I've not yet gotten a copy of the ANSI standards, but they seem like a better tool than MIL-STD-1916 in this instance. The ANSI standard solves my #2 problem above, but I'm still not sure if it's kosher to apply it to a non-normal distribution.

I don't have access to Minitab, so I can't use its fancy tools to normalize the data, and I must confess that the few classes I've had never went beyond normal distributions other than to mention that other distributions were out there. My understanding is that the central limit theorem says that even if a population is not normally distributed, the averages of random samples from that population will be more normal than the population itself, so a cheap way of possibly "normalizing" the data would be to average groups of sample measurements. That is, instead of 3 random samples from each of 6 machines giving n=18, I could average the samples from each machine for n=6. But it seems to me that I would then only be saying that the original 18 samples should be accepted or rejected, not the larger lot from which they were drawn.

So, given that we've got small lot sizes to begin with, and the process is pretty heavily biased toward one end of our measurement scale, what's a good standard to use to know that we're shipping better than 90% conforming product? Can I use the Z1.4 and Z1.9 standards on non-normal distributions by using more samples and averaging? Do those ANSI standards still work well enough for a 90% conforming requirement that the non-normal distribution can be ignored? Is AQL the right way to conclude acceptance, or is a Lot Tolerance Percent Defective (LTPD) measurement better?

I apologize for my almost utter lack of knowledge in this area, and hope that I'm not asking really dumb questions. As you can tell, our inspection requirements are not nearly as strict as large scale industrial production, but I'd still like to use an industry standard, both for customer satisfaction and for my own education, if possible. Our 100% inspection records clearly indicate that we're meeting or beating our 90% conforming requirement, so I'm hopeful that there's a reduced inspection process out there that can help me.

Thanks to any and all of you in advance for taking the time to pore over this huge post, and for any comments, advice, or suggestions you may have!

Adam Hartman
Mechanical Engineer
Zyvex Corporation
 

Tim Folkerts

Trusted Information Resource
We've determined from customer feedback that if we ship product that is better than 90% conforming (or less than 10% non-conforming), they'll be happy.

Your customers seem to be pretty relaxed about defects! :tg:

I'd like to start out w/ Attribute sampling, just because it seems quicker to implement...

Can I use the Z1.4 and Z1.9 standards on non-normal distributions by using more samples and averaging? Do those ANSI standards still work well enough for a 90% conforming requirement that the non-normal distribution can be ignored? Is AQL the right way to conclude acceptance, or is a Lot Tolerance Percent Defective (LTPD) measurement better?
For Z1.4, the distribution makes no difference - it is simply pass/fail. AQL is probably not the way to go here. AQL = 10 means that lots with up to 10% failures will almost always be accepted by your customer, and even lots considerably over 10% will often be accepted. I think you are looking for the reverse - that lots at 10% and over will almost certainly be rejected, and even some lots a little under 10% will be rejected.

In other words, you want assurance that the lots are no worse than 10% bad, rather than assurance that lots as bad as 10% will usually still be accepted. (If that isn't the case, then the rest of this will be off-topic.)

There are a couple of options for choosing a sampling plan. You could check the OC curves in the Z1.4 standard and look for a plan where 10% defective will only be accepted some small % of the time (say 5% or 1%).

You could also calculate the numbers directly. Suppose p = 10% = 0.1 of the parts are bad and you want no more than beta = 5% =0.05 chance of accepting a bad lot. Drawing 1 or 2 or 3 pieces that are all good would not be surprising.

It will only be surprising when (1-p)^n < beta

or equivalently, when n > log(beta) / log(1-p)

In this case, n > log(0.05) / log(0.9) = 28.4, so if you get 29 in a row that are good, you would be rather surprised if the quality were 10% bad, so you are 95% sure it is no worse than 10%.


(Actually, these calculations are based on a very large lot size. With only 150 total, the situation improves and you could probably use a few fewer than 29, but 29 is still a conservative estimate.)
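If you want to check these numbers yourself, here's a minimal sketch (my own arithmetic, not from any standard) that computes both the infinite-lot sample size from the formula above and the exact finite-lot version, assuming a lot of 150 parts of which exactly 10% (15 parts) are bad:

```python
import math

# Infinite-lot (binomial) approximation: sample n all-good parts,
# surprising only when (1 - p)^n < beta.
p, beta = 0.10, 0.05
n_inf = math.ceil(math.log(beta) / math.log(1 - p))  # the 28.4 above, rounded up

# Exact finite-lot version (hypergeometric), assuming a lot of N = 150
# containing exactly D = 15 defectives:
N, D = 150, 15

def p_all_good(n):
    # chance that n parts drawn without replacement are all good
    return math.comb(N - D, n) / math.comb(N, n)

n_fin = next(n for n in range(1, N - D + 1) if p_all_good(n) < beta)
print(n_inf, n_fin)  # the small lot lets you get away with a few fewer samples
```

As the parenthetical above says, the finite-lot sample size comes out a bit below the binomial answer, so the log-formula number is the conservative one to quote.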

I made a spreadsheet that shows some of the calculations, which you could see in this thread. https://elsmar.com/elsmarqualityforum/threads/12836


I apologize for my almost utter lack of knowledge in this area, and hope that I'm not asking really dumb questions.
Actually, it all sounded reasonably intelligent. And everyone is a novice at some point in every activity they try. :)


Tim F
 
AHartman

Tim,

Thanks for your really great reply. It's good to know that there's a community out here willing to help along others, and I appreciate it.

Following your explanation, I've taken another look at the MIL-STD-1916 procedure. If I interpret your discussion of the OC curves correctly, I could specify my own VL simply by reading each OC curve and finding where a 10% non-conforming point corresponds to some high probability of acceptance. Is that correct? If so, it seems that the very best I can do, using the lowest level, VL R, is a probability of acceptance of only 75% for a 10% non-conforming lot. Does that sound right? If so, that's way too low an acceptance probability for us. As I said before, I don't have access to a Z1.4 standard (though I think I should probably buy one soon!), but do you know if it includes plans that are more suitable to our relatively high non-conforming percentage?

I assume that I could counteract this low acceptance probability by inspecting more pieces than the small number required in the VL R plan, while still applying the VL R criteria. But that leaves it as a gut feeling instead of a verifiable, rational decision, and I'd rather stay away from gut feelings until I'm more experienced.

Also, thank you for your link to your previous post regarding the Excel sheet. I'm not entirely sure I understand the 4 inputs, but I'm playing around with it to see what it can tell me. For the range of AQL = 9% (as per your comments that a 10% AQL wouldn't be the best bet for me), and a UQL of 15%, with alpha and beta at 5%, I get sample sizes that are larger than my lot size. So, I'm right back to 100% inspection. I'm guessing that I'm not using the inputs correctly, but I'll keep twiddling to see what understanding I can gain.
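For what it's worth, here's a brute-force sketch of what a two-point spreadsheet like that presumably computes (my own code, not the actual spreadsheet). It searches for the smallest single-sampling plan (sample n, accept on c or fewer defects) meeting both risk points, and shows why AQL = 9% with UQL = 15% blows up: the two defect rates are so close together that discriminating between them at 5%/5% risks takes more samples than the whole 150-part lot.

```python
from math import comb

def binom_cdf(c, n, p):
    """P(at most c defects in a sample of n, when true defect rate is p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

def find_plan(p1, alpha, p2, beta, n_max=1000):
    """Smallest single-sampling plan (n, c) with producer's risk <= alpha
    at defect rate p1 and consumer's risk <= beta at defect rate p2."""
    for n in range(1, n_max + 1):
        for c in range(n + 1):
            if binom_cdf(c, n, p1) >= 1 - alpha:
                if binom_cdf(c, n, p2) <= beta:
                    return n, c
                break  # a larger c only raises the consumer's risk
    return None

n, c = find_plan(0.09, 0.05, 0.15, 0.05)
print(n, c)  # n comes out well above the 150-part lot size
```

So the huge sample sizes aren't a misuse of the inputs; that's just what those four numbers demand.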
 

Tim Folkerts

Trusted Information Resource
You are kind of caught in a tough place. For pass/fail testing, you need a large sample. For variables measurements, you really want better resolution than the 5 categories you have.

As for values to choose for AQL, that depends on your goals. Let's define:

  • ADR = acceptable defect rate = a level of defects that everyone can live with (but less is of course better!)
  • UDR = unacceptable defect rate = a level of defects that is definitely unacceptable. (Does that make these lots "UDR failures"? :lol:)
  • alpha = odds of rejecting a lot which is actually at ADR
  • beta = odds of accepting a lot which is actually at UDR
(I should have used ADR & UDR in the spreadsheet; alpha & beta are standard nomenclature)

If you want to be sure that 10% defects will typically be rejected (and higher defect rates almost certainly be rejected), you might choose
ADR = 1, alpha = 0.05
UDR = 10, beta = 0.05


You could look in the Z1.4 OC tables (I have them now) and find that Sample Code H, AQL = 1 is close to these values:
ADR = 0.72, alpha = 0.05
UDR = 9.1, beta = 0.05

This requires a sample of 50, with Ac = 1, Re = 2. Note that even with this AQL = 1 plan, 10% defective lots still get past 5% of the time!

Or if you put
ADR = 1, alpha = 0.05
UDR = 10, beta = 0.05
into the spreadsheet, you find that sample sizes near 50 work with Ac= 1, Re = 2.
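If you'd rather verify those two risk points directly instead of reading them off the OC tables, the binomial arithmetic is short. A sketch (using the infinite-lot approximation, so the small 150-part lot will actually do slightly better):

```python
from math import comb

def oc(p, n=50, ac=1):
    """Probability a lot with true defect rate p is accepted under a
    single-sampling plan: inspect n pieces, accept if defects <= ac."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(ac + 1))

# Two points on the OC curve for the n = 50, Ac = 1 plan:
print(round(oc(0.0072), 3))  # ~0.95: lots at the ADR pass about 95% of the time
print(round(oc(0.091), 3))   # ~0.05: lots at the UDR slip through about 5% of the time
```

Those match the ADR = 0.72 / alpha = 0.05 and UDR = 9.1 / beta = 0.05 figures from the table above.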


If you are confident that usually you are below 10%, you might choose
ADR = 1, alpha = 0.05
UDR = 15, beta = 0.05


Then samples with 10% defects would be accepted more than 5% of the time (about 12% actually), but if you hardly ever make such a bad lot, who really cares! If you do accidentally make a really bad lot, like 20% defective, you will still catch it. In this case, you could inspect 30 pieces with Ac = 0.
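To put a number on "you will still catch it" (my own arithmetic, again using the infinite-lot binomial approximation):

```python
# Chance that a 20%-defective lot sneaks past
# "inspect 30 pieces, accept only on zero defects":
p_sneak = (1 - 0.20) ** 30
print(p_sneak)  # roughly one chance in a thousand
```

So a really bad lot is essentially certain to be flagged, even though the plan is lenient around the 10% mark.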

Tim
 
AHartman

Tim,
Again, thanks for the great answer. I start to see now how the spreadsheet works. On average, 100% inspection of our product has given us about 6% defects. So, I think your logic behind the 1-15 scenario makes good sense. Also, since we only ship in prepackaged groups of 20 pieces, the difference between 10% and 20% is only 2 bad pieces. I'll look at the numbers via your spreadsheet and think it over. It looks like I'll also get around to ordering the Z1.4 standard :)

Adam Hartman
Mechanical Engineer
Zyvex Corporation
Dallas, TX
 
AHartman

You are kind of caught in a tough place. For pass/fail testing, you need a large sample. For a variable measurements, you really want better resolution than the 5 categories you have...

Tim,

Just for continued discussion, I've (hopefully) attached a picture representing a typical histogram and cumulative percentage for a day's production run. As you can see, 6% of our product was at the low but acceptable level 3, 26% was at the pretty good level 4, and 68% was at the mostly perfect level 5. On that day's run, there were no non-conforming units. I went through the exercise of implementing a Box-Cox transform in Excel, just for educational purposes, but that simply created a scaled replica of the original :biglaugh: In retrospect, that seems like an obvious outcome, having just converted three different numbers into three other, albeit larger, numbers with the same frequencies. Clearly, discrete data in 3 categories can't be meaningfully transformed.

I think it's pretty clear in my mind that the only way to reduce inspection while maintaining our 1-5 scoring system is to use attribute sampling. It's a great fit because all I really have is attributes. I've got a suspicion, though, that my physicist boss may want a variables sampling plan sometime in the future. An in-between solution seems to be to use a control chart and look at averages from some random samples of each day's production. If a particular day's average is more than 3 sigma (because it's a one-sided situation) below the historical mean, it would seem to me that the entire lot is likely suspect and should be 100% inspected.
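Here's a rough sketch of what I mean (illustrative only -- the historical scores below are hypothetical, just made to mirror the 6% / 26% / 68% split from the attached histogram):

```python
import statistics

# Hypothetical historical per-part roundness scores (1-5 scale),
# built to match the 6/26/68 percentage split described above.
history = [3] * 6 + [4] * 26 + [5] * 68
mu = statistics.mean(history)        # historical average score
sigma = statistics.pstdev(history)   # historical standard deviation

def day_needs_full_inspection(sample):
    """One-sided check: flag the day's lot for 100% inspection if the
    sample average falls more than 3 sigma-of-the-mean below mu."""
    n = len(sample)
    lower_limit = mu - 3 * sigma / n ** 0.5  # control limit for an average of n
    return statistics.mean(sample) < lower_limit

print(day_needs_full_inspection([5, 4, 5, 3, 5, 4]))  # typical day -> False
print(day_needs_full_inspection([3, 3, 2, 3, 1, 3]))  # suspicious day -> True
```

I realize the 3-sigma limit leans on the averages being roughly normal, which is exactly the central-limit-theorem hand-waving from my first post, so I'd welcome a sanity check on whether n = 6 averages of such coarse scores are normal enough for this.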

Is that sound reasoning? Can you think of any other ways I might approximate a variables sampling plan? Or, do you know of any plans that would work for us? I know you said earlier that we really just need greater resolution to use a variables plan, so maybe you've already answered that by suggesting a different, more discriminating scoring procedure.


Adam Hartman
Mechanical Engineer
Zyvex Corporation
 

Attachments

  • AHartman_Histogram.jpg (21.3 KB)