# Choosing the correct Distribution for Acceptance Sampling


#### lp026713389

Hello everyone,

I'll try and be as organized as I can in this post, as it is my first. I am currently a senior student in industrial engineering, doing my co-op graduation project in acceptance sampling.

I am developing a type A sampling plan for attributes (with a type A OC curve) to sample an isolated lot, not a process. What I have so far are an AQL, which I obtained from a real producer, and an RQL, which I obtained from a consumer who works with that same producer. I work with the consumer, but I want to use both values in my plan (when drawing the OC curve). I have been reading a lot of literature on the topic lately, but several things in what I've read don't add up for me. Here are a few:

1. While reading about which distribution to use while developing my plan, I got confused about what exactly are the criteria to choose the hypergeometric or the binomial approx. to the hypergeometric. Hypergeometric is typically used for small/finite lots, however some literature mentions that the binomial approx. is accurate as well, if N (population size) is at least 10 times greater than n (sample size).

The problem here is, I am actually trying to determine which distribution to use IN ORDER TO get n (whether by calculation or tables). But the above condition assumes that I already know n (which I am trying to determine). It seems a bit illogical to me, that the condition assumes I have a value which I need to get by a choice I will make based on that condition. Any help here?

2. Is there any other condition based on which I can choose whether I can use the binomial approximation or not? For example instead of having to do with n, I am looking for a condition like "If N (population/lot size) is >= a certain value, the binomial approx. can be used." That would make a lot more sense to me than the condition depending on n (sample size, which I am trying to get!)

3. If the hypergeometric distribution is to be used, I can't find any concrete methods to calculate the sample size I should take based on the size of the lot received. For the binomial distribution/approximation though, I found the Larson nomogram, ANSI Z1.4 tables, etc., which gives n and c (but is not based on N). I couldn't find any method to get the required sample size depending on the lot size for the hypergeometric distribution though.

4. Even if there was some method to calculate/obtain n depending on N using the hypergeometric distribution, how do I get c (acceptance number) in order to sentence the lot?! Again, for binomial, c can be obtained from the nomogram, standard tables, etc.

5. Also, just a general question, what distributions are the MIL-STD-105E and ANSI Z1.4 standards based on?

I am really confused here and the clock is ticking! Any help is appreciated.

Thanks!

Edit: I have just put the AQL and RQL into Minitab, which are 2.8 & 6% respectively in my case. With alpha and beta of 0.05 and 0.1 respectively, and ticking the use hypergeometric distribution for isolated lots, I got n=194, and c=8. How did Minitab arrive at these values?


#### lp026713389

Bev D, Statistical Steven, Tim Folkers, etc...anyone? I could really use the help :/


#### lp026713389

Shameless bump out of desperation

#### Steve Prevette

##### Deming Disciple
Super Moderator
I'll take a crack at this since there have been no responses. I'm not big into "choosing distributions" for data, but will give my recommendations.

> 1. While reading about which distribution to use while developing my plan, I got confused about what exactly are the criteria to choose the hypergeometric or the binomial approx. to the hypergeometric. Hypergeometric is typically used for small/finite lots, however some literature mentions that the binomial approx. is accurate as well, if N (population size) is at least 10 times greater than n (sample size).
>
> The problem here is, I am actually trying to determine which distribution to use IN ORDER TO get n (whether by calculation or tables). But the above condition assumes that I already know n (which I am trying to determine). It seems a bit illogical to me, that the condition assumes I have a value which I need to get by a choice I will make based on that condition. Any help here?

Sampling often lands you in a chicken-and-egg scenario: you need some data before you can work out the sample size n. Usually that means sampling a small amount first to get an idea of the failure rate, and then using that to set your sample size. But if you are sampling to "prove" you are better than a certain rate, you can use the specified rate in the formulae as a starting point.

> 2. Is there any other condition based on which I can choose whether I can use the binomial approximation or not? For example instead of having to do with n, I am looking for a condition like "If N (population/lot size) is >= a certain value, the binomial approx. can be used." That would make a lot more sense to me than the condition depending on n (sample size, which I am trying to get!)

Generally, the extra effort of doing the hypergeometric calculations only saves you a handful of samples. Binomial is more conservative: if you meet the binomial criteria, you will also meet the hypergeometric.

Best thing is to try some scenarios and prove that to yourself.
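One way to try such scenarios is a short script rather than a spreadsheet. The sketch below is not taken from the attached file; the lot of 500 units with 30 defectives and the function names are made up for illustration. It compares the probability of accepting the lot under the exact hypergeometric model with the binomial approximation for a few sample sizes.

```python
from math import comb

def hypergeom_accept(N, D, n, c):
    """P(at most c defectives in a sample of n drawn without
    replacement from a lot of N units containing D defectives)."""
    upper = min(c, D, n)
    ways = sum(comb(D, x) * comb(N - D, n - x) for x in range(upper + 1))
    return ways / comb(N, n)

def binom_accept(p, n, c):
    """P(at most c defectives in n independent draws with defect rate p)."""
    upper = min(c, n)
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(upper + 1))

# Hypothetical scenario: lot of 500 with 30 defectives (6%), accept on zero.
N, D, c = 500, 30, 0
for n in (10, 50, 100):
    exact = hypergeom_accept(N, D, n, c)
    approx = binom_accept(D / N, n, c)
    print(f"n={n:4d}  hypergeometric={exact:.4f}  binomial={approx:.4f}")
```

For c = 0 the hypergeometric acceptance probability at a given defect level never exceeds the binomial one, which is one sense in which the binomial is the more conservative choice for the consumer. Changing `N`, `D`, `n` and `c` shows how quickly the two models converge as the lot grows relative to the sample.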

> 3. If the hypergeometric distribution is to be used, I can't find any concrete methods to calculate the sample size I should take based on the size of the lot received. For the binomial distribution/approximation though, I found the Larson nomogram, ANSI Z1.4 tables, etc., which gives n and c (but is not based on N). I couldn't find any method to get the required sample size depending on the lot size for the hypergeometric distribution though.

I've written a trial-and-error approach as an Excel macro: you specify N, the desired failure rate, and the confidence level. It's not too hard to do, and knowing how to do things yourself in Excel has an advantage over relying on a 'black box' method. I'll attach the file.

> 4. Even if there was some method to calculate/obtain n depending on N using the hypergeometric distribution, how do I get c (acceptance number) in order to sentence the lot?! Again, for binomial, c can be obtained from the nomogram, standard tables, etc.

See the file attached.
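For readers without the spreadsheet, a trial-and-error search of this general kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reconstruction of the attached macro: it fixes the acceptance number at c = 0, treats the "failure rate" as a whole number `D` of defectives in the lot, and treats the confidence level as 1 - `beta`. The function names are hypothetical.

```python
from math import comb

def prob_zero_defectives(N, D, n):
    """Hypergeometric P(no defectives in a sample of n) for a lot of
    N units containing D defectives, sampled without replacement."""
    return comb(N - D, n) / comb(N, n)

def smallest_n_for_c0(N, D, beta):
    """Trial and error: increase n until the chance of accepting a lot
    with D defectives (on zero defectives found) drops to beta or less."""
    for n in range(1, N + 1):
        if prob_zero_defectives(N, D, n) <= beta:
            return n
    return N  # only reached when D == 0: nothing to detect

# Tiny hand-checkable case: 10 units, 1 defective, 25% consumer's risk.
# P(miss the defective) = 1 - n/10, so n = 8 is the first n giving <= 0.25.
print(smallest_n_for_c0(N=10, D=1, beta=0.25))
```

The same loop works for any lot size; only `N`, `D` and `beta` change.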

> 5. Also, just a general question, what distributions are the MIL-STD-105E and ANSI Z1.4 standards based on?

There are various books and papers out there with the history. One example is (broken link removed)

> I am really confused here and the clock is ticking! Any help is appreciated.
>
> Thanks!
>
> Edit: I have just put the AQL and RQL into Minitab, which are 2.8 & 6% respectively in my case. With alpha and beta of 0.05 and 0.1 respectively, and ticking the use hypergeometric distribution for isolated lots, I got n=194, and c=8. How did Minitab arrive at these values?

No idea, I don't trust Minitab.
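
For reference, a two-point type A single sampling plan is usually described by two conditions on the hypergeometric acceptance probability. The notation below is introduced here for illustration: D is the number of defectives in the lot, with D_1 and D_2 the counts corresponding to the AQL and RQL. How Minitab actually searches for n and c is not stated in this thread.

```latex
P_a(D) = \sum_{x=0}^{c} \frac{\binom{D}{x}\binom{N-D}{n-x}}{\binom{N}{n}},
\qquad
P_a(D_1) \ge 1-\alpha,
\qquad
P_a(D_2) \le \beta,
\qquad
D_1 \approx N \cdot \mathrm{AQL}, \quad D_2 \approx N \cdot \mathrm{RQL}
```

A plan (n, c) that satisfies both inequalities protects the producer at the AQL and the consumer at the RQL; the smallest such n is normally the one reported.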

#### Attachments

• Sample_Size.xls
66 KB · Views: 207

#### lp026713389

Thanks for taking the time to reply, I really do appreciate it. I have one question left though, after checking your attachment. What method/formula have you used to determine the value highlighted in the attached screenshot?

Just in case, the value in question is the number of samples (sample size) required under the hypergeometric spreadsheet.

#### Attachments

• Capture.JPG
9.1 KB · Views: 186

#### Steve Prevette

##### Deming Disciple
Super Moderator
> Thanks for taking the time to reply, I really do appreciate it. I have one question left though, after checking your attachment. What method/formula have you used to determine the value highlighted in the attached screenshot?
>
> Just in case, the value in question is the number of samples (sample size) required under the hypergeometric spreadsheet.

The binomial page uses the binomial formula in Excel, the hypergeometric page uses the hypergeometric formula. Note that it requires a macro in both cases to solve by trial and error.


#### ncwalker

Steve,

Have you ever played around with the "goal seek" function in Excel?

You may not actually need the macro.

(I say this without having looked at your sheet.)

#### Steve Prevette

##### Deming Disciple
Super Moderator
> Steve,
>
> Have you ever played around with the "goal seek" function in Excel?
>
> You may not actually need the macro.
>
> (I say this without having looked at your sheet.)

True, goal seek would also work.


#### lp026713389

Gents, thanks for breathing some life back into the thread haha. I'd like your advice on a certain situation I'm facing right now in my project. I have a lot of 821 units, but I don't know whether to follow the binomial method or hypergeometric method to develop a single sampling plan.

If I were to use the binomial, lot size would have no bearing and from the Larson Nomogram I would get a sample size of 120, with an acceptance number of 3. This is based on an AQL of 1%, RQL/LTPD/UQL/LQ of 6%, alpha of 0.05 and beta of 0.1 (these 4 values I have obtained based on discussions with the company I'm involved with - the consumers in this case, as well as the producers who have supplied us with this lot). However, this 120 makes up about 15% of the total lot size (821 units), and so the condition based on which I should use the binomial approximation would not be satisfied (N must be at least 10 times greater than n).

Moreover, since this is an isolated lot (not a continuous process), it makes more sense to use the hypergeometric method from what I've read. Now, I have a problem which Steve has tried to help me out with. There is no concrete/obvious method for calculating the size of the sample I should take based on the hypergeometric distribution. Steve's Excel file indicated that I should sample 465 units for a lot size of 821 units, and take the acceptance number = 0. But what is this based on? I have to reference some sort of formula or method I use in obtaining this value since this is my graduation project after all. Can anyone point me to any literature on this?

Also, if I do use this method (hypergeometric), does the acceptance number have to be 0? If not, how should I calculate it? A c=0 value works in my project's favor actually, but is it always 0 if the hypergeometric distribution is used? I read that c=0 plans are based on the hypergeometric, but these plans use an AQL table to determine the sample size. After checking the AQL table, I saw that based on a 1% AQL I would get a sample size of 34 for a lot of 821 units, not a sample size of 465 as was calculated in Steve's sheet.
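
The acceptance number does not have to be 0 under the hypergeometric model. As a minimal sketch (not the method behind the spreadsheet or any standard table), the following Python search looks for the smallest n, and the matching c, that meets both risk points for a finite lot. Assumptions made here: the AQL and RQL are converted to whole numbers of defectives by rounding N times the rate, and the names `accept_prob` and `find_plan` are hypothetical.

```python
from math import comb

def accept_prob(N, D, n, c):
    """Hypergeometric P(at most c defectives in a sample of n) for a
    lot of N units containing D defectives."""
    upper = min(c, D, n)
    ways = sum(comb(D, x) * comb(N - D, n - x) for x in range(upper + 1))
    return ways / comb(N, n)

def find_plan(N, aql, rql, alpha, beta):
    """Return the first (n, c), scanning n upward, such that a lot at
    the AQL is accepted with probability >= 1 - alpha and a lot at the
    RQL is accepted with probability <= beta."""
    D1 = round(N * aql)  # assumed defectives in a lot at the AQL
    D2 = round(N * rql)  # assumed defectives in a lot at the RQL
    for n in range(1, N + 1):
        for c in range(n + 1):
            if accept_prob(N, D1, n, c) >= 1 - alpha:
                # Smallest c protecting the producer at this n.
                if accept_prob(N, D2, n, c) <= beta:
                    return n, c
                break  # a larger c would only raise the consumer's risk
    return None

# Values quoted in the post: lot of 821, AQL 1%, RQL 6%, alpha 0.05, beta 0.1.
n, c = find_plan(821, 0.01, 0.06, 0.05, 0.10)
print(n, c, accept_prob(821, 8, n, c), accept_prob(821, 49, n, c))
```

Because the search checks every n in turn, the result can be verified by hand against the two printed acceptance probabilities. A different rounding rule for the number of defectives would shift the answer slightly.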

As you can see, the issue is real hazy and there are a lot of question marks on this.

Lastly, how does one develop a double sampling plan based on a finite lot (hypergeometric distribution)? 99% of the literature I've read about this uses the binomial by default, but as I've explained, my case keeps pointing to me using the hypergeometric instead. Any advice/literature/points on this issue?

As always, ANY input would be helpful!

Thanks guys