# How to determine the Lot Size for the OQ Sampling Plan

#### edwardkwan

Hi guys,

I need your expertise on an issue I have with justifying the lot size for the following:

Operational Qualification under the conditions below:
• Single sampling plan for attribute or pass/fail data with c=0.
• Sample selected is representative
• For 95% Confidence:

| LTPD [%] | Reliability [%] | Single Sampling Plan Parameters | AQL (α=0.05) |
|---|---|---|---|
| 5 | 95 | n = 59, c = 0 | 0.09% |
| 3 | 97 | n = 99, c = 0 | 0.05% |
| 1 | 99 | n = 299, c = 0 | 0.02% |

• LTPD β=0.05
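
As a rough check on the figures quoted for these plans, the LTPD and AQL of a c=0 plan can be recomputed from the sample size alone. This is only a sketch under the binomial (large lot) model; the function names are made up for illustration and do not come from any standard.

```python
def ltpd_for_c0_plan(n: int, beta: float = 0.05) -> float:
    """Defect rate that a c=0 plan of size n accepts with probability beta."""
    # P(accept) = (1 - p)**n = beta  ->  p = 1 - beta**(1/n)
    return 1.0 - beta ** (1.0 / n)


def aql_for_c0_plan(n: int, alpha: float = 0.05) -> float:
    """Defect rate that a c=0 plan of size n accepts with probability 1 - alpha."""
    # P(accept) = (1 - p)**n = 1 - alpha  ->  p = 1 - (1 - alpha)**(1/n)
    return 1.0 - (1.0 - alpha) ** (1.0 / n)


if __name__ == "__main__":
    for n in (59, 99, 299):
        print(f"n={n:3d}  LTPD={100 * ltpd_for_c0_plan(n):.2f}%  "
              f"AQL={100 * aql_for_c0_plan(n):.2f}%")
```

Note that neither function takes a lot size as input, which is the point at issue in this thread.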

After the risk analysis done by our customer, the plan with 95% Confidence and 99% Reliability has been selected for the attribute plan, which gives a sample size of 299 with c=0 (n=ln(1-Confidence)/ln(Reliability)).

As indicated by our customer's specification, the batch size, N, in this case is assumed to be 2,990 pcs (N=10xn).

The issue is that I cannot find the statistical rationale behind 10xn. Any statistical reference and explanation would be of great help and much appreciated. By the way, we are a contract manufacturer of injection-molded plastic medical components.

#### Statistical Steven

##### Statistician
Staff member
Super Moderator
Keep looking for the reference to 10xn...because it does not exist. There is no relationship between the sample size and the lot size when using ln(1-confidence)/ln(reliability). The formula assumes a "large" population of parts to be sampled from. I believe they just "made up" the 10xn.

#### edwardkwan

Hi Steven,

Thanks for your reply. It may come from their assumption, as their requirement states: "This sampling plan is based on Type A OC Curve, which describes protection for individual batches (batches not from a continuously running process) assuming approximately 3000 piece batches (Batch size N approximately 10X sample size n)". That being said, does it make any difference to the statistical validity if the lot size was never factored in in the first place? I foresee that the question of proportion (sample size n out of lot size N) will come up. Also, if the lot size does not matter for the statistical rationale, I question the economics and practicality of their requirement.
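
For anyone wanting to put numbers on the Type A wording in that requirement, here is a minimal sketch comparing the acceptance probability of the n=299, c=0 plan under a Type A (finite lot, hypergeometric) and a Type B (binomial) model. It assumes a lot of 2,990 pieces that sits exactly at the 1% LTPD, i.e. about 30 defectives; that count and the function names are assumptions for illustration, not part of the customer's specification.

```python
from math import comb


def accept_prob_type_a(lot_size: int, defectives: int, n: int) -> float:
    """Type A: P(no defectives in a sample of n drawn without replacement)."""
    return comb(lot_size - defectives, n) / comb(lot_size, n)


def accept_prob_type_b(fraction_defective: float, n: int) -> float:
    """Type B: P(no defectives in n independent draws from a process)."""
    return (1.0 - fraction_defective) ** n


if __name__ == "__main__":
    n = 299
    lot_size = 10 * n                    # 2,990 pieces, as in the customer spec
    defectives = round(0.01 * lot_size)  # assumption: lot is at the 1% LTPD
    p_a = accept_prob_type_a(lot_size, defectives, n)
    p_b = accept_prob_type_b(defectives / lot_size, n)
    print(f"Type A (hypergeometric): {p_a:.4f}")
    print(f"Type B (binomial):       {p_b:.4f}")
```

Changing `lot_size` while keeping `n` fixed shows how much, or how little, the acceptance probability moves with the lot size.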

Appreciate everyone's help on this.

Rgds,

Ed

#### Bev D

##### Heretical Statistician
Staff member
Super Moderator
Lot size does not matter (unless it is a very small lot, and you are nowhere near that size).
The ancient Mil-Std/ANSI standards on AQL sampling plans do take lot size into account, but they were the result of negotiations more than statistics.

There is no statistically valid reason to take lot size into account. The ratio you are being quoted was pulled out of thin air.

Of course they may be using this 'reasoning' to ensure a sufficiently scaled batch for OQ - but this is a very convoluted way of doing that.

I might question the use of AQL based sample plans for OQ but that is another thread.

#### COECON

I am having some similar problems with a sampling number rationale. Can you supply me with the formula that gives exactly 59 as the result, please?

#### Bev D

##### Heretical Statistician
Staff member
Super Moderator
Hi COECON: can you provide more context for your question?
I can think of about a billion equations that will result in an answer of 59, including the year I was born in.

#### COECON

Hi Bev D,
thanks for your quick reply.
OK, I know there is no relationship between sample size and lot size. But there are a lot of recommendations to use a sample size of at least 59 because of statistical evidence.

I am preparing a rationale for why an attribute sampling test is recommended on the basis of 95% reliability and 95% confidence using a binomial distribution.
There should be no failures in the sample!
I'm struggling with the formula that gives 59 as a result.
I would be happy for your help.

#### Statistical Steven

##### Statistician
Staff member
Super Moderator
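For c=0 plans, the general formula is n=ln(1-Confidence)/ln(Reliability), rounded up to the next whole number so that the stated confidence is actually reached.

So for 95% confidence/90% reliability it is 29 (28.4 before rounding up).

Here are some other examples:

| Conf | Rel. | n |
|---|---|---|
| 0.95 | 0.90 | 29 |
| 0.95 | 0.95 | 59 |
| 0.95 | 0.99 | 299 |
| 0.99 | 0.90 | 44 |
| 0.99 | 0.95 | 90 |
| 0.99 | 0.99 | 459 |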
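
The formula is easy to script. The sketch below is one way to do it, with the result rounded up to the next whole unit so that the confidence actually achieved is not below the target; the function names are illustrative only.

```python
import math


def c0_sample_size(confidence: float, reliability: float) -> int:
    """Smallest zero-failure sample size for the given confidence/reliability."""
    n_exact = math.log(1.0 - confidence) / math.log(reliability)
    # Round up: rounding down leaves the achieved confidence just short.
    return math.ceil(n_exact)


def achieved_confidence(n: int, reliability: float) -> float:
    """Confidence reached when n units are tested with zero failures."""
    return 1.0 - reliability ** n


if __name__ == "__main__":
    for conf in (0.95, 0.99):
        for rel in (0.90, 0.95, 0.99):
            n = c0_sample_size(conf, rel)
            print(f"conf={conf:.2f} rel={rel:.2f} n={n:3d} "
                  f"achieved={achieved_confidence(n, rel):.4f}")
```

For 95% confidence and 95% reliability the unrounded value is about 58.4, which is where the often-quoted 59 comes from.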

#### mdurivage

##### Quite Involved in Discussions
Steven's method is what I prefer to use. The only thing is to make sure the numbers you use are risk-based and spelled out in a procedure.

#### Bev D

##### Heretical Statistician
Staff member
Super Moderator
We modify our sample size based on lot size only when the lot size is less than about 50. This choice is rather arbitrary, as it will depend on how accurate we want our estimate to be.

When lot sizes get 'small', attribute (categorical, pass/fail) sampling results will follow a hypergeometric distribution and not the general approximation of the Binomial, which applies to 'large' lot sizes (theoretically infinite, in practice greater than ~50). The other rule of thumb for adjusting your sample size for lot size is when the sample size is ~5% of the lot size or greater...I rarely use this approach as I've found in practice that it really makes no practical difference, but I do use the lot size <50 rule...

Technically speaking, the use of the hypergeometric applies when we sample without replacement from a finite lot. This is because the probability of finding something in a lot changes as the lot size decreases when we pull a sample and don't replace it. (This is the probability that we all despised in high school or college: you are getting dressed in the dark and you have 20 pairs of socks randomly scattered in your sock drawer. 10 pairs are black, 5 pairs are blue, 4 pairs are brown and 1 pair is pink. What are the odds that you pick at least 1 pink sock? My answer was always: none, because I am going to turn the lights on.)

The Binomial applies perfectly when we are sampling from an infinite population, as when you flip a coin...this also works when you sample with replacement.

Neither of these situations fits the practical industrial world. Fortunately, in practice, once the lot gets over ~50 the differences in the results between the two distributions are so small as to have no practical meaning; the differences will have no effect on the ultimate decision we make.
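
One way to check the rule of thumb for your own lot sizes is to compute the smallest c=0 sample size under each distribution and compare. The sketch below does this with exact integer arithmetic. It assumes the 'bad' lot contains round((1 - reliability) × N) defectives, with a minimum of one; that convention and the function names are assumptions made for illustration.

```python
import math
from math import comb


def binomial_n(confidence: float, reliability: float) -> int:
    """Large-lot (binomial) c=0 sample size, rounded up."""
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))


def hypergeometric_n(lot_size: int, confidence: float, reliability: float) -> int:
    """Smallest c=0 sample size for a finite lot sampled without replacement."""
    # Assumption: the lot to be rejected holds this many defectives.
    defectives = max(1, round((1.0 - reliability) * lot_size))
    for n in range(1, lot_size + 1):
        # P(sample of n contains no defectives)
        p_accept = comb(lot_size - defectives, n) / comb(lot_size, n)
        if p_accept <= 1.0 - confidence:
            return n
    return lot_size  # only 100% inspection gives the required protection


if __name__ == "__main__":
    conf, rel = 0.95, 0.95
    print(f"binomial n = {binomial_n(conf, rel)}")
    for lot_size in (50, 100, 500, 2990, 10000):
        n = hypergeometric_n(lot_size, conf, rel)
        print(f"N={lot_size:5d}  hypergeometric n = {n}")
```

Whether the gap between the two answers matters at a given lot size is a judgment call that depends on the cost of the extra samples.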

The hypergeometric is complicated to calculate, so the 'rule of thumb' was established years ago and is mostly accepted without the mathematical curves, since they're a bit of work to generate. Not that I'm against the propagation of knowledge; I just would rather re-derive other more impactful rules than this one...perhaps one of our other statisticians has this derivation handy for posting?

The formula used to modify the sample size (calculated the 'standard way' without regard to lot size) is generally taken to be a multiplier of N/(N+n), where N = lot size and n = sample size; that is, the adjusted sample size is n × N/(N+n).
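
A small sketch of that adjustment follows. It reads N/(N+n) as a factor applied to the large-lot sample size n, and rounds the result up; both the rounding and the function name are assumptions for illustration.

```python
import math


def adjust_for_lot_size(n: int, lot_size: int) -> int:
    """Scale a large-lot sample size n by N / (N + n) and round up."""
    return math.ceil(n * lot_size / (lot_size + n))


if __name__ == "__main__":
    # n = 299 is the 95% confidence / 99% reliability plan from this thread.
    for lot_size in (300, 2990, 30000, 10**9):
        print(f"N={lot_size:>10d}  adjusted n = {adjust_for_lot_size(299, lot_size)}")
```

As the lot size grows, the factor approaches 1 and the adjusted sample size returns to the unadjusted n.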