# Determining sample size for inspection to achieve x% confidence re defects

#### Stabby

I apologize in advance for a very basic question and am a little embarrassed that I cannot dig into my memory banks and figure this one out. Also, if this is the wrong place for such a question, please refer me to where you feel it would be more appropriate.

We have 6 shipping containers containing 585 items / boxes each of the same item for a total of 3510 items / boxes. We strongly suspect a high rate of defective products but assume the high defect rate is limited to 1 or 2 batches (therefore only present in 1 or 2 containers).

If this assumption is correct, how many items / boxes do we need to open and inspect for each shipping container to give us 98% confidence that that container has an acceptable amount of defects (2% defect rate as opposed to 30% defect rate)?

Thanks for any and all help or referrals.

#### Stijloor

Staff member
Super Moderator
A Quick Bump!

Can someone help?

Thank you very much!

#### Steve Prevette

##### Deming Disciple
Staff member
Super Moderator
A difficulty is your statement: We strongly suspect a high rate of defective products but assume the high defect rate is limited to 1 or 2 batches (therefore only present in 1 or 2 containers).

Random sampling and the associated sample size calculations assume that every item has an equal chance of being defective, and that one defective item does not affect the items around it.

However, if we ignore that worry for the moment, if you have a population size of 3515 and want a 98% probability that no more than 2% are defective, you would need to sample 190 with no defects found. This is calculated using the hypergeometric formula in Excel. By the way, if we simply assume a 'very large' population size and use the binomial, you get a sample size of 194.

=BINOMDIST(0,194,0.02,TRUE) gives you about 0.02, which is 1 - 0.98.
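
For the 'very large' population case the sample size also has a closed form: the chance of finding zero defects in n items is (1 - p)^n, so the smallest n with (1 - p)^n <= 1 - C is ceil(ln(1 - C) / ln(1 - p)). A minimal Python sketch, assuming a zero-defect acceptance plan; the function name is made up for illustration:

```python
import math

def binomial_zero_defect_sample_size(defect_rate, confidence):
    """Smallest n such that (1 - defect_rate)**n <= 1 - confidence."""
    alpha = 1.0 - confidence  # allowed chance of seeing no defects by luck
    return math.ceil(math.log(alpha) / math.log(1.0 - defect_rate))

n = binomial_zero_defect_sample_size(0.02, 0.98)
p_zero = (1 - 0.02) ** n  # same quantity as BINOMDIST(0, n, 0.02, TRUE)
print(n, round(p_zero, 4))  # 194 0.0199
```

With one item fewer (193) the zero-defect probability is still slightly above 0.02, which is why 194 is the first sample size that qualifies.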

#### UncleCrazyHorse

I second Steve's comments. If there is a portion of the population that is known (or suspected) to contain a significantly higher defect rate, random sampling may provide an x% Confidence of y% Conformance result, but it would be invalid.

With a known (or suspected) higher failure rate within one or two lots, it would be important to sample each lot independently based on quantity and calculate a Confidence/Conformance interval lot-by-lot.
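
As a rough illustration of sampling lot by lot, the zero-defect calculation can be run on a single container of 585 items instead of on the pooled population. This is only a sketch: it assumes each container is one homogeneous lot, that acceptance means finding no defects in the sample, and that the 2% limit and 98% confidence from the original question apply to each lot. The function names are hypothetical.

```python
def prob_zero_defects(lot_size, lot_defects, sample_size):
    """Chance that a random sample drawn without replacement contains no defects."""
    p = 1.0
    for i in range(sample_size):
        good_left = lot_size - lot_defects - i
        if good_left <= 0:
            return 0.0  # sample is larger than the number of good items
        p *= good_left / (lot_size - i)
    return p

def lot_sample_size(lot_size, defect_rate, confidence):
    """Smallest zero-defect sample size for one lot, found by stepping upward."""
    lot_defects = int(lot_size * defect_rate + 0.5)  # defect count at the limit, rounded
    for n in range(1, lot_size + 1):
        if prob_zero_defects(lot_size, lot_defects, n) <= 1 - confidence:
            return n
    return lot_size  # fall back to 100% inspection

# Six containers of 585 items each, as described in the question.
containers = {f"container {i}": 585 for i in range(1, 7)}
plan = {name: lot_sample_size(size, 0.02, 0.98) for name, size in containers.items()}
for name, n in plan.items():
    print(name, "inspect", n, "of", containers[name])
```

Each container is then judged on its own sample, so a bad batch confined to one container cannot be masked by clean samples drawn from the others.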

Please let us know what path you take and the outcome. You've piqued my interest!
Good luck.

#### Mark Meer

Trusted Information Resource
> if you have a population size of 3515 and want a 98% probability that no more than 2% are defective, you would need to sample 190 with no defects found. This is calculated using the hypergeometric formula in Excel.

Hi Steve,
Could you elaborate a bit on how this is done?
I know the Excel formula works as follows:

HYPGEOMDIST(Sample_s, Number_sample, Population_s, Number_pop), where
• Sample_s = the number of successes in the sample
• Number_sample = the size of the sample
• Population_s = the number of successes in the population
• Number_pop = the size of the population

...So how did you use this, when it is Number_sample that is the unknown? A solver of some sort?

#### Steve Prevette

##### Deming Disciple
Staff member
Super Moderator
> ...So how did you use this, when it is Number_sample that is the unknown? A solver of some sort?

Exactly. It is a trial and error process. I wrote my own macro / solver in the attached file.

Usual caveats on "free" software - to the best of my knowledge, this software works, but I would expect any recipient of it to do their own independent V&V that it works (such as testing against known statistical textbook problems).
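
The trial and error search can be sketched in Python as follows. This is not the attached macro, only an illustration under stated assumptions: `hypgeomdist` is a hypothetical helper that mirrors the argument order of Excel's HYPGEOMDIST (with "successes" meaning defects), and `find_sample_size` simply steps the sample size upward until the chance of seeing zero defects drops to 1 minus the confidence level.

```python
import math

def hypgeomdist(sample_s, number_sample, population_s, number_pop):
    """P(exactly sample_s defects in the sample); same argument order as Excel."""
    return (math.comb(population_s, sample_s)
            * math.comb(number_pop - population_s, number_sample - sample_s)
            / math.comb(number_pop, number_sample))

def find_sample_size(number_pop, defect_rate, confidence):
    """Trial and error: try n = 1, 2, 3, ... until zero defects is unlikely enough."""
    population_s = int(number_pop * defect_rate + 0.5)  # defects at the limit, rounded
    for number_sample in range(1, number_pop + 1):
        if hypgeomdist(0, number_sample, population_s, number_pop) <= 1 - confidence:
            return number_sample
    return number_pop  # nothing smaller works, so inspect everything

print(find_sample_size(3515, 0.02, 0.98))  # 190
```

A linear search is slow in principle but instant at these sizes; a bisection would also work because the zero-defect probability only decreases as the sample grows.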

#### Mark Meer

Trusted Information Resource
Oooo. Custom Excel macros. Excel just got a million times more complicated! Thanks Steve. I'll take a look....

Two initial questions:

1. In the hypergeometric sheet, your calculation for "C" (num successes) is:

INT((A2*C2)+0.5), where A2 is the defect probability; and C2 is the population size.

Why the "+0.5"?

2. Your use of the HYPGEOMDIST formula is:
=HYPGEOMDIST(0,\$D\$8,\$C\$8,\$C\$2)-(1-B2)

Why do you subtract "1-B2" (B2 = Confidence Level)?

#### Steve Prevette

##### Deming Disciple
Staff member
Super Moderator
I add 0.5 to make the INT function round to the nearest whole number. For example, INT of 6.7 is 6, but I really want it to round to 7. Adding 0.5 turns 6.7 into 7.2, and INT of 7.2 gives me 7.

I subtract the confidence level from 100%, because the statement that I want to be 95% confident that no more than 2% are defective, really means that the probability of detecting NO failures when there are 2% defects must be less than 5%.
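
Both answers can be sketched in a few lines of Python. This is not the attached macro: `round_half_up` and `target` are hypothetical names, and the binomial zero-defect probability stands in for HYPGEOMDIST to keep the example short. The assumption is that the solver looks for the first sample size at which the spreadsheet expression (zero-defect probability minus (1 - confidence)) is no longer positive.

```python
import math

def round_half_up(x):
    """INT(x + 0.5): floor after adding one half, so 6.7 becomes 7 rather than 6."""
    return math.floor(x + 0.5)

print(math.floor(6.7), round_half_up(6.7))  # 6 7

def target(p_zero, confidence):
    """Positive while zero defects is still too likely; zero or negative once it is not."""
    return p_zero - (1 - confidence)

# Binomial stand-in: chance of zero defects in n items at a 2% defect rate.
n = 1
while target((1 - 0.02) ** n, 0.98) > 0:
    n += 1
print(n)  # 194
```

Subtracting (1 - confidence) just moves the acceptance threshold to zero, which is a convenient form for a solver that hunts for a sign change.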