# Determining sample size for inspection to achieve x% confidence re defects

#### Stabby

I apologize in advance for a very basic question and am a little embarrassed that I cannot dig into my memory banks and figure this one out. Also, if this is the wrong place for such a question, please refer me to where you feel it would be more appropriate.

We have 6 shipping containers containing 585 items / boxes each of the same item for a total of 3510 items / boxes. We strongly suspect a high rate of defective products but assume the high defect rate is limited to 1 or 2 batches (therefore only present in 1 or 2 containers).

If this assumption is correct, how many items / boxes do we need to open and inspect for each shipping container to give us 98% confidence that that container has an acceptable amount of defects (2% defect rate as opposed to 30% defect rate)?

Thanks for any and all help or referrals.

#### Stijloor

Super Moderator
A Quick Bump!

Can someone help?

Thank you very much!

#### Steve Prevette

##### Deming Disciple
Super Moderator
A difficulty is your statement: "We strongly suspect a high rate of defective products but assume the high defect rate is limited to 1 or 2 batches (therefore only present in 1 or 2 containers)."

Random sampling and the associated sample size calculations assume that every item has an equal chance of being defective, and that one defective item does not affect the items around it.

However, if we ignore that worry for the moment, if you have a population size of 3515 and want a 98% probability that no more than 2% are defective, you would need to sample 190 with no defects found. This is calculated using the hypergeometric formula in Excel. By the way, if we simply assume a 'very large' population size and use the binomial, you get a sample size of 194.

=BINOMDIST(0,194,0.02,TRUE) gives you 0.02, which is 1 - 0.98.
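
The binomial figure can be checked without Excel. For a lot that really is 2% defective, the chance of seeing zero defects in n items is 0.98^n, and the sample size is the smallest n that pushes that chance down to 1 - 0.98 = 0.02 or less. A minimal Python sketch of that check (the function name is just illustrative, not from the thread):

```python
import math

def zero_defect_sample_size(defect_rate, confidence):
    """Smallest n with (1 - defect_rate)**n <= 1 - confidence.

    Assumes a 'very large' population (binomial model) and an
    accept-on-zero-defects plan.
    """
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - defect_rate))

n = zero_defect_sample_size(0.02, 0.98)

# Same quantity as BINOMDIST(0, n, 0.02, TRUE): probability of zero defects.
p_zero = (1.0 - 0.02) ** n

print(n, p_zero, 1.0 - p_zero)
```

This should print 194, with a zero-defect probability just under 0.02 and its complement just over 0.98.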


#### UncleCrazyHorse

I second Steve's comments. If there is a portion of the population that is known (or suspected) to contain a significantly higher defect rate, random sampling may provide an x% Confidence of y% Conformance result, but it would be invalid.

With a known (or suspected) higher failure rate within one or two lots, it would be important to sample each lot independently based on quantity and calculate a Confidence/Conformance interval lot-by-lot.
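
Applied to the numbers in this thread, lot-by-lot sampling would treat each 585-item container as its own population. A rough Python sketch, assuming the same accept-on-zero-defects hypergeometric criterion Steve describes (the acceptance rule and all names here are assumptions for illustration):

```python
from math import comb

CONTAINER_SIZE = 585   # items per container, from the original post
N_CONTAINERS = 6
DEFECT_RATE = 0.02     # defect rate to rule out
CONFIDENCE = 0.98

def p_zero(n, defectives, lot_size):
    """Chance that a sample of n drawn without replacement has no defectives."""
    return comb(lot_size - defectives, n) / comb(lot_size, n)

# Whole number of defectives that a 2% rate implies for one container.
defectives = int(DEFECT_RATE * CONTAINER_SIZE + 0.5)

# Smallest per-container sample that leaves at most a 2% chance of
# passing a container that actually holds that many defectives.
per_container = next(
    n for n in range(1, CONTAINER_SIZE + 1)
    if p_zero(n, defectives, CONTAINER_SIZE) <= 1.0 - CONFIDENCE
)

print(per_container, per_container * N_CONTAINERS)
```

The second number printed is the total inspection effort if all six containers are sampled this way, which is the price of not pooling lots that may differ.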

Please let us know what path you take and the outcome. You've piqued my interest!
Good luck.

#### Stabby

Thank you. I will!

#### Stabby

Thanks. Much appreciated. Hopefully this saves lots of time.

#### Mark Meer

Trusted Information Resource
> if you have a population size of 3515 and want a 98% probability that no more than 2% are defective, you would need to sample 190 with no defects found. This is calculated using the hypergeometric formula in Excel.

Hi Steve,
Could you elaborate a bit on how this is done?
I know the Excel formula works as follows:

HYPGEOMDIST(Sample_s, Number_sample, Population_s, Number_pop), where
• Sample_s = the number of successes in the sample
• Number_sample = the size of the sample
• Population_s = the number of successes in the population
• Number_pop = the size of the population

...So how did you use this, when it is Number_sample that is the unknown? A solver of some sort?

#### Steve Prevette

##### Deming Disciple
Super Moderator
> ...So how did you use this, when it is Number_sample that is the unknown? A solver of some sort?

Exactly. It is a trial and error process. I wrote my own macro / solver in the attached file.

Usual caveats on "free" software - to the best of my knowledge, this software works, but I would expect any recipient of it to do their own independent V&V that it works (such as testing against known statistical textbook problems).
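
For readers without the spreadsheet, the trial-and-error idea can be sketched in Python. This is not Steve's macro, only an assumed equivalent: step the sample size up one at a time until the hypergeometric probability of zero defects drops to 1 - confidence or below. Function names are illustrative.

```python
from math import comb

def prob_zero_defects(n, defectives, population):
    """Equivalent of HYPGEOMDIST(0, n, defectives, population)."""
    return comb(population - defectives, n) / comb(population, n)

def hypergeom_sample_size(population, defect_rate, confidence):
    """Smallest zero-defect sample size found by stepping n upward."""
    # Defect count must be a whole number; round to the nearest item.
    defectives = int(defect_rate * population + 0.5)
    for n in range(1, population + 1):
        if prob_zero_defects(n, defectives, population) <= 1.0 - confidence:
            return n
    return population  # fall back to 100% inspection

print(hypergeom_sample_size(3515, 0.02, 0.98))
```

With the thread's inputs (3515, 0.02, 0.98) this search should land on the 190 quoted earlier. As with the spreadsheet, check it against known textbook problems before relying on it.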

#### Attachments

• Sample_Size.xls

#### Mark Meer

Trusted Information Resource
Oooo. Custom Excel macros. Excel just got a million times more complicated! Thanks Steve. I'll take a look....

Two initial questions:

1. In the hypergeometric sheet, your calculation for "C" (num successes) is:

INT((A2*C2)+0.5), where A2 is the defect probability; and C2 is the population size.

Why the "+0.5"?

2. Your use of the HYPGEOMDIST formula is:
=HYPGEOMDIST(0,$D$8,$C$8,$C$2)-(1-B2)

Why do you subtract "1-B2" (B2 = Confidence Level)?
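
One plausible reading of both formulas, offered as an assumption rather than as Steve's explanation: INT(x+0.5) rounds the expected defect count to the nearest whole item, and subtracting (1-B2) turns the probability into a gap that changes sign at the sample size a solver is hunting for. A small Python sketch of that reading, using the thread's numbers and illustrative names:

```python
from math import comb

def round_half_up(x):
    """Same arithmetic as Excel's INT(x + 0.5) for non-negative x."""
    return int(x + 0.5)

def gap(n, defectives, population, confidence):
    """HYPGEOMDIST(0, n, defectives, population) - (1 - confidence)."""
    p_zero = comb(population - defectives, n) / comb(population, n)
    return p_zero - (1.0 - confidence)

population, defect_rate, confidence = 3515, 0.02, 0.98

# 0.02 * 3515 = 70.3 items, which is not a possible defect count.
defectives = round_half_up(defect_rate * population)
print("defectives:", defectives)

# The gap is positive while the sample is too small and turns
# non-positive once the target confidence is reached.
for n in (188, 189, 190, 191):
    g = gap(n, defectives, population, confidence)
    print(n, "too small" if g > 0 else "large enough")
```

Under this reading a solver only has to find where the gap crosses zero, instead of comparing against the confidence level separately at every step.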

#### Steve Prevette

##### Deming Disciple