# Help needed in choosing the method of calculating the minimum sample size

#### scooon

##### Registered
Hi everyone,
I need to calculate a minimum sample size with these parameters:
- provide 95% confidence
- meet the company's conditions on the maximum defect level (e.g. 11 ppm)
- the solution needs to be flexible (e.g. for a daily, monthly, or yearly population of 1,000, 10,000, or 10,000,000 units)
- values are binomial (defective or not defective)

1. The military standard MIL-STD-105D does not provide data for populations much larger than 500,000. For example, with a population of 10,000,000, in theory I would still only need to take 2,000 samples, and I must not find a single defect, because one defect disqualifies the whole population.
2. The "Sample size estimation" function in Minitab does not take the population size into account and gives low values for the necessary sample.
3. I found a calculator like this (cogentqc.com/tools-resources/statistical-calculator), but I don't know what calculations it performs.
4. I found this "Handbook" (itl.nist.gov/div898/handbook/prc/section2/prc242.htm), but I am not sure it is a good solution in my case, and I can't find/calculate the needed parameters; I only know what "h" means.

I need your help. Thanks! Have a nice day!

#### Randy

Super Moderator
Minimum sample size of what or for what? The internal MS audit?

#### scooon

##### Registered
> Minimum sample size of what or for what? The internal MS audit?
Minimum sample size: the number of finished product pieces that need to be inspected to guarantee that the manufacturing defect level for all production stays below e.g. 11 ppm.

#### Steve Prevette

##### Deming Disciple
Super Moderator
From a Mil Std perspective, the tables are NOT based on population but on "lot sizes", so you really would not have a 10,000,000 lot size in reality. The Mil Std is set up for continuous sampling; if I want to go by day, then what is the production "batch" for the day? There may also be more technical reasons for what counts as a "batch" (perhaps after a tooling change, for example, or the longest I want to wait to determine whether there is a problem).

Keep in mind that random go/no-go sampling follows the hypergeometric distribution for small lots, but once the "batch" gets over about 1,000 units, not much is gained statistically by increasing the sample size; the gains diminish rapidly. The binomial basically assumes an infinite population. For example, most polling to determine the opinions of the United States (330 million people) is based on perhaps 500 responses. I should note that these polls are usually around a 50% "defect" level; if you are instead sampling for a 0.1% defect level across 330 million units, you need a MUCH larger sample size, but it still approaches a certain asymptote.
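To illustrate the convergence Steve describes, here is a quick sketch comparing the zero-defect sample size under the hypergeometric (finite lot) and binomial (infinite lot) models. The function names, the c = 0 acceptance plan, and the 5% defect rate are illustrative assumptions, not anything from a standard:

```python
import math

def n_binomial(p_max, confidence=0.95):
    """Smallest n such that finding 0 defects in n units gives
    `confidence` that the true defect rate is below p_max (infinite lot)."""
    # Solve (1 - p_max)**n <= 1 - confidence for n.
    return math.ceil(math.log(1 - confidence) / math.log(1 - p_max))

def n_hypergeometric(lot_size, defectives, confidence=0.95):
    """Smallest n such that drawing n units with 0 defects gives
    `confidence` a lot of `lot_size` holds fewer than `defectives` bad units."""
    p0 = 1.0  # P(no defects in the sample so far)
    for n in range(1, lot_size + 1):
        # Chance that the n-th draw (without replacement) is also good.
        p0 *= (lot_size - defectives - n + 1) / (lot_size - n + 1)
        if p0 <= 1 - confidence:
            return n
    return lot_size

# 5% defect rate: infinite-lot answer vs finite lots of growing size.
print(n_binomial(0.05))  # the classic 59
for lot in (200, 1_000, 10_000):
    print(lot, n_hypergeometric(lot, round(0.05 * lot)))
```

As the lot grows, the hypergeometric answer climbs toward the binomial one, which is why beyond roughly 1,000 units the lot size stops mattering much.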

#### Steve Prevette

##### Deming Disciple
Super Moderator
I do have some Excel files for determining sample sizes from the binomial and hypergeometric, given you want to be X% sure that no more than Y% are defective. You can work that out in Excel, and there are solvers out there. One rule to keep in mind: for a LARGE population, if you sample 59 items and find no defects, you are 95% confident that no more than 5% are defective.
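The 59-sample rule falls out of the zero-defect condition (1 − p)^n ≤ 1 − C. Applying the same formula to the 11 ppm requirement from the opening post shows the scale of the problem. A minimal sketch, assuming a c = 0 (zero-acceptance) plan and an effectively infinite population:

```python
import math

def zero_defect_n(p_max, confidence=0.95):
    """Smallest sample size n such that finding 0 defects in n units
    gives `confidence` that the true defect rate is below p_max."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - p_max))

print(zero_defect_n(0.05))    # 59 -- the rule of thumb above
print(zero_defect_n(11e-6))   # roughly 272,000 defect-free samples for 11 ppm at 95%
```

This agrees with the well-known "rule of three" approximation n ≈ 3 / p_max (3 / 0.000011 ≈ 273,000).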

#### scooon

##### Registered
> I do have some Excel files for determining sample sizes from the binomial and hypergeometric, given you want to be X% sure that no more than Y% are defective. You can work that out in Excel, and there are solvers out there. One rule to keep in mind: for a LARGE population, if you sample 59 items and find no defects, you are 95% confident that no more than 5% are defective.
Thank you for the explanation; I understand this better now. If I understand correctly, none of the above-mentioned methods will suit me. In my case, samples are taken several times a day, but the quality requirement (ppm) is so low that the daily sample is not sufficient. I need to determine whether we are taking enough samples during the year and whether we could take fewer while maintaining the same quality level.

#### Steve Prevette

##### Deming Disciple
Super Moderator
> Thank you for the explanation; I understand this better now. If I understand correctly, none of the above-mentioned methods will suit me. In my case, samples are taken several times a day, but the quality requirement (ppm) is so low that the daily sample is not sufficient. I need to determine whether we are taking enough samples during the year and whether we could take fewer while maintaining the same quality level.

True, a daily sample may not trigger an alarm, but if you trend and consolidate the results over many days, you will be able to detect a problem. The sample size you need may never be reachable in a single day, even with 100% sampling. Maybe you need to go to 100% inspection if it is that critical.
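To put a rough number on what consolidating over days buys: using the ~272,000 defect-free samples the zero-defect binomial formula gives for 95% confidence at 11 ppm, and a purely hypothetical daily sampling rate, the time to accumulate that evidence is:

```python
import math

# ~272k zero-defect samples needed for 95% confidence at 11 ppm
n_required = math.ceil(math.log(0.05) / math.log(1 - 11e-6))
daily_samples = 1_000  # made-up rate, purely for illustration
days_to_confidence = math.ceil(n_required / daily_samples)
print(days_to_confidence)  # roughly 9 months of defect-free sampling at this rate
```

So at realistic daily rates, the 95%/11 ppm claim is only supportable as a rolling, multi-month consolidation, never from any one day's sample.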

#### scooon

##### Registered
> True, a daily sample may not trigger an alarm, but if you trend and consolidate the results over many days, you will be able to detect a problem. The sample size you need may never be reachable in a single day, even with 100% sampling. Maybe you need to go to 100% inspection if it is that critical.
My main problem is that we have a sampling plan, but no one knows what the procedure was based on. I am trying to reduce the size of the control sample. In this case, 100% inspection is not possible because the population is huge.

#### Steve Prevette

##### Deming Disciple
Super Moderator
The main question (at least for a go/no-go inspection plan) is how long you can go without detecting that there is a problem. You can look at the operating characteristic curve for the tests in terms of balancing the failure-to-detect and false-alarm rates. There was an effort in the MIL-STD-105 series (which has been superseded by an ISO document) to do that balancing based on the stated AQL and the seriousness of the defect being tested for. There are also two-stage sampling plans and other nuances available.
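The operating characteristic curve is straightforward to tabulate for a single-sampling attribute plan: the probability of accepting a lot is the binomial probability of seeing at most c defects in a sample of n. The n = 59, c = 0 plan below is just an illustration tied to the rule of thumb earlier in the thread:

```python
from math import comb

def prob_accept(n, c, p):
    """Operating characteristic: P(accept lot) = P(at most c defects
    in a sample of n) when the true defect fraction is p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

# OC curve for an n = 59, c = 0 zero-acceptance plan.
for p in (0.0001, 0.001, 0.01, 0.05, 0.10):
    print(f"p = {p:6.4f}  ->  P(accept) = {prob_accept(59, 0, p):.3f}")
```

Reading the curve shows the trade-off Steve describes: a good lot (p well below the AQL) is accepted almost always, while acceptance probability falls off as p rises, and choosing n and c moves that fall-off point.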

You might want to consider automation in the testing routine. The ability to automatically sort fruit is amazing.

One thing that CAN significantly cut down on sample size is to shift from go/no-go to an actual measurement, such as dimension, weight, or resistance. With that you can do SPC on the data and detect a shift much faster, before it hits the rejection criteria. But the reality of life (and statistics) is that if you are testing for a very low failure rate, it will take a LOT of samples. If it is continuous sampling, you can stretch out the sampling, but that means a longer time to detect a problem.

#### scooon

##### Registered
> The main question (at least for a go/no-go inspection plan) is how long you can go without detecting that there is a problem. You can look at the operating characteristic curve for the tests in terms of balancing the failure-to-detect and false-alarm rates. There was an effort in the MIL-STD-105 series (which has been superseded by an ISO document) to do that balancing based on the stated AQL and the seriousness of the defect being tested for. There are also two-stage sampling plans and other nuances available.
>
> You might want to consider automation in the testing routine. The ability to automatically sort fruit is amazing.
>
> One thing that CAN significantly cut down on sample size is to shift from go/no-go to an actual measurement, such as dimension, weight, or resistance. With that you can do SPC on the data and detect a shift much faster, before it hits the rejection criteria. But the reality of life (and statistics) is that if you are testing for a very low failure rate, it will take a LOT of samples. If it is continuous sampling, you can stretch out the sampling, but that means a longer time to detect a problem.
For an AQL criterion like 11-17 ppm, it can be longer, for example 1 month or 1 year.