My question is about sample size calculation for stratified random sampling.

In a process I am working with, we get 3 work types: simple, medium and complex. By volume, simple is 90%, medium 6% and complex 4%.

Our client selects a 5% sample for inspection and gives us our accuracy scores based on it. However, the mix of work types in their sample does not match the population. Over the last 3 months it has been simple - 20%, medium - 45%, complex - 35%.


1) Is it right to use a sample size calculator for each of the work types separately, i.e. 5% for each work type?

2) How can sample size be calculated for stratified data like this?

3) For a simple sample size calculator (discrete or continuous), how does one decide on the margin of error? For instance, if my accuracy target is 98% for the process, does it mean my margin of error in the calculation would be +/-2%?


First, have you asked the customer for their sampling plan? It could be that they are indeed doing a stratified sample, but since the volume of complex items is low, and perhaps they view them as higher risk, they are purposely pulling a sample that contains a higher proportion of complex and medium items than the general population. They may also believe that if you are doing well on the complex items, you can probably handle the simple ones.
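
To see how strongly the sample leans toward the harder work, the percentages quoted in the question can be turned into a sampling rate per work type. This is only a rough sketch: it assumes the 5% applies to total volume and that the quoted shares are exact, and the function name is made up for illustration.

```python
# Figures quoted in the question (shares of volume, not counts).
population_share = {"simple": 0.90, "medium": 0.06, "complex": 0.04}
sample_share = {"simple": 0.20, "medium": 0.45, "complex": 0.35}
overall_sampling_rate = 0.05  # assumption: 5% of total volume is inspected


def per_stratum_sampling_rate(pop_share, samp_share, overall_rate):
    """Fraction of each work type that ends up in the inspected sample."""
    return {
        work_type: overall_rate * samp_share[work_type] / pop_share[work_type]
        for work_type in pop_share
    }


rates = per_stratum_sampling_rate(
    population_share, sample_share, overall_sampling_rate
)
for work_type, rate in rates.items():
    print(f"{work_type}: {rate:.2%} of that work type is inspected")
```

Under those assumptions, roughly 1.1% of simple items are inspected, against 37.5% of medium and about 43.8% of complex items, which is consistent with a deliberate focus on the harder work.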

ASSUMING that a failure on a simple, medium, or complex item has equal severity of impact to the customer, a pure random sample of everything would make more sense. A pure random sample of the whole production would reflect the proportions of the population in the long run.
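
If the customer keeps sampling by work type, one way to get a score comparable to what a pure random sample would give is to weight each work type's accuracy by its share of the population instead of its share of the sample. A minimal sketch follows. The per-type accuracy figures are invented purely for illustration and are not from the process described above.

```python
# Volume shares quoted in the question.
population_share = {"simple": 0.90, "medium": 0.06, "complex": 0.04}
sample_share = {"simple": 0.20, "medium": 0.45, "complex": 0.35}

# HYPOTHETICAL accuracy observed within each work type (made-up numbers).
stratum_accuracy = {"simple": 0.99, "medium": 0.96, "complex": 0.92}


def weighted_accuracy(accuracy, weights):
    """Weighted average of per-type accuracy; weights should sum to 1."""
    return sum(accuracy[k] * weights[k] for k in accuracy)


# Score as reported when the sample mix is used as-is.
raw_score = weighted_accuracy(stratum_accuracy, sample_share)

# Score re-weighted to the real volume mix of the process.
population_score = weighted_accuracy(stratum_accuracy, population_share)

print(f"score with sample mix:     {raw_score:.2%}")
print(f"score with population mix: {population_score:.2%}")
```

With these made-up figures the sample mix gives about 95.2%, while the population mix gives about 98.5%. The gap comes entirely from the over-representation of medium and complex items in the sample.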

"Confidence" and "Failure Rate" are not necessarily connected. I may want a 95% confidence that no more than 2% of the production fails. Generally sampling plans are a balance of risk of failing to detect a problem exists, giving a false alarm that a problem exists, and the cost of doing the sampling.
