How to deal with nonrandom samples?


KenK - 2009

What do you want to do with them?

Most common statistical methods assume that data have been acquired randomly from their population. If they are not, a menagerie of possible problems can emerge.


I have a situation where the samples were selected by rank-ordering the hardware on a characteristic perceived to be the influencing variable and testing only the n worst samples, based on 105. The assumption was that at a 1% AQL, 0 failures on the test would mean acceptance of the lot. Of course, the test failed. Now the choice is to accept the lot at a higher AQL (<4%) or do additional testing. How many more should we test for zero failures so that the lot can be certified 99% good with some (95%) confidence? Because of the bias, the tests show no correlation between the test results and the assumed influencing variable, indicating the presence of other factors. The idea is to avoid 100% inspection, which may be the way to go given the high degree of confidence required. Any ideas?
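For the specific numbers in the question (99% good, 95% confidence, zero failures), the required sample size follows from the standard c=0 binomial relation (1 - p)^n <= 1 - C. A minimal sketch in Python; note this math assumes the samples are drawn randomly, which is exactly the assumption the rank-ordered selection violates:

```python
import math

def c0_sample_size(p_max, confidence):
    """Smallest n such that 0 failures in n randomly drawn samples
    demonstrates a defect fraction <= p_max at the given confidence,
    using the binomial model (1 - p_max)**n <= 1 - confidence."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - p_max))

n = c0_sample_size(0.01, 0.95)
print(n)  # 299: zero failures in 299 random samples -> 99% good at 95% confidence
```

So about 299 random, zero-failure samples would be needed for the stated 1%/95% claim; the n already tested cannot simply be counted toward that total because of the selection bias.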

Rick Goodson


I may not understand the situation, but it appears that you have decided on a test to determine whether a lot of material is acceptable for shipment. The lot has failed, so you now want to change the parameters of the test so that the lot will pass. If that is the case, it is relatively simple to determine what the sample size and acceptance number should be to pass the lot. IMHO, however, you should decide whether or not this is really what you want to do.

If you want a sampling plan that provides 95% confidence that the lot of material is no worse than 1% 'defective', you can use a standard sampling scheme or design a custom sampling scheme to provide that protection. Either will require that you test a number of samples randomly selected from the lot. If you want to estimate the number of defectives in a lot of material at some confidence level, you can also do that. Similarly, it will require a number of random samples selected from the lot. Any good text on statistics/sampling can help you with either of the options I mentioned.
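Designing a custom scheme like this usually comes down to checking the operating characteristic (OC) curve: the probability that a lot with a given true defect fraction passes an n-sample, accept-on-at-most-c-failures plan. A sketch using the binomial model (helper name is mine, not from any standard library):

```python
from math import comb

def accept_prob(n, c, p):
    """Probability that a lot with true defect fraction p is accepted
    by a plan that tests n random samples and allows up to c failures
    (binomial model, i.e. lot much larger than the sample)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))

# A 1%-bad lot should rarely pass a 299-sample, zero-failure plan:
print(accept_prob(299, 0, 0.01))  # ≈ 0.0495, just under the 5% consumer's risk
```

Sweeping p over a range of values traces the full OC curve, which is how you compare a proposed custom plan against a standard one.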


Yes, your observations are correct to the extent that the acceptance parameters can be changed to pass a test. I can go from an AQL of 1% to 4% under 105 and pass the lot. There are several issues with such an approach, not least of which is the cloud it casts on already-passed lots. Further analysis showed that the lack of correlation and covariance is due to the early samples undergoing additional influencing variables. When these were removed from the analysis, the rest tracked what we perceived to be a better correlation. I am more interested in learning how statisticians would handle non-random situations. The approach I intend to use is to increase the sample size, drop the early non-conforming items, and redo the math. Further testing will show whether our influencing variable is a dependable measure. I did not see any mathematical techniques in the books I looked at or on the web; I have been researching this for the past three days. Thanks for your input.
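"Redoing the math" after dropping the early items amounts to asking what confidence the remaining zero-failure sample actually demonstrates. A sketch of that inverse calculation, again under the binomial model; the caveat from earlier in the thread still applies, since it presumes the retained samples behave like a random draw from the lot:

```python
def zero_failure_confidence(n, p_max):
    """Confidence demonstrated when n randomly drawn samples all pass,
    tested against a hypothesized defect fraction of p_max
    (binomial model: confidence = 1 - (1 - p_max)**n)."""
    return 1 - (1 - p_max) ** n

# e.g. 299 clean samples against a 1% defect rate:
print(zero_failure_confidence(299, 0.01))  # just over 0.95
```

Running this for the post-exclusion sample size shows directly how far short of (or past) the 95% target the remaining data fall.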