doops
7th January 2009, 06:55 PM
Please tell me how I've misused the binomial equation... or congratulate me for being a bright guy. Just kidding on the congrats for being a bright guy.
I have designed a new assembly. I want to determine the sample size for a test that will uncover all unknown failure modes with a reliability of less than 95%. For example, I may have several as-yet-unknown failure modes that I would like to discover, such as a leak failure mode with a 90% failure rate, an overheat failure mode with a 70% failure rate, and a rust failure mode with an 11% failure rate.
I use the binomial equation by plugging in p=.05 (potential for a failure or 1-.95) n = 45 (sample size), and I get a cumulative probability of .90.
So I am making a statement that if I run a test with 45 samples, I have a 90% chance that every failure mode with a reliability of less than 95% occurs during this test. Is this a correct statement? Have I misused the binomial distribution?
Thanks!
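The 45-sample figure can be checked by looking at one failure mode in isolation. This is a minimal sketch, assuming every unit fails independently with the same probability p for that mode; the function names are illustrative and do not come from the thread.

```python
import math

def prob_seen_at_least_once(p: float, n: int) -> float:
    """Chance that a failure mode with per-unit failure probability p
    shows up at least once among n independent units."""
    return 1.0 - (1.0 - p) ** n

def min_sample_size(p: float, confidence: float) -> int:
    """Smallest n for which 1 - (1 - p)**n >= confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))

# Values from the post: p = 1 - 0.95 = 0.05, n = 45, target 90%.
print(round(prob_seen_at_least_once(0.05, 45), 4))
print(min_sample_size(0.05, 0.90))
```

With p = 0.05 this gives a probability of about 0.90 and a minimum sample size of 45, matching the numbers in the post. Note that this is a statement about one mode at a time. The chance that several modes all appear in the same 45 units is a different and smaller number.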
reynald
8th January 2009, 12:30 AM
I use the binomial equation by plugging in p=.05 (potential for a failure or 1-.95) n = 45 (sample size), and I get a cumulative probability of .90.
So I am making a statement that if I run a test with 45 samples, I have a 90% chance that every failure mode with a reliability of less than 95% occurs during this test. Is this a correct statement?
I believe something is missing from your statement. It should read something like this:
I use the binomial equation by plugging in p=.05 (potential for a failure or 1-.95) n = 45 (sample size), and I get a cumulative probability of .90 for having x or fewer occurrences of failure modes
and
So I am making a statement that if I run a test with 45 samples, I have a 90% chance of having <= x occurrences of failure modes with a reliability of less than 95% during this test
since you are talking about cumulative probability
...or maybe my understanding is incorrect
doops
8th January 2009, 09:41 AM
I think I've accounted for all the "less than x" issues by summing the individual probabilities.
I guess my question is have I statistically identified all possible failures... or do I have confidence that at least one failure mode will occur during the test? The same failure mode may occur several times during the test, and I may miss all the other potential failures.
Thanks!
p n x P(x)
0.05 45 1 0.235516398
0.05 45 2 0.272703198
0.05 45 3 0.205723465
0.05 45 4 0.113689283
0.05 45 5 0.049065901
0.05 45 6 0.017216106
0.05 45 7 0.005048332
0.05 45 8 0.001262083
0.05 45 9 0.000273082
0.05 45 10 5.17419E-05
0.05 45 11 8.66491E-06
0.05 45 12 1.29214E-06
0.05 45 13 1.72634E-07
0.05 45 14 2.07679E-08
0.05 45 15 2.25897E-09
0.05 45 16 2.22925E-10
0.05 45 17 2.00149E-11
0.05 45 18 1.63865E-12
0.05 45 19 1.22558E-13
0.05 45 20 8.38555E-15
0.05 45 21 5.25411E-16
0.05 45 22 3.01671E-17
0.05 45 23 1.58774E-18
0.05 45 24 7.66016E-20
0.05 45 25 3.3866E-21
0.05 45 26 1.37109E-22
0.05 45 27 5.07812E-24
0.05 45 28 1.71816E-25
0.05 45 29 5.30104E-27
0.05 45 30 1.48801E-28
0.05 45 31 3.7895E-30
0.05 45 32 8.72583E-32
0.05 45 33 1.80918E-33
0.05 45 34 3.36071E-35
0.05 45 35 5.55907E-37
0.05 45 36 8.1273E-39
0.05 45 37 1.04048E-40
0.05 45 38 1.15289E-42
0.05 45 39 1.0891E-44
0.05 45 40 8.59812E-47
0.05 45 41 5.51869E-49
0.05 45 42 2.76626E-51
0.05 45 43 1.01576E-53
0.05 45 44 2.43006E-56
0.05 45 45 2.84217E-59
sum of probs 0.90
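The table can be reproduced with a few lines of standard-library Python. This is only a sketch; `binom_pmf` is an illustrative helper name. It shows that summing P(x) for x = 1 to 45 is the same as computing 1 - P(0), that is, the probability of at least one failure among the 45 units.

```python
from math import comb

def binom_pmf(x: int, n: int, p: float) -> float:
    """Binomial probability of exactly x failures in n units."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 45, 0.05
p_zero = binom_pmf(0, n, p)
p_one_or_more = sum(binom_pmf(x, n, p) for x in range(1, n + 1))

print(f"P(0)          = {p_zero:.4f}")
print(f"sum P(1..45)  = {p_one_or_more:.4f}")
print(f"1 - P(0)      = {1 - p_zero:.4f}")
```

The last two printed values agree, so the 0.90 figure is the chance of seeing one or more failures of a 5% mode. It does not say which modes fail or how many different modes are observed.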
reynald
8th January 2009, 11:30 PM
OK, now I think I understand what you are thinking.
2 things to point out.
1. You forgot to account for x=0
2. The sum of all P(x) should be 100%
I have attached an Excel file that I hope helps.
doops
9th January 2009, 12:15 PM
reynald,
Thank you for the excel file! I think we're on the same page. I didn't sum x=0, because I only wanted probs where there was a failure. If I were smarter, I would've only calculated the prob for x=0 and subtracted from 1.
Please allow me to rephrase my original question. Let's say I have a really large bag of marbles of different colors. I don't know how many colors there are, nor the probability distribution of the marble colors in the bag. How many marbles do I have to sample to have 90% confidence that the colors I have seen account for 90% of the marbles by count? For example, red + blue marbles may make up 90% of the marbles by count, or it may take red + blue + green + yellow marbles to make up 90% of the bag. I want to discover which colors make up 90% of the bag by sampling a small set of marbles.
I think I am misusing the binomial equation to get this type of answer. Would you suggest a methodology to answer my question? Am I asking an impossible question?
Thanks again for taking time to answer my question! I really appreciate it!
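The marble question can be explored numerically before looking for a formula. The sketch below is a Monte Carlo simulation, not a closed-form answer, and it assumes a made-up color distribution (`shares`), since the real one is unknown. `coverage_probability` and its parameters are hypothetical names used only for illustration.

```python
import random

def coverage_probability(color_shares, n_draws, target=0.90,
                         trials=2000, seed=0):
    """Estimate the chance that the colors seen in n_draws marbles
    together account for at least `target` of the bag.

    color_shares: assumed true share of each color (should sum to 1).
    """
    rng = random.Random(seed)
    colors = range(len(color_shares))
    hits = 0
    for _ in range(trials):
        # Draw with replacement, which approximates a very large bag.
        seen = set(rng.choices(colors, weights=color_shares, k=n_draws))
        # Small tolerance guards against floating-point rounding.
        if sum(color_shares[c] for c in seen) >= target - 1e-12:
            hits += 1
    return hits / trials

# Hypothetical bag: the three most common colors make up 90% of it.
shares = [0.50, 0.25, 0.15, 0.05, 0.03, 0.02]
for n in (5, 10, 20, 45):
    print(n, coverage_probability(shares, n))
```

With these assumed shares, reaching 90% coverage requires seeing all three of the most common colors, so the result is driven mostly by the 15% color. A different assumed distribution gives a different curve, which is why the answer depends on what is assumed about the bag.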
mhess
11th January 2009, 01:32 AM
I believe the response on how to calculate the sample size for your problem is given online, in the NIST statistical handbook (sorry, forum is not accepting my posting of the link):
(NIST handbook section copied below):
7.2.4.3. Sample sizes required
Derivation of formula for required sample size when testing binomial proportions
Although the sampling distribution for proportions actually follows a binomial distribution, the normal approximation is used for this derivation.
If we are interested in detecting a change in the proportion defective of size delta in either direction, the corresponding confidence interval for p can be written
phat - delta <= p <= phat + delta
For a 100(1-alpha)% confidence interval based on the normal distribution, where z(alpha/2) is the upper critical value of the normal distribution that is exceeded with probability alpha/2,
delta = SQRT(p*(1-p)/N)*z(alpha/2)
Thus, the minimum sample size is
1. For a two-sided interval
N >= [p(1-p)/delta**2]*z(alpha/2)**2
2. For a one-sided interval
N >= [p(1-p)/delta**2]*z(alpha)**2
This requirement on the sample size only guarantees that a change of size delta is detected with 50% probability.
The sample size when we are interested in protecting against a change delta with probability 1 - beta (where beta is small) is
1. For a two-sided interval
N >= (z(alpha/2) + z(beta))**2*[p(1-p)/delta**2]
2. For a one-sided interval
N >= (z(alpha) + z(beta))**2*[p(1-p)/delta**2]
where z(beta) is the upper critical value from the normal distribution that is exceeded with probability beta.
The equations above require that p be known. Usually, this is not the case. If we are interested in detecting a change relative to an historical or hypothesized value, this value is taken as the value of p for this purpose. Note that taking the value of the proportion defective to be 0.5 leads to the largest possible sample size.
Example of calculating sample size for testing proportion defective
Suppose that a department manager needs to be able to detect any change above 0.10 in the current proportion defective of his product line, which is running at approximately 10% defective. He is interested in a one-sided test and does not want to stop the line except when the process has clearly degraded and, therefore, he chooses a significance level for the test of 5%. Suppose, also, that he is willing to take a risk of 10% of failing to detect a change of this magnitude. With these criteria:
1. z(.05) = 1.645; z(.10) = 1.282
2. delta = 0.10
3. p = 0.10
and the minimum sample size for a one-sided test procedure is
N >= [p(1-p)/delta**2]*[z(.05)+z(.10)]**2 = 0.10*0.90*2.927**2/0.10**2 is approximately 77
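The worked example can be checked with Python's standard library. This is a sketch under the normal approximation, following the form of the one-sided calculation in the example above; `upper_z` and `sample_size` are illustrative names, not functions from the handbook.

```python
import math
from statistics import NormalDist

def upper_z(tail_prob: float) -> float:
    """Upper critical value of the standard normal distribution,
    i.e. the value exceeded with probability tail_prob."""
    return NormalDist().inv_cdf(1.0 - tail_prob)

def sample_size(p, delta, alpha, beta=None, two_sided=False):
    """Unrounded minimum N for detecting a shift of size delta in a
    proportion p. With beta=None the z(beta) term is dropped, which
    corresponds to detecting the shift with only 50% probability."""
    z_a = upper_z(alpha / 2 if two_sided else alpha)
    z_b = upper_z(beta) if beta is not None else 0.0
    return (z_a + z_b) ** 2 * p * (1 - p) / delta ** 2

# Handbook example: p = 0.10, delta = 0.10, alpha = 0.05, beta = 0.10.
n_raw = sample_size(p=0.10, delta=0.10, alpha=0.05, beta=0.10)
print(round(n_raw, 1), math.ceil(n_raw))
```

The unrounded value is about 77.1, in line with the "approximately 77" quoted above. Rounding up to the next whole unit would give 78.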
doops
14th January 2009, 05:20 PM
Thank you for your reply! I have spent the last several days reading the NIST site!