debun
I experimented with using only the first 5 data points from a sample set of 26 to do a Weibull analysis. The objective was to see how well these first 5 points predict the distribution of the entire set. The first 5 points fit a 2-parameter Weibull and gave an R^2 value of 0.99. The second data set (the remaining 21 points) fits a 3-parameter Weibull instead, with an R^2 value somewhere in the neighborhood of 0.976. So my questions are:
1) How does one determine the minimum sample size to use for a Weibull analysis? There are equations for calculating the sample size for a mean, e.g. n = Z^2 * sigma^2 / sampling_error^2. This seems to work well when the standard deviation is small, but with failure data the standard deviation is typically large and the data is on occasion not normally distributed. The result is a huge sample size with this method. Should I use a proportions method instead? The sample size is still large.
2) I would like to plot the remaining 21 data points over the CDF calculated from my first 5 points. The median ranking has a huge effect on this, so what is the correct way to do it? I have used 2 methods: the first uses the beta cumulative distribution function (the first step in calculating the MR for any data set), and the second uses the regression values for the set of 21 data points (used to calculate the MR). The beta method puts the data closest to the 5-point CDF, but I intuitively think the regression method makes the most sense. Is there a way to do this?
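For reference, the sample-size formula from question 1 can be written as a short Python sketch. The function name `sample_size_for_mean` and the example values for sigma and the sampling error are made up for illustration and are not taken from the data set; the formula itself assumes the estimate of the mean is approximately normally distributed.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, sampling_error, confidence=0.95):
    """Sample size from n = Z^2 * sigma^2 / sampling_error^2, rounded up.

    sigma          -- assumed standard deviation of the population
    sampling_error -- acceptable half-width of the confidence interval
    confidence     -- two-sided confidence level (0.95 gives Z of about 1.96)
    """
    # Two-sided Z value for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil((z * sigma / sampling_error) ** 2)


# Hypothetical illustration values: a large sigma relative to the allowed
# error drives n up quickly, which is the effect described in question 1.
n_loose = sample_size_for_mean(sigma=50_000, sampling_error=10_000)
n_tight = sample_size_for_mean(sigma=50_000, sampling_error=5_000)
print(n_loose, n_tight)
```

Halving the allowed sampling error roughly quadruples n, since the error enters the formula squared.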
Here is my data set.
154173
171158
83431
201778
117578
192083
136262
149487
148009
98317
69798
94195
62548
103574
108364
132377
143047
85272
95760
214166
289237
161265
172490
99972
117440
89717
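To make question 2 concrete, here is a minimal Python sketch of one way the overlay could be set up. It makes several assumptions that are not confirmed above: the "first 5 points" are the first five values in the order listed, median ranks are computed with Benard's approximation (i - 0.3) / (n + 0.4) rather than the exact beta-based calculation, the 2-parameter fit is a least-squares regression of ln(-ln(1 - F)) on ln(t), and the remaining 21 points are ranked on their own. The helper names are hypothetical, and this is not meant to settle which ranking method is correct.

```python
import math


def median_ranks(n):
    """Benard's approximation to the median rank of the i-th of n ordered points."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]


def fit_weibull_2p(times):
    """Median rank regression of y = ln(-ln(1 - F)) on x = ln(t).

    Returns (beta, eta, r_squared) for a 2-parameter Weibull.
    """
    t = sorted(times)
    n = len(t)
    x = [math.log(v) for v in t]
    y = [math.log(-math.log(1.0 - f)) for f in median_ranks(n)]
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    beta = sxy / sxx                    # slope is the shape parameter
    intercept = my - beta * mx          # intercept equals -beta * ln(eta)
    eta = math.exp(-intercept / beta)   # scale parameter
    r_squared = sxy ** 2 / (sxx * syy)
    return beta, eta, r_squared


def weibull_cdf(t, beta, eta):
    """2-parameter Weibull CDF."""
    return 1.0 - math.exp(-((t / eta) ** beta))


data = [154173, 171158, 83431, 201778, 117578, 192083, 136262, 149487,
        148009, 98317, 69798, 94195, 62548, 103574, 108364, 132377,
        143047, 85272, 95760, 214166, 289237, 161265, 172490, 99972,
        117440, 89717]

first5, rest = data[:5], data[5:]       # assumes listing order defines "first 5"
beta, eta, r2 = fit_weibull_2p(first5)
print(f"5-point fit: beta={beta:.3f}, eta={eta:.0f}, R^2={r2:.3f}")

# Rank the remaining 21 points on their own and compare each median rank
# with the CDF predicted by the 5-point fit.
rest_sorted = sorted(rest)
for t, f in zip(rest_sorted, median_ranks(len(rest_sorted))):
    print(f"{t:>7d}  median rank={f:.3f}  5-point CDF={weibull_cdf(t, beta, eta):.3f}")
```

Swapping `median_ranks` for an exact beta-based calculation would change only the plotting positions of the 21 points, not the 5-point CDF they are compared against.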