# What is the minimum Sample Size for Weibull Analysis

#### debun

I experimented with using only the first 5 data points from a sample set of 26 to do a Weibull analysis. The objective was to see how well these first 5 points predicted the distribution of the entire set. The first 5 points fit a 2-parameter Weibull and gave an R^2 value of 0.99. The second data set (the remaining 21 points) is better fit by a 3-parameter Weibull, with an R^2 value somewhere in the neighborhood of 0.976. So my questions are:

1) How does one determine the minimum sample size to use for a Weibull? There are equations for calculating the sample size for a mean, e.g. n = Z^2 * sigma^2 / sampling_error^2. This seems to work well when your standard deviation is small, but with failure data the standard deviation is typically large and the data are on occasion not normally distributed. That leads to huge sample sizes with this method. Should I use a proportions method? The sample size is still large.

2) I would like to plot the remaining 21 data points over the CDF calculated from my first 5 points. The median ranking has a huge effect on this, so what is the correct way to do it? I have used 2 methods: the first uses the beta cumulative probability density function (the first step in calculating the MR for any data set), and the second uses the regression values for the set of 21 data points (used to calculate the MR). The beta method puts the data closest to the 5-point CDF, but I intuitively think the regression method makes the most sense. Is there a correct way to do this?

Here is my data set.
154173
171158
83431
201778
117578

192083
136262
149487
148009
98317
69798
94195
62548
103574
108364
132377
143047
85272
95760
214166
289237
161265
172490
99972
117440
89717
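
As a rough check on question 1, the sample-size formula for a mean can be evaluated directly on the 26 points above. This is only a sketch: the margins of error (5%, 10% and 20% of the mean) are illustrative assumptions, not values from the post, and the formula assumes the sample mean is approximately normal, which may not hold well for skewed failure data.

```python
import math
import statistics

# All 26 cycles-to-failure values from the post, in posted order.
cycles = [154173, 171158, 83431, 201778, 117578,
          192083, 136262, 149487, 148009, 98317, 69798, 94195, 62548,
          103574, 108364, 132377, 143047, 85272, 95760, 214166, 289237,
          161265, 172490, 99972, 117440, 89717]

def sample_size_for_mean(sigma, margin, z=1.96):
    """n = Z^2 * sigma^2 / E^2, rounded up to a whole unit."""
    return math.ceil((z * sigma / margin) ** 2)

mean = statistics.mean(cycles)
sigma = statistics.stdev(cycles)  # sample standard deviation

# Margins expressed as a fraction of the mean (assumed for illustration).
for pct in (0.20, 0.10, 0.05):
    n = sample_size_for_mean(sigma, pct * mean)
    print(f"margin +/-{pct:.0%} of mean -> n = {n}")
```

Because n grows with the square of sigma / margin, halving the margin roughly quadruples the required sample, which is why this approach balloons for wide failure distributions.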

#### Steve Prevette

##### Deming Disciple
Staff member
Super Moderator
There is a reasonably good writeup on Weibull on wikipedia at http://en.wikipedia.org/wiki/Weibull_distribution

We've previously had discussions on the Cove about fitting the Weibull distribution to data. Unfortunately the parameters need to be solved for in an optimization, which yes, can be done in Excel, but is not a straightforward formula.

Just as if I had two points, plotted them, and fit a line, I'd get a 100% correlated fit, the same happens with the Weibull with only five points.

I think you should move ahead with your simulation and get a feel for yourself how Weibull behaves. My personal judgement call would be I'd like at least a dozen points to get a fairly good fit on a Weibull.
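
One way to run such a simulation is sketched below. It draws repeated samples from a known Weibull, fits each by median rank regression, and reports how widely the estimated shape varies with sample size. The "true" shape and scale, the number of trials, and the use of Benard's median rank approximation are all assumptions made for illustration.

```python
import math
import random
import statistics

def fit_mrr(times):
    """Return (shape, scale, R^2) from median rank regression (Benard ranks)."""
    t = sorted(times)
    n = len(t)
    xs = [math.log(v) for v in t]
    ys = [math.log(-math.log(1 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    shape = sxy / sxx                      # slope of y on x
    scale = math.exp(mx - my / shape)      # from intercept = -shape * ln(scale)
    return shape, scale, sxy ** 2 / (sxx * syy)

def simulate(n, true_shape=2.5, true_scale=150_000, trials=2000, seed=1):
    """5th and 95th percentile of the fitted shape, plus the median R^2."""
    rng = random.Random(seed)
    fits = [fit_mrr([rng.weibullvariate(true_scale, true_shape) for _ in range(n)])
            for _ in range(trials)]
    shapes = sorted(f[0] for f in fits)
    lo, hi = shapes[int(0.05 * trials)], shapes[int(0.95 * trials) - 1]
    return lo, hi, statistics.median(f[2] for f in fits)

for n in (5, 12, 26, 50):
    lo, hi, r2 = simulate(n)
    print(f"n={n:>2}: 90% of shape estimates in [{lo:.2f}, {hi:.2f}], median R^2={r2:.3f}")
```

Comparing the interval widths across sample sizes gives a feel for how much a 5-point fit can wander even when its R^2 looks good.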

#### Miner

##### Forum Moderator
Staff member
What estimation method are you using? Least Squares or Maximum Likelihood? LSR is better for small sample sizes. MLE should have 100 or more failures, but no less than 50.

Regarding the 3-parameter Weibull, this should not be used unless there is a physical reason why there should be a threshold parameter. That is, if you have a threshold parameter of 3 months, is there a physical reason why you CANNOT have a failure in less than 3 months? If there is no physical reason, use of the 3-parameter Weibull is risky.

#### Statistical Steven

##### Statistician
Staff member
Super Moderator
Thank you! It can never be said enough that just blindly applying a distribution to data without a physical reason is very risky!

#### debun

I used least squares to do my regression. As I stated, I chose the 3-parameter Weibull based on my R^2 value. Are you implying that the decision should be based on some other factor or factors? This is also why the first 5 points were fit with 2 parameters and the remaining 21 with 3. I am only looking at the fit of the data.

#### Miner

##### Forum Moderator
Staff member
More than implying... Distribution fitting is not easy. There is a lot of engineering judgment that goes with it. I use the following approach (in Minitab):
1. Use the probability plots to rule out obviously poor fits (i.e., curves and doglegs that may indicate mixtures of different failure modes, which must be modeled separately)
2. Review remaining distributions to identify the higher level distributions (e.g., 3-parameter Weibull), and determine both whether there is a physical reason for a location parameter (as I discussed previously) AND whether the additional parameter is significantly better than the base distribution. I usually eliminate the higher level distributions based on this.
3. Review the remaining distributions with highest equivalent fits, looking at the means and standard errors. You will typically be able to eliminate 1 or 2 more because these values are obviously very poor (usually extremely high).
4. I usually end up with 2, sometimes 3, distributions remaining that are essentially equivalent. I then look at what type of failure mode each is typically used to model and select based on that.
BTW, I split your data set and performed a test of whether the shape and scale parameters were statistically different, and the p values were > 0.4. The confidence intervals on a sample of 5 were quite wide.
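
For readers without Minitab, one way to ask a similar question is a likelihood ratio test: fit a 2-parameter Weibull by maximum likelihood to each group and to the pooled data, then compare the log-likelihoods. This is a swapped-in technique, not necessarily the test run above, and its chi-square approximation is rough with only 5 points in one group, so treat the p-value as indicative only.

```python
import math

first5 = [154173, 171158, 83431, 201778, 117578]
rest21 = [192083, 136262, 149487, 148009, 98317, 69798, 94195, 62548,
          103574, 108364, 132377, 143047, 85272, 95760, 214166, 289237,
          161265, 172490, 99972, 117440, 89717]

def weibull_mle(times):
    """Return (shape, scale, max log-likelihood) for complete, uncensored data."""
    n = len(times)
    tmax = max(times)
    x = [t / tmax for t in times]           # rescale to avoid overflow in t**k
    mean_ln = sum(math.log(v) for v in x) / n

    def score(k):                           # increasing in k; its root is the MLE shape
        w = [v ** k for v in x]
        return sum(wi * math.log(v) for wi, v in zip(w, x)) / sum(w) - 1 / k - mean_ln

    lo, hi = 0.01, 100.0                    # bracketing interval (assumed wide enough)
    for _ in range(200):
        mid = (lo + hi) / 2
        if score(mid) < 0:
            lo = mid
        else:
            hi = mid
    k = (lo + hi) / 2
    scale = tmax * (sum(v ** k for v in x) / n) ** (1 / k)
    loglik = sum(math.log(k) - k * math.log(scale) + (k - 1) * math.log(t)
                 - (t / scale) ** k for t in times)
    return k, scale, loglik

k5, s5, ll5 = weibull_mle(first5)
k21, s21, ll21 = weibull_mle(rest21)
k_all, s_all, ll_all = weibull_mle(first5 + rest21)

# Separate fits use 4 parameters, the pooled fit uses 2, so 2 degrees of freedom.
lr = 2 * (ll5 + ll21 - ll_all)
p_value = math.exp(-lr / 2)                 # chi-square survival function for 2 df

print(f"first 5 : shape={k5:.2f}, scale={s5:.0f}")
print(f"rest 21 : shape={k21:.2f}, scale={s21:.0f}")
print(f"pooled  : shape={k_all:.2f}, scale={s_all:.0f}")
print(f"LR statistic={lr:.3f}, approximate p-value={p_value:.3f}")
```

A large p-value here would mean the data give no evidence that the two groups need different shape and scale parameters, which is a weaker statement than saying 5 points are enough to estimate them.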

#### JayWarner

The 'answers' and suggestions for this question are very, very good, especially: include the engineering aspects in the analysis decisions. I would only add, "Plot the *&^$ data!" If the data clearly don't fit the fitted curve well, something is wrong. Weibull analysis can be prone to this problem. As for comparing some of the data against the rest, I'd suggest that you take a structured sample: every 2nd or 3rd observation in one group vs. the others. Taking the first 5 against the others leaves you wondering whether the engineers and techs who did the work worked out a few kinks in those first measurements, so that they don't really reflect the defined population of the later data.
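
A structured split of this kind is easy to sketch. The example below takes every 3rd observation as one group and the rest as the other; it assumes the data were posted in the order they were collected, which the thread does not confirm.

```python
import statistics

# All 26 values in posted order (assumed to be test order).
cycles = [154173, 171158, 83431, 201778, 117578,
          192083, 136262, 149487, 148009, 98317, 69798, 94195, 62548,
          103574, 108364, 132377, 143047, 85272, 95760, 214166, 289237,
          161265, 172490, 99972, 117440, 89717]

# Every 3rd observation goes to group A; everything else goes to group B.
group_a = cycles[::3]
group_b = [c for i, c in enumerate(cycles) if i % 3 != 0]

for name, grp in (("A (every 3rd)", group_a), ("B (the rest)", group_b)):
    print(f"group {name}: n={len(grp)}, mean={statistics.mean(grp):.0f}, "
          f"stdev={statistics.stdev(grp):.0f}")
```

Each group now spans the whole test sequence, so any early-run "kinks" are spread across both rather than concentrated in the small group.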

#### debun

I'm not struggling with which distribution to use. My question is still how to determine the sample size, since if you look at the first 5 alone the fit is good. And how do you compare the first 5 to the remaining 21, given that you have to determine the median ranking?

#### debun

What is the minimum sample size to generate a Weibull distribution?

One of the advantages of the Weibull is that you can form a distribution with a much smaller sample size than, say, a histogram.
I experimented with using only the first 5 data points from a sample set of 26 to do a Weibull analysis. The first 5 points fit a 2-parameter Weibull and gave an R^2 value of 0.99. The second data set (the remaining 21 points) is better fit by a 3-parameter Weibull, with an R^2 value somewhere in the neighborhood of 0.976. So my questions are:

1) What is the minimum sample size you need to accurately represent the population? For example from this sample size X the PDF is within some metric (std deviations, % etc) of the true PDF. Clearly R^2 value alone isn't a good metric.

2) I would like to plot the remaining 21 data points over the CDF calculated from my first 5 points. I have used 2 methods. Simply calculating the MR and plotting test cycle vs MR. The second using the Beta and Eta to calculate the CDF percent and plotting test cycle vs CDF percentile. Is there a good way to represent predicted CDF vs test data?
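
The two approaches in question 2 can be put side by side in a short sketch. It fits a 2-parameter Weibull to the first 5 points by median rank regression, then lists each of the remaining 21 points with its own median rank (computed with n = 21) next to the CDF value predicted by the 5-point fit. Benard's approximation, the bisection-based exact median rank, and regressing y on x are choices made here for illustration; other tools may use different conventions.

```python
import math

first5 = [154173, 171158, 83431, 201778, 117578]
rest21 = [192083, 136262, 149487, 148009, 98317, 69798, 94195, 62548,
          103574, 108364, 132377, 143047, 85272, 95760, 214166, 289237,
          161265, 172490, 99972, 117440, 89717]

def median_rank_benard(i, n):
    """Benard's approximation to the median rank of the i-th of n failures."""
    return (i - 0.3) / (n + 0.4)

def median_rank_exact(i, n, tol=1e-12):
    """Median of Beta(i, n-i+1), found by bisection on the binomial tail."""
    def tail(p):  # P(Binomial(n, p) >= i) equals the Beta(i, n-i+1) CDF at p
        return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k)
                   for k in range(i, n + 1))
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if tail(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def fit_weibull_mrr(times, rank=median_rank_benard):
    """Return (beta, eta) by least squares on ln(-ln(1-MR)) versus ln(t)."""
    t = sorted(times)
    n = len(t)
    xs = [math.log(v) for v in t]
    ys = [math.log(-math.log(1 - rank(i, n))) for i in range(1, n + 1)]
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    beta = sxy / sxx
    eta = math.exp(mx - my / beta)
    return beta, eta

def weibull_cdf(t, beta, eta):
    return 1 - math.exp(-((t / eta) ** beta))

beta5, eta5 = fit_weibull_mrr(first5)
print(f"fit from first 5 points: beta={beta5:.2f}, eta={eta5:.0f}")

n = len(rest21)
for i, t in enumerate(sorted(rest21), start=1):
    print(f"{t:>7d}  own median rank={median_rank_benard(i, n):.3f}  "
          f"5-point CDF={weibull_cdf(t, beta5, eta5):.3f}")
```

Plotting cycles against the median-rank column shows where the 21 points actually fall, while the last column is what the 5-point model predicts for the same cycle counts, so the gap between the two columns is the prediction error.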
