Data normality versus capability

Dan Watson

Starting to get Involved
I feel rather stupid for asking this question, but it is a reality check for me. I work with a QA Manager who has stated that the normality of the data does not matter when calculating capability indices for different product parameters. His numbers are pretty much always above 1.33. I have been taught since 1987 that if the data are not normally distributed, the descriptive statistics for normally distributed data cannot be used. The root cause of why the data are non-normally distributed needs to be investigated, i.e., what are the causes of the special variation influencing the data. Am I wrong? Would process capability indices be an alternative measure until the data distribution is more normal? This individual will rework PPAP numbers that the technicians give him so that the numbers "look appropriate." Sometimes the old saying is true: "torture the numbers enough and they'll confess to anything."
 

John C. Abnet

Teacher, sensei, kennari
Leader
Super Moderator
i.e., what are the causes of the special variation influencing the data. Am I wrong?

You are indeed not wrong. Until special causes of variation are removed, the process is unstable, process improvements cannot properly be applied, and the capability results are therefore not valuable to you.

Hope this helps.
Be well.
 

Miner

Forum Moderator
Leader
Admin

It depends on why the data are not normal. There are a number of reasons and each requires a specific response:
  • Not-normal due to special causes. The process is not stable and therefore is not predictable. Capability indices only hold true for that specific set of data and will not apply to future data sets. The special causes must be identified and eliminated to stabilize the process and make the indices meaningful.
  • Not-normal due to mixtures. The process may be stable, but is a mixture of different process streams that result in a multi-modal or uniform distribution. The process streams must be separated and the capability of each determined.
  • Not-normal due to tool wear. The process is stable but trends because of tool wear, resulting in a quasi-uniform distribution. Use alternate methods for non-normal capability such as CNp/CNpk or PPM.
  • Not-normal due to the process itself. Certain processes are inherently non-normal and will always be so in the absence of special causes. Use alternate methods for non-normal capability such as CNp/CNpk or PPM.

Process capability/performance index formulas are for normal data. If your data are not normal, the data need to be transformed, and if the transformed data are normal, they can be used for the process capability/performance calculation.

Transforming data into normality is ONE way of dealing with the fourth scenario above (never for the first three scenarios), but is not the only approach. It is often overused in inappropriate situations (first three scenarios) by people that do not understand what they are doing.

I personally do not like to transform data because you lose a lot of information in the data. You also have to transform the specifications, and the result becomes very difficult to explain to non-statisticians. I prefer to perform a non-normal capability analysis. It is relatively easy for others to interpret and retains all of the information in the data.
 

Matt Savage

Trusted Information Resource
... His numbers are pretty much always above 1.33 ...
Does this mean his Ppk number is above 1.33?
Ignoring the question of normal or not, transform the data or not, the 'customer' is looking for consistency (statistical control) and for the process to be on the desired target.
The control charts show that all three batches are not stable, with Batch #2 being the least predictable. (See attached.) If the processes are not in statistical control, by what means is acceptability determined?
 

Attachments

  • Batch Data.png

Dan Watson

Starting to get Involved

Hi Matt. His "1.33" is the Cpk index. He does not use control charts; he uses a histogram and formats it so that it appears normal. The "acceptability" is that the data are within the specification limits, not that the process is in statistical control with sources of special variation reviewed. It is purely a numbers game for the customer. And, yes, we have had complaints about consistency from our large customers.
 

Matt Savage

Trusted Information Resource
Hi Dan. If a control chart is not used, or is used but shows out-of-control conditions, I would not put much emphasis on Cpk. (One of the prerequisites for Cpk to be valid is that the data be in statistical control.) I prefer to evaluate Ppk instead of Cpk in situations like this, since it reflects both the within-subgroup and the between-subgroup variation. Some will argue that any statistic is bogus when the chart shows out-of-control conditions.
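As a rough illustration of the Cpk/Ppk distinction (not taken from the attached data), the sketch below computes both indices for equal-sized subgroups. The function name, the data, and the specification limits are all hypothetical, and the within-subgroup sigma is assumed to be the pooled standard deviation without an unbiasing constant; R-bar/d2 is another common choice.

```python
import math
import statistics


def capability_indices(subgroups, lsl, usl):
    """Return (cpk, ppk) for a list of equal-sized subgroups."""
    all_values = [x for group in subgroups for x in group]
    mean = statistics.fmean(all_values)

    # Within-subgroup sigma: pooled standard deviation (assumed estimator).
    # With equal subgroup sizes this is the root of the mean subgroup variance.
    pooled_var = statistics.fmean(statistics.variance(g) for g in subgroups)
    sigma_within = math.sqrt(pooled_var)

    # Overall sigma: sample standard deviation of all values together,
    # which also picks up shifts between subgroups.
    sigma_overall = statistics.stdev(all_values)

    def index(sigma):
        return min(usl - mean, mean - lsl) / (3 * sigma)

    return index(sigma_within), index(sigma_overall)


# Hypothetical data: three subgroups whose means drift upward.
subgroups = [
    [9.8, 10.0, 10.2, 10.0],
    [10.4, 10.6, 10.5, 10.7],
    [11.0, 11.2, 10.9, 11.1],
]
cpk, ppk = capability_indices(subgroups, lsl=8.0, usl=13.0)
print(f"Cpk = {cpk:.2f}, Ppk = {ppk:.2f}")
```

With drifting subgroup means, the within-subgroup sigma is much smaller than the overall sigma, so Cpk comes out well above Ppk for these made-up numbers.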
 

Steve Prevette

Deming Disciple
Leader
Super Moderator
It is important to note that there are some processes / data sources that are NOT normal and there is nothing wrong with that. If I am measuring the strength of a steel beam, it can't have a strength below zero. So it is NOT normal. Now, sometimes normal is good enough. But the non-normality may come into play if zero is within a few standard deviations of the sample average.
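To put a number on "within a few standard deviations", here is a small sketch that asks how much probability a fitted normal model would place below zero, where a strength cannot exist. The means and standard deviations are made up for illustration.

```python
from statistics import NormalDist


def normal_mass_below_zero(mean, sd):
    """Probability a normal model with this mean and sd assigns to values < 0."""
    return NormalDist(mean, sd).cdf(0.0)


# Hypothetical cases: zero is 2 standard deviations below the mean,
# then 6 standard deviations below the mean.
close_to_zero = normal_mass_below_zero(mean=30.0, sd=15.0)
far_from_zero = normal_mass_below_zero(mean=30.0, sd=5.0)
print(f"zero at 2 sd: {close_to_zero:.4%} of the model is impossible")
print(f"zero at 6 sd: {far_from_zero:.2e} of the model is impossible")
```

When zero is far out in the tail the normal model wastes almost nothing on impossible values; when zero is only two standard deviations away, a few percent of the model sits where no data can occur.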

As a reminder - Statistical Process Control does not require normality. So I may have a non-normal situation (log normal as above, Poisson counting events) but it is stable and "in control".

However, we tend to over-rely on Ppk and Cpk, in my opinion. I'll agree with Matt Savage above: understand what is happening on a control chart.
 

Semoi

Involved In Discussions
Reading Miner's answer above, I hope it is obvious that the
QA Manager who has stated that the normality of the data does not matter when calculating capability indices
is wrong. However, giving people the benefit of the doubt, you might have misunderstood his/her argument and thus misquoted it here, so you should probably talk to the QA manager and clarify.
Although Miner's argument should be pretty self-explanatory, you could also look into ISO 22514-4:2016. Section 4.4.3 contains a method for non-normal data: the key idea is to use quantiles instead of the standard deviation. In statistics we call this a non-parametric method. It is commonly accepted that non-parametric methods are more robust (against outliers and other assumption violations) than parametric methods, but they have a slightly lower (Pitman) efficiency. Without going into statistical details, I believe it is obvious that normality matters if an ISO standard contains a section on calculating capability indices for non-normal data.
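A minimal sketch of the quantile idea, assuming the commonly used percentile form in which the 0.135 %, 50 % and 99.865 % quantiles replace mean − 3σ, the mean, and mean + 3σ. It is not copied from the standard's text. The function name, the lognormal parameters, and the specification limits are hypothetical, and the quantiles here come from an assumed lognormal model rather than from raw data.

```python
import math
from statistics import NormalDist


def quantile_capability(q_low, q_med, q_high, lsl, usl):
    """Percentile-based Pp and Ppk from the 0.135 %, 50 % and 99.865 % quantiles."""
    pp = (usl - lsl) / (q_high - q_low)
    ppk = min((usl - q_med) / (q_high - q_med),
              (q_med - lsl) / (q_med - q_low))
    return pp, ppk


# Hypothetical lognormal process: ln(X) ~ Normal(mu, sigma).
mu, sigma = 0.0, 0.25
standard_normal = NormalDist()


def lognormal_quantile(p):
    return math.exp(mu + sigma * standard_normal.inv_cdf(p))


q_low = lognormal_quantile(0.00135)
q_med = lognormal_quantile(0.5)
q_high = lognormal_quantile(0.99865)

pp, ppk = quantile_capability(q_low, q_med, q_high, lsl=0.3, usl=2.5)
print(f"Pp = {pp:.2f}, Ppk = {ppk:.2f}")
```

For a normal distribution the three quantiles sit at the mean and at roughly ±3σ, so the same function reproduces the classical indices; for a skewed distribution the upper and lower halves are scaled separately.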
 

Matt Savage

Trusted Information Resource
In today's Quality Digest, Wheeler's article "Analyzing Observational Data" may shed some additional light on this topic.

In the article, Wheeler states: "... You do not need to fit a probability model to your data. Neither should you place your data on a normal probability plot. And you certainly do not need to transform your data to make them “more normal.” All of these prequalification activities assume the process is already being operated predictably. Assignable causes completely undermine this assumption, making these activities nonsense.

So regardless of what your histogram may look like, put your data on a suitable process behavior chart and characterize your process behavior as predictable or unpredictable. Use the data to make predictions for your predictable processes, and use the chart to look for the assignable causes that are taking your unpredictable processes on walkabout."
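As one way to act on that advice, here is a sketch of an individuals (XmR) process behavior chart calculation, using the usual 2.66 × average-moving-range limits. The measurements are invented for illustration, and only the simplest detection rule (a point outside the limits) is applied.

```python
from statistics import fmean


def xmr_limits(values):
    """Return (lower limit, center line, upper limit) for an individuals chart."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = fmean(values)
    mr_bar = fmean(moving_ranges)
    # 2.66 is the standard scaling factor for individuals charts (3 / d2, d2 = 1.128).
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar


def out_of_limits(values):
    """Indices of points falling outside the natural process limits."""
    lower, _, upper = xmr_limits(values)
    return [i for i, x in enumerate(values) if x < lower or x > upper]


# Hypothetical measurements with one excursion at index 7.
data = [10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 10.0, 12.5, 10.0, 9.9, 10.1, 10.0]
print(xmr_limits(data))
print(out_of_limits(data))
```

No normality check or transformation is involved: the limits come straight from the point-to-point variation, and any flagged point is a prompt to look for an assignable cause.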
 