# Normality of Data Question - Minitab Analysis

#### Hami812

##### Engineering Program Mg
I have a data set which I looked at in Minitab v16. I want to find out exactly what type of data I am looking at. When I run the normality test in Minitab, it shows a P value of less than .05. According to my training, if P is .05 or less, the data do not follow a normal (Gaussian) distribution. I realize there are many different types of data, such as Exponential, Poisson, Gaussian, etc. How do I characterize what type of data set I am dealing with? When I want to find the I-MR values and use the Box-Cox transformation, it works fine on the MR graph, but I want to explain to someone in Senior Management, whom I am trying to convince, what we are really doing with the data conversion and why the data is not normal in the first place.

If you look at any of the data sets attached, they all seem to be non-Gaussian, if I am reading this correctly. Any suggestions? Did I do something wrong? Should I take absolute values instead of leaving negative values in the worksheets?

#### Attachments

• 27.5 KB Views: 177

#### bobdoering

Trusted Information Resource
Re: Question on Normality

What you have is discrete failure modes that are not directly related. There is no reason to believe that they would be "normal", as: first, they are sorted high to low (by design, that is the Pareto technique), which would give you a skewed distribution, and second, there is no "correct" sorting of this data. Now, for either the total failures over time or the failures over time of any individual cause, you might find normality.

#### Jim Wynne

Re: Question on Normality

What are you trying to learn from the data? It appears to me that you have a lot of cumulative numbers in a lot of different categories, and it might be better to examine each category individually in order to learn something about distributions. Trying to fit aggregate data to a curve might yield unexpected results if the nature of the underlying data sets isn't understood.

#### Bev D

##### Heretical Statistician
Super Moderator
I agree with Bob and Jim: it looks like you are trying to do something with this data that shouldn't be done.

In other words, either your statistical analysis isn't appropriate for the question you are really trying to answer, or your data set isn't appropriate for the question you are really trying to answer.

It looks like you may be trying to see if Mexico has a different failure rate than the USA?

#### NumberCruncher

Hi Hami812

What exactly are you trying to ascertain from your data?

It looks like you are trying to compare failure rates and decide which country has the greatest number of failures.

If I have interpreted your data correctly, for each country:
(total pass + total failure) = total parts tested

If this is the case then you are fundamentally doing the wrong analysis in simply comparing numbers. You don't always have the same number of tested items in each category for each country.

For example, HDD test for both USA and Mexico, the total number of items is 606.
However, for Boot Software Upgrade, USA = 1200 items, Mexico = 1176.

Comparing absolute numbers becomes less and less meaningful the greater the difference in the number of items tested.

For the example of Boot Software Upgrade, if you had a 50% failure rate for both countries, USA would have 600 fails but Mexico would have 588 fails. So the USA has a greater number of fails. Except that it doesn't. The percentage failure rate is the same.

If this is what you are trying to do, I would suggest comparing % failure rate in each category rather than comparing absolute numbers, and forget about normality of the data.
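As a rough sketch of that comparison in Python: the counts below are the hypothetical 50% figures from the example above, not values from the attached worksheet, and the pooled two-proportion z statistic is just one common way to judge whether two rates differ (an addition for illustration, not something computed in this thread).

```python
import math

def failure_rate(fails, tested):
    """Fraction of tested items that failed."""
    return fails / tested

def two_proportion_z(fails_a, n_a, fails_b, n_b):
    """Pooled z statistic for the difference between two failure proportions."""
    p_a = fails_a / n_a
    p_b = fails_b / n_b
    pooled = (fails_a + fails_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Hypothetical Boot Software Upgrade counts: 50% failures at each site.
usa_fails, usa_tested = 600, 1200
mex_fails, mex_tested = 588, 1176

usa_rate = failure_rate(usa_fails, usa_tested)
mex_rate = failure_rate(mex_fails, mex_tested)
z = two_proportion_z(usa_fails, usa_tested, mex_fails, mex_tested)

print(f"Raw difference in fails: {usa_fails - mex_fails}")
print(f"USA rate: {usa_rate:.1%}, Mexico rate: {mex_rate:.1%}, z = {z:.2f}")
```

The raw counts differ by 12, but both rates are 50% and the z statistic is zero, which is the point: the difference in counts comes entirely from the difference in the number of items tested.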

NC

#### WKHANNA

The question I would ask, taking for granted a great deal of the OP’s specific issue, is if there is MSA data on the test equipment used at both locations. Not to mention calibration & training records.

#### Hami812

##### Engineering Program Mg
NumberCruncher,

An Attribute Analysis was already performed on the differences between Mexico and the US in a previous worksheet, used to compare Passes with Failures between the 2 sites using the same tools. In this Gage R&R (attribute study only), the between-appraiser results showed 84% agreement with the US for 2 out of 3 benches when 200 units of product were tested at 3 benches (2 times for each unit) for a total of 1200 tests. The 3rd bench showed 97% agreement with the US.

My question was how I can build the case to begin using Statistical Process Control by starting with Attribute Analysis, showing there is a gap, with agreement between and within appraisers below 90% (the customer expects 95% agreement between and within appraisers). So first I was trying to show them there is a mismatch; the next step was to show them that, in order to keep both locations in check from a quality standpoint, some SPC methods need to be put into place. There are none today other than the occasional Gage R&R for Attribute Analysis, which I put into place quarterly.

So my question is more about me trying to get a broader understanding of the nature of the data itself in order to convince the SMT that putting SPC in place is a good idea for both locations. I want them to adopt the use of Control Charting tools at both sites, and I believe I will get a lot of blockers at the Senior level of my company (status quo changes are a bit tricky here). So I guess I am coming at this in a bit of a strange way, but I need to understand the nature of the data a bit more.

My goal is to get process control into place here. The company I work for has TL9000 but no quality control methods for changing processes, and no monitoring techniques in place other than dollars, which is usually far too late to know when things have gotten out of control. I think what I need to do is show them their process is not in control, and that using control charts requires the data to be Gaussian in nature. However, the Control Charting in Minitab allows any of the data to be converted using Box-Cox.

But I need to understand when I should use Box-Cox to transform the data. I am not 100% comfortable with knowing if and when it needs to be converted from X type of data to a "Normal" distribution. I think this was the crux of my question.
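For anyone who wants to see what the Box-Cox conversion actually does to the numbers, here is a minimal Python sketch of the standard Box-Cox formula. The function name `box_cox` and the sample values are made up for illustration; Minitab estimates the lambda value from the data, whereas this sketch simply takes lambda as given. Note that the transform is only defined for strictly positive values, which is relevant to worksheets containing negatives.

```python
import math

def box_cox(values, lam):
    """Standard Box-Cox transform: ln(x) if lam == 0, else (x**lam - 1) / lam."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]

# Made-up right-skewed values: each one is double the previous.
skewed = [1.0, 2.0, 4.0, 8.0, 16.0]

# With lam = 0 (log transform) the doubling steps become evenly spaced.
logged = box_cox(skewed, 0)
steps = [b - a for a, b in zip(logged, logged[1:])]
print([round(s, 4) for s in steps])
```

The transform only re-expresses the same measurements on a different scale (here, a log scale that pulls in the long right tail); it does not change the process that generated them.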

#### bobdoering

Trusted Information Resource
At this point, all you have provided to us is attribute data. None of the data represents failure over time, as you would have in a capability study. That variation is what you want to see.

But, in my mind, SPC with the attributes you identified will only be a report card chart - it will represent that "whatever you did" works or not. It is one use...but one of the weakest uses of control charts.

I suggest using this Pareto chart to pick out your worst case. Then determine what causes allow the failure to happen. Are any of the causes variable data? Can they be controlled (dialed in)? If so, do a capability study of that variable, determine its distribution, and develop an appropriate charting methodology. Then...go on to the next one.

Using that approach, it will be much easier to convince management of the value of SPC (at least it would impress me a whole lot more).

#### Bev D

##### Heretical Statistician
Super Moderator
Also, if you are going to place each defect type on a control chart (and yes, you are ready to go do that), you will not have to 'transform' the data so that it is Normal. Just use the appropriate chart type for your data. In this case I'd start by trying a u chart. You may need or want to go to an I, MR chart, but start with the u. Control charts do not require Normality (Gaussian data) to work. That is a myth. Control charts do require rational subgrouping, and most chart types require a homogeneous process stream, although some charts do incorporate adjustments for systemic causes that are not removable.
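A minimal sketch of the usual u chart calculation in Python, for reference. The defect counts and units tested per period are invented for illustration (they are not from the attached worksheet), and the function name `u_chart_limits` is hypothetical. It uses the textbook limits: the overall defects-per-unit average plus or minus 3 times the square root of that average divided by the subgroup size, with the lower limit floored at zero.

```python
import math

def u_chart_limits(defects, units):
    """Return (u_bar, rows) with per-subgroup u, LCL and UCL for a u chart."""
    u_bar = sum(defects) / sum(units)  # overall defects per unit (center line)
    rows = []
    for d, n in zip(defects, units):
        sigma = math.sqrt(u_bar / n)  # limits widen when fewer units are tested
        rows.append({
            "u": d / n,
            "lcl": max(0.0, u_bar - 3 * sigma),
            "ucl": u_bar + 3 * sigma,
        })
    return u_bar, rows

# Invented data: defects found and units tested in five periods.
defects = [12, 9, 15, 11, 30]
units = [200, 180, 220, 200, 210]

u_bar, rows = u_chart_limits(defects, units)
flagged = [i for i, r in enumerate(rows) if not r["lcl"] <= r["u"] <= r["ucl"]]

print(f"Center line: {u_bar:.4f}")
for i, r in enumerate(rows):
    print(f"period {i}: u={r['u']:.4f}  LCL={r['lcl']:.4f}  UCL={r['ucl']:.4f}")
print("Out-of-limits periods:", flagged)
```

Because the limits are computed per subgroup from the number of units tested, unequal sample sizes between periods or sites are handled without any transformation of the data.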

Try plotting the run charts of some of your defects and we can take a look to help you ensure that you have the correct charts.
