How to identify whether my data is non-normal?

bkarthikeyan

When using SPC, how can I tell whether my data is normal or non-normal without using Minitab or other software? Can we use a histogram for this? If so, is there a minimum number of samples to take?

Staff member
Well, hello there! The easiest thing I would do is put the data into Excel and graph it. At least you can visually make a reasonable assertion as to the normality of the data.

Always take as much data as possible. The more the better.

If you don't mind me asking, why are you asking this? What situation are you attempting to remedy?

What kind of process are you trying to establish?

I'm just curious about your question on how much data is needed to determine normality.

bkarthikeyan

I was explaining SPC concepts to a new hire in my organization, and this question came up during the discussion. I know that we can check whether data is normal using Minitab. If we don't have Minitab, how do we find out? Our manufacturing process includes stamping, painting, and assembly, so I want to know how much data is needed, because each process has a different output rate.

Stijloor

Staff member
Super Moderator

Hello bkarthikeyan,

The simplest way to check for normality is to build a histogram from a random sample of 100 or more pieces and visually assess whether the data are normally distributed. I do not know how familiar you are with building a histogram, so here is some information on how to do this. Hope this helps.

http://www.buhs.k12.vt.us/science/physicalscience/histograms/histogram_tutorial_page_4.html
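
For anyone who would rather script the tally than draw it by hand, here is a minimal sketch of the histogram approach using only the Python standard library. The sample is simulated (mean 10.0, standard deviation 0.5, 100 pieces), and the function name and the choice of 10 bins are illustrative assumptions, not something from the posts above.

```python
import random

def text_histogram(data, bins=10, width=40):
    """Tally data into equal-width bins, print a bar chart, return the tallies."""
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("all values are identical; nothing to bin")
    step = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        # The maximum value would land just past the last bin, so clamp it.
        idx = min(int((x - lo) / step), bins - 1)
        counts[idx] += 1
    peak = max(counts)
    rows = []
    for i, count in enumerate(counts):
        start = lo + i * step
        end = start + step
        rows.append((start, end, count))
        bar = "#" * round(width * count / peak)
        print(f"{start:8.3f} to {end:8.3f} | {bar} {count}")
    return rows

# Simulated measurements standing in for a 100-piece random sample.
random.seed(1)
sample = [random.gauss(10.0, 0.5) for _ in range(100)]
rows = text_histogram(sample)
```

A roughly symmetric, single-peaked shape with thinning tails is what you would hope to see; a long tail on one side or two separate peaks is a reason to look closer.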

Steve Prevette

Deming Disciple
Staff member
Super Moderator
There are a number of statistical tests for normality. Some of the easiest check whether the skewness and kurtosis are close to the values expected for a normal distribution (skewness 0, kurtosis 3).
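
A minimal sketch of the skewness and kurtosis check, using the simple moment formulas (dividing by n). The function name and the simulated data are assumptions for illustration. For a normal distribution the skewness is 0 and the excess kurtosis (kurtosis minus 3) is 0; as a rough guide, for a normal sample of size n the standard errors are approximately sqrt(6/n) for skewness and sqrt(24/n) for kurtosis.

```python
import random

def skewness_and_excess_kurtosis(data):
    """Moment-based skewness and excess kurtosis (both 0 for a normal distribution)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skew, excess_kurtosis

# Simulated data: one normal sample and one strongly skewed (exponential) sample.
random.seed(2)
normal_like = [random.gauss(0.0, 1.0) for _ in range(2000)]
skewed = [random.expovariate(1.0) for _ in range(2000)]

normal_result = skewness_and_excess_kurtosis(normal_like)
skewed_result = skewness_and_excess_kurtosis(skewed)
print("normal-like:", normal_result)
print("skewed:     ", skewed_result)
```

Values several standard errors away from 0 suggest the data are not normal.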

The most common test (which can be implemented with a little Excel programming) is a chi-square test that compares how the data fall into histogram bins with what the normal distribution would predict for those bins.
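
One way to sketch the chi-square test in Python rather than Excel is shown below. It is an illustration under stated assumptions, not the exact procedure Steve has in mind: the bins are chosen so that the fitted normal distribution puts equal probability in each, the mean and standard deviation are estimated from the data (which costs two degrees of freedom), and the function name and simulated samples are made up. The critical value still comes from a printed chi-square table; for 7 degrees of freedom at the 5% level it is 14.067.

```python
import random
from statistics import NormalDist, mean, stdev

def chi_square_normality(data, bins=10):
    """Chi-square goodness-of-fit statistic against a normal fitted to the data."""
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))
    observed = [0] * bins
    for x in data:
        # fitted.cdf(x) is between 0 and 1, so this selects one of `bins`
        # intervals that are equally likely under the fitted normal.
        idx = min(int(fitted.cdf(x) * bins), bins - 1)
        observed[idx] += 1
    expected = n / bins  # same expected count in every bin; keep it at 5 or more
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    dof = bins - 1 - 2  # two parameters (mean, standard deviation) were estimated
    return statistic, dof

random.seed(3)
normal_like = [random.gauss(50.0, 2.0) for _ in range(200)]
skewed = [random.expovariate(1.0) for _ in range(500)]

stat_normal, dof = chi_square_normality(normal_like)
stat_skewed, _ = chi_square_normality(skewed)
print(f"normal-like: chi-square = {stat_normal:.2f} on {dof} degrees of freedom")
print(f"skewed:      chi-square = {stat_skewed:.2f} on {dof} degrees of freedom")
```

If the statistic exceeds the table value, the data do not fit the normal distribution well at that significance level.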

Staff member

Superb posts from Steve and Stijloor (as usual). Forgive me if I keep coming back to the same issue, but why limit how much data you will obtain? Is it cost prohibitive?

If you only collect 15 data points, you will need to be concerned with normality. If you collect 1500 data points, normality will virtually be a given. This is why it's important to try to figure out how much data you're looking at, and whether normality is an issue. Also, depending on how many variables per sample you are trying to assess, the power of your inferences may become very low if you don't have enough data.

As mentioned by the others, even though your post specifically said you wanted to keep away from a software package for a decision, most decent software packages have statistical tools to estimate normality.

Here is a link on a thread where we had a good discussion on non-parametric statistics, if you're interested:

Non-Parametric statistics

Steve Prevette

Deming Disciple
Staff member
Super Moderator
"If you collect 1500 data points, normality will virtually be a given."
One comment - this statement is only true if you are dealing with the average of the 1500 values. If you are dealing with the "tails" of the distribution (such as trying to predict failure), normality is definitely not a "given". I think this is what fooled the developers of Six Sigma into thinking there was a 1.5 sigma shift.
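
A small simulation can illustrate the distinction between averages and individual values. This is only a sketch: the exponential distribution, the subgroup size of 30, and the function names are assumptions chosen for illustration. The 1500 individual values stay strongly skewed no matter how many are collected, while averages of subgroups are much closer to symmetric, and the upper tail of the individual values is far heavier than the roughly 0.00135 a normal distribution puts beyond 3 standard deviations.

```python
import random
from statistics import mean, stdev

def skewness(data):
    """Moment-based skewness (0 for a symmetric distribution)."""
    m = mean(data)
    n = len(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

def fraction_beyond_3_sigma(data):
    """Share of points above mean + 3 standard deviations."""
    m, s = mean(data), stdev(data)
    return sum(1 for x in data if x > m + 3 * s) / len(data)

random.seed(4)
# 1500 individual values from a skewed (exponential) process.
individuals = [random.expovariate(1.0) for _ in range(1500)]
# 1500 averages, each taken over a subgroup of 30 values from the same process.
subgroup_means = [
    sum(random.expovariate(1.0) for _ in range(30)) / 30 for _ in range(1500)
]

skew_individuals = skewness(individuals)
skew_means = skewness(subgroup_means)
tail_individuals = fraction_beyond_3_sigma(individuals)
print(f"skewness of individual values: {skew_individuals:.2f}")
print(f"skewness of subgroup averages: {skew_means:.2f}")
print(f"individuals beyond +3 sigma:   {tail_individuals:.4f} (normal: about 0.00135)")
```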

There is a good book out there by Nassim Taleb called The Black Swan. It's an interesting book, written in plain language. Tom Peters has been highly supportive of his works. The Black Swan does show how we are often fooled by assuming normality, and by our reactions to rare events.

Stijloor

Staff member
Super Moderator
Hello Steve,

Great suggestion, but the poster, bkarthikeyan, indicated that it needed to be determined without Minitab or other software. I assume that they may not have access to these resources. So it's back to basics....

You know? Now that I think of it, that's what the (SPC) Masters did...

Stijloor.

Staff member
Good point. I purposely did not state the Central Limit Theorem here, and it is the average of them that approaches normality, as you correctly pointed out.

However, if I am measuring the paint thickness and I have 1500 data points, that data will most assuredly represent a normal distribution. Over time (as you stated) it might become apparent that the distribution changes to more accurately represent the entire population.

Since the OP mentioned SPC, your statement is highly valid. 1500 data points in July may represent the upper end of the year's distribution.

P.S. If I tell you I still believe in the 1.5 shift, will you still respect me in the morning?

Staff member
Yes, they did get back to the basics. Nice point. So I'm suggesting if there is a small amount of data, just graph it (with whatever tools you have... does anybody still have graph paper?) and you can see if it's reasonably close enough to assume normality. If it's highly skewed, that is valuable to know, and you then do something different.
