# Non-Normal Data F test and T test

#### optomist1

##### A Sea of Statistics
Trusted Information Resource
Good Day To All Covers,

I found a similar thread, although it did not totally answer my question.

I am testing two verification runs of an improved process as part of a Six Sigma project. Each run is 30 pieces. The first run is machine-crimped wire/terminal connections; the second is the same wire/terminal combination, but hand crimped.

I am testing tensile strength using a digital force meter, and I wish to test for a difference in means between the two samples.
First I assess the distribution of each sample. The first sample is not Normal but rather Weibull; the second is Normal.

When testing the two samples for equal variance, the data pass both the F-test (by test statistic and p-value) and Levene's test, which is the one intended for non-Normal data, so I treat the variances as equal.

How does one test for mean differences between one Weibull distribution and a normal distribution?

Is normality a concern when running t-tests, or ANOVA for that matter?

Regards,
Marty

A

Hi Marty,

A simple solution here is to use Mood's median test. It functions like an ANOVA but tests for differences in the medians rather than the means. You will probably not lose much power, given that one of your samples passed a normality test. If you're using Minitab, it's available there.

I'd probably run both Mood's and ANOVA and see what, if any, different conclusions you reach, with Mood's being the more conservative, non-parametric test.

Cheers,
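
For readers without Minitab, the side-by-side comparison suggested above can be sketched in Python with SciPy. This is only a minimal sketch: the two samples are synthetic stand-ins (a Weibull sample and a Normal sample of 30 pieces each, with made-up parameters), not the data from this thread, and the variable names are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic stand-ins for the two 30-piece runs (NOT the poster's data).
machine = stats.weibull_min.rvs(c=8.0, scale=33.0, size=30, random_state=rng)
hand = rng.normal(loc=30.0, scale=2.0, size=30)

# Mood's median test: do the two samples share a common median?
mood_stat, mood_p, grand_median, table = stats.median_test(machine, hand)

# One-way ANOVA on the same data: do the two samples share a common mean?
anova_stat, anova_p = stats.f_oneway(machine, hand)

print(f"Mood's median test: p = {mood_p:.4f} (grand median = {grand_median:.2f})")
print(f"One-way ANOVA:      p = {anova_p:.4f}")
```

`table` holds the counts above and at-or-below the grand median for each sample, which is all that Mood's test uses. That is why it is insensitive to the shape of the tails, and also why it tends to have less power than ANOVA.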

#### optomist1

##### A Sea of Statistics
Trusted Information Resource

Thanks for the insight... I've got Minitab.....I'll give it a whirl.

Regards,
Marty

#### NumberCruncher

Hi optomist1

Some thoughts in my usual, verbose style.

1) I am a big fan of visually checking data prior to any analysis.

You state quite clearly that you checked to assess the distribution of each sample. Was this done visually? Did you plot the two distributions on the same graph? I expect that you did, but it's nice to have that confirmed.

The reason for asking is that, if the two distributions do not overlap, there is limited use in doing a formal test. The distributions are clearly different and any useful statistical test will simply tell you what is obvious just by looking at the graphs. No overlap = different. This of course assumes that you have sufficient data.

I assume that you have checked the data visually, and that there is a problematic amount of overlap between the two sets of data, hence the need for a test.

2) How different in shape are the distributions?

You state that one is Normal and one is Weibull. Weibull distributions can take a wide range of shapes, from a one-sided 'ski jump' shape, through a highly skewed 'sand dune' shape (not a mathematical description, but I hope you know what I mean!), all the way to a classic bell-shaped, 'Normal'-looking curve.

If your Weibull distribution looks like a Normal curve, but it fits the Weibull distribution a little better, perhaps you don't need to worry about the slight difference in shapes. Many parametric tests are robust to slight departures from Normality. You only have 30 data to define the curve, which is not a huge number. Perhaps if you repeated your comparison experiment, the "Weibull-ness" (!!!) would turn out to be just an artefact of the small sample size.

On the other hand, if you are comparing a bell with a ski jump, I understand only too clearly why you posted your above question.
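
To put rough numbers on the range of shapes described above, the theoretical skewness of a Weibull distribution can be computed for a few shape parameters. This is a small sketch using SciPy; the three shape values and their labels are chosen purely for illustration and are not fitted to any data in this thread.

```python
from scipy import stats

# Shape parameters picked only to illustrate the three descriptions above.
shapes = {"ski jump": 1.0, "sand dune": 1.5, "near-bell": 3.6}

skewness = {}
for label, k in shapes.items():
    # moments="s" asks SciPy for the theoretical skewness only.
    skewness[label] = float(stats.weibull_min.stats(k, moments="s"))
    print(f"shape k = {k:<4} ({label:9s}): skewness = {skewness[label]:+.3f}")
```

A skewness near zero means the curve is close to symmetric, which is the case where the choice between "Weibull" and "Normal" matters least for a comparison of means.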

3) Regarding the requirement for Normal data for t-tests and ANOVA.

Yes, they require Normal data. Both procedures were developed on the assumption of Normal data. The usual get-out-of-jail-free clause has already been stated above. "Many parametric tests are robust to slight departures from Normality." So if your data are only a bit non Normal, you should be ok.

If you need non-parametric tests, you have a choice. For an equivalent of the two-sample t-test, you have the Wilcoxon rank-sum (or Mann-Whitney) test. There are also non-parametric equivalents for ANOVA. Be aware that these tests, when used to compare locations, do assume that the underlying distributions have the same shape, so if you do have wildly different shapes, they may not be reliable.
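
As a rough illustration of the parametric and non-parametric options just mentioned, here is a sketch that runs both on the same pair of samples using SciPy. The data are synthetic (two Normal samples with made-up means and spread), so the p-values say nothing about the crimping processes discussed here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Synthetic stand-ins for two 30-piece samples (not the thread's data).
sample_a = rng.normal(loc=32.0, scale=2.0, size=30)
sample_b = rng.normal(loc=30.0, scale=2.0, size=30)

# Parametric comparison of means: pooled-variance two-sample t-test.
t_stat, t_p = stats.ttest_ind(sample_a, sample_b, equal_var=True)

# Non-parametric counterpart: Mann-Whitney U (Wilcoxon rank-sum) test.
u_stat, u_p = stats.mannwhitneyu(sample_a, sample_b, alternative="two-sided")

print(f"two-sample t-test: t = {t_stat:.3f}, p = {t_p:.4f}")
print(f"Mann-Whitney U:    U = {u_stat:.1f}, p = {u_p:.4f}")
```

If the two tests agree, the normality question is probably moot for practical purposes; if they disagree, that is a prompt to look harder at the plots rather than to pick the more convenient answer.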

4) When is data not Normal?
Sadly, that is a bit like the oxymoron of "exact uncertainties". There isn't a razor sharp cutoff between Normal and non Normal. All you get is increasingly unreliable statistical tests.

If you can, get more data. If your operator and your crimping machine are repeatable, you can combine the two sets of data to reduce the variance and get a better idea of the shape of the distributions.

5) Irrespective of any formal tests, remember the difference between statistical significance and practical significance.

Is the difference worth any investment in new equipment? This is a financial question, but very important. If you only have a slight difference in the reliability of your final product, is it worth the cost of the new equipment?

Hope this helps

NC

#### optomist1

##### A Sea of Statistics
Trusted Information Resource
Hi NC,

In my opinion, in statistics and the assurance sciences, there is no such thing as too much discussion. Better to have and not need than need and not have; especially when given tools such as Minitab, Excel etc.

As a matter of procedure or training, I have completed the following with Minitab:
1) Run charts for each set of data
2) Distribution ID plots for both sets of data (Weibull, Normal, Lognormal, Exponential...); one set is Normal, and the second set is Weibull, with Normal as the second-best fit
3) Box plots of the same
4) Test for equal variances: F-test p > 0.05, Levene's test p = 0.432
5) Two-sample t-test, p = 0.038 (different means)
6) Mood's median test, p = 0.10
7) One-sided capability analysis (min spec 13): both processes or data sets show good Cpk and Ppk > 1.5
and so on...
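
Steps 4, 5 and 7 of the list above can be approximated outside Minitab. The following is a sketch under stated assumptions: the samples are synthetic, `f_test_equal_var` and `ppk_lower` are hypothetical helper names written for this example, the F-test p-value uses the common "double the smaller tail" convention, and Ppk is taken as (mean - LSL) / (3 * s) with the overall sample standard deviation. Minitab's own output may differ in detail.

```python
import numpy as np
from scipy import stats

def f_test_equal_var(x, y):
    """Two-sided F-test for equal variances (assumes Normal data)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    f = np.var(x, ddof=1) / np.var(y, ddof=1)
    dfx, dfy = x.size - 1, y.size - 1
    # Take the smaller tail and double it for a two-sided p-value.
    tail = stats.f.sf(f, dfx, dfy) if f > 1 else stats.f.cdf(f, dfx, dfy)
    return f, min(1.0, 2.0 * tail)

def ppk_lower(x, lsl):
    """One-sided Ppk against a lower spec limit: (mean - LSL) / (3 * s)."""
    x = np.asarray(x, dtype=float)
    return (x.mean() - lsl) / (3.0 * x.std(ddof=1))

rng = np.random.default_rng(2)
machine = rng.normal(32.0, 2.0, 30)  # synthetic, not the thread's data
hand = rng.normal(30.5, 2.0, 30)     # synthetic, not the thread's data

f_stat, f_p = f_test_equal_var(machine, hand)
# center="median" is SciPy's default (the Brown-Forsythe variant of Levene).
lev_stat, lev_p = stats.levene(machine, hand, center="median")
t_stat, t_p = stats.ttest_ind(machine, hand, equal_var=True)

print(f"F-test p = {f_p:.3f}, Levene p = {lev_p:.3f}, t-test p = {t_p:.3f}")
# 13.0 is the minimum spec quoted in step 7.
print(f"Ppk machine = {ppk_lower(machine, 13.0):.2f}, hand = {ppk_lower(hand, 13.0):.2f}")
```

Note that this Ppk formula itself assumes Normal data, so for a clearly Weibull sample it is only a rough indication.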

As a side note, when graphically evaluating F-tests and one-way ANOVA (alpha set to 0.05): if one cannot connect the two confidence intervals with a vertical line, that is, if the intervals do not overlap, this (like a p-value < 0.05) suggests that the data support the two means (or the two variances) being unequal. Is this graphical test used in industry?

Regards,
Marty
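
The graphical check described above can be written down explicitly. This is a sketch with synthetic data; `mean_ci` and `intervals_overlap` are hypothetical helper names for this example. One caveat worth knowing: two separate 95% intervals that do not overlap do point to a difference, but the reverse does not hold, since intervals can overlap somewhat while a two-sample t-test still gives p < 0.05. The overlap check is therefore the more conservative of the two.

```python
import numpy as np
from scipy import stats

def mean_ci(x, confidence=0.95):
    """t-based confidence interval for the mean of a single sample."""
    x = np.asarray(x, dtype=float)
    half_width = stats.t.ppf(0.5 + confidence / 2.0, x.size - 1) * stats.sem(x)
    return x.mean() - half_width, x.mean() + half_width

def intervals_overlap(ci_a, ci_b):
    """True if two (low, high) intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

rng = np.random.default_rng(4)
machine = rng.normal(32.0, 2.0, 30)  # synthetic, not the thread's data
hand = rng.normal(30.8, 2.0, 30)     # synthetic, not the thread's data

ci_machine, ci_hand = mean_ci(machine), mean_ci(hand)
t_stat, t_p = stats.ttest_ind(machine, hand, equal_var=True)

print(f"machine 95% CI: ({ci_machine[0]:.2f}, {ci_machine[1]:.2f})")
print(f"hand    95% CI: ({ci_hand[0]:.2f}, {ci_hand[1]:.2f})")
print(f"intervals overlap: {intervals_overlap(ci_machine, ci_hand)}, t-test p = {t_p:.4f}")
```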

#### optomist1

##### A Sea of Statistics
Trusted Information Resource
Hi NC,

I missed one of the last parts of your response, which puts it all into perspective. Although we would like to have clear and clean lines demarcating Normal and non-Normal, it is a grey region. And with Minitab, like most software products, one never really uses all the capabilities (Box-Cox, Johnson transformations, etc.), at least until forced to use them.

I discovered the non-normal data help section of Minitab, which is very helpful indeed. There are no universal rules or methods (in Minitab) that can be applied to all facets of statistical analysis.
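
For anyone curious what a Box-Cox transformation does in practice, here is a minimal sketch using SciPy rather than Minitab. The data are synthetic, right-skewed and strictly positive (Box-Cox requires positive values); nothing here comes from the crimp data in this thread.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Synthetic right-skewed, strictly positive data (Box-Cox needs x > 0).
raw = rng.lognormal(mean=3.0, sigma=0.5, size=200)

# With no lambda supplied, boxcox returns the transformed data and the
# maximum-likelihood estimate of lambda.
transformed, lam = stats.boxcox(raw)

skew_before = stats.skew(raw)
skew_after = stats.skew(transformed)

print(f"estimated lambda = {lam:.3f}")
print(f"skewness before = {skew_before:+.3f}, after = {skew_after:+.3f}")
```

Any spec limit has to be transformed with the same lambda before capability indices are computed on the transformed scale.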

Thank you again for your insightful assistance....this site is vast and very useful...and most important, the quality of the posters and their help.

Regards,
Marty

#### Bev D

##### Heretical Statistician
Staff member
Super Moderator
One other thing to ponder is that for analytic studies (which are intended to be predictive of future performance, such as the one you describe), replication is far more compelling than statistical tests of significance. In fact, replicated differences (across multiple changes in all other factors) make statistical tests irrelevant. Too often Six Sigma proponents substitute use of statistical software and p-values for thinking.

You may in fact find a statistically significant difference between your two processes, but a sample of the questions that will come to *my* mind is:
Is the difference seen of any practical importance?
Were the samples within each subgroup (manual and automatic) independent?
Were other factors changed between the subgroups that could account for any difference seen, rather than the factor under test?

The study design itself requires more statistical structural rigor than the mathematical rigor of the analysis. A wise man once said: if your experimental results need a statistician you need a better experiment...

#### optomist1

##### A Sea of Statistics
Trusted Information Resource
Hi Bev,

Thanks for your response and guidance. Attached is my Excel file. The first page is the data as taken; on the second page, I removed the one outlier data point and replaced it with a median-value data point, 32.00. The process I am examining has very limited data history, settings, etc.

Thanks again...

Regards,
Marty

#### bobdoering

Trusted Information Resource
> I am testing two verification runs of an improved process, part of a Six Sigma project, the two verification runs are 30 pieces each; the First involves machine crimped wire/terminal connections. The Second 30 pieces are the same wire terminal combination, however they are hand crimped.
>
> I am testing the Tensile Strength using a digital force meter...

My experience has been that, for critical wire crimping evaluation, tensile strength is a bad test method to begin with. We generally use sectioning to verify a "good" crimp (such as no noise, etc.). Part of the reason is that you can have a crimp that functions badly and still has good tensile strength (the defect really shows up upon cross-sectioning). I would be surprised if you have not come across that problem. So, before making decisions based on your data, you might consider whether your measure has adequate resolution. That, with tensile, is not easy due to its destructive nature.

Most incorrect conclusions arise from statistics pristinely applied to badly sampled (or measured) data.

#### optomist1

##### A Sea of Statistics
Trusted Information Resource
Hi Bob,

Thank you very much for your insight.....I have toyed with the prospect of sectioning and examining a sample of crimped wire/terminal connections.

My rationale (cost aside) for resisting thus far is that in subsequent operations we perform electrical continuity checks, and the combination of the two tests/checks should be sufficient. In addition, the harness end use is to support static laboratory testing; they see no vibration, temperature extremes, etc.

I fully support your assertion, as one can pass a tensile test and later on fail because of a poorly constructed crimp, insulation in the brush crimp area, etc. As I type, I will in fact strongly suggest that we perform some form of periodic sampling/sectioning, less frequently than the tensile test, which is performed at the beginning of each day/lot/batch and at the approximate midday point.

Regards,
Marty
