# Non-Normal Data F test and T test

#### optomist1

##### A Sea of Statistics
Super Moderator
Good Day To All Covers,

I found a similar thread, although it did not totally answer my question.

I am testing two verification runs of an improved process as part of a Six Sigma project. The two verification runs are 30 pieces each. The first involves machine-crimped wire/terminal connections. The second 30 pieces are the same wire/terminal combination, but hand crimped.

I am testing the tensile strength using a digital force meter. Now, finally, the questions: I wish to test for a difference in means between the two samples.
First I assessed the distribution of each sample. The first sample is not Normal but rather Weibull; the second is Normal.

When I tested the two samples for equal variance, neither the F-test (statistic/p-value) nor Levene's test for non-Normal data rejected equal variances, so the samples pass the test for equal variance.

How does one test for mean differences between one Weibull distribution and a normal distribution?

Is normality a concern when running t-tests, or ANOVA for that matter?

Regards,
Marty

A

Hi Marty,

A simple solution here is to use Mood's median test. It functions like an ANOVA but tests for differences in the medians rather than the means. You will probably not lose much power given that one of your samples passed a normality test. If you're using Minitab, it's available there.

I'd probably run both Mood's and ANOVA and see what, if any, different conclusions you reach, with Mood's being the more conservative, non-parametric test.
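For anyone without Minitab, here is a minimal sketch of running both tests side by side, assuming Python with SciPy. The data are simulated for illustration (one Weibull-shaped sample, one Normal sample, 30 pieces each) and are not the measurements from this thread; `machine` and `hand` are hypothetical names.

```python
import numpy as np
from scipy import stats

# Simulated pull forces, illustrative only (NOT the thread's data).
rng = np.random.default_rng(0)
machine = 30.0 + 3.0 * rng.weibull(2.0, size=30)   # skewed, Weibull-shaped
hand = rng.normal(loc=33.0, scale=1.5, size=30)    # roughly Normal

# Mood's median test: do the samples share a common median?
mood_stat, p_mood, grand_median, table = stats.median_test(machine, hand)

# One-way ANOVA: do the samples share a common mean?
f_stat, p_anova = stats.f_oneway(machine, hand)

print(f"Mood's median test p = {p_mood:.4f}")
print(f"One-way ANOVA p      = {p_anova:.4f}")
```

The `table` returned by `median_test` holds the counts above and below the grand median for each sample, which is worth a look in its own right.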

Cheers,

#### optomist1

##### A Sea of Statistics
Super Moderator

Thanks for the insight... I've got Minitab.....I'll give it a whirl.

Regards,
Marty

#### NumberCruncher

Hi optomist1

Some thoughts in my usual, verbose style.

1) I am a big fan of visually checking data prior to any analysis.

You state quite clearly that you checked to assess the distribution of each sample. Was this done visually? Did you plot the two distributions on the same graph? I expect that you did, but it's nice to have that confirmed.

The reason for asking is that, if the two distributions do not overlap, there is limited use in doing a formal test. The distributions are clearly different and any useful statistical test will simply tell you what is obvious just by looking at the graphs. No overlap = different. This of course assumes that you have sufficient data.

I assume that you have checked the data visually, and that there is a problematic amount of overlap between the two sets of data, hence the need for a test.
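As a crude numeric companion to the plot (not a replacement for it), the amount of overlap can be sketched in Python as the fraction of all points that fall inside the range shared by both samples. The helper name `overlap_fraction` is made up for this illustration.

```python
import numpy as np

def overlap_fraction(a, b):
    """Fraction of all points lying inside the range shared by both samples."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    lo = max(a.min(), b.min())   # start of the shared range
    hi = min(a.max(), b.max())   # end of the shared range
    if lo > hi:                  # the two ranges are disjoint
        return 0.0
    both = np.concatenate([a, b])
    return float(np.mean((both >= lo) & (both <= hi)))

print(overlap_fraction([1, 2, 3], [10, 11, 12]))     # 0.0
print(overlap_fraction([1, 2, 3, 4], [3, 4, 5, 6]))  # 0.5
```

A value of 0.0 corresponds to the "no overlap = different" case; anything well above zero is where a formal test starts to earn its keep.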

2) How different in shape are the distributions?

You state that one is Normal and one is Weibull. Weibull distributions can take a wide range of shapes, from a one-sided 'ski jump' shape, through a highly skewed 'sand dune' shape (not a mathematical description, but I hope you know what I mean!), all the way to a classic bell-shaped 'Normal' looking curve.

If your Weibull distribution looks like a Normal curve, but it fits the Weibull distribution a little better, perhaps you don't need to worry about the slight difference in shapes. Many parametric tests are robust to slight departures from Normality. You only have 30 data points to define the curve, which is not a huge number. Perhaps if you repeated your comparison experiment, the "Weibull-ness" (!!!) would turn out to be just an artefact of the small sample size.
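To put rough numbers on those shapes, here is a small sketch, assuming Python with SciPy, that prints the theoretical skewness of a Weibull distribution for three shape parameters. The three values of `k` are picked purely for illustration; a skewness near zero means the curve is close to symmetric.

```python
from scipy import stats

def weibull_skewness(k):
    """Theoretical skewness of a Weibull distribution with shape parameter k."""
    return float(stats.weibull_min.stats(k, moments="s"))

# Illustrative shape parameters: k <= 1 gives the one-sided 'ski jump',
# moderate k a skewed 'sand dune', and k around 3.6 a nearly symmetric bell.
for label, k in [("ski jump", 0.8), ("sand dune", 1.5), ("near-bell", 3.6)]:
    print(f"{label:10s} k = {k:3.1f}  skewness = {weibull_skewness(k):6.3f}")
```

If the fitted shape parameter for the machine-crimped sample is up around 3 to 4, the Weibull and Normal fits will be hard to tell apart with 30 points.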

On the other hand, if you are comparing a bell with a ski jump, I understand only too clearly why you posted your above question.

3) Regarding the requirement for Normal data for t-tests and ANOVA.

Yes, they require Normal data. Both procedures were developed on the assumption of Normal data. The usual get-out-of-jail-free clause has already been stated above. "Many parametric tests are robust to slight departures from Normality." So if your data are only a bit non Normal, you should be ok.

If you need non-parametric tests, you have a choice. For an equivalent of the two-sample t-test, you have the Wilcoxon rank-sum (or Mann-Whitney) test. There are also non-parametric equivalents for ANOVA. Be aware that these non-parametric tests still assume that the underlying distributions have the same shape and differ only in location, so if you do have wildly different shapes, these tests may not be reliable.
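A minimal sketch of the parametric and non-parametric pair, assuming Python with SciPy. Note that the t-test here is Welch's variant (it does not assume equal variances), which is my choice for the illustration and not something stated above. The data are simulated, not taken from this thread, and `compare_locations` is a made-up helper name.

```python
import numpy as np
from scipy import stats

def compare_locations(a, b):
    """Return p-values from Welch's t-test and the Mann-Whitney U test."""
    t_res = stats.ttest_ind(a, b, equal_var=False)             # Welch's t-test
    u_res = stats.mannwhitneyu(a, b, alternative="two-sided")  # rank-based test
    return {"welch_t": float(t_res.pvalue), "mann_whitney": float(u_res.pvalue)}

# Simulated pull forces, illustrative only (NOT the thread's data).
rng = np.random.default_rng(1)
machine = 30.0 + 3.0 * rng.weibull(2.0, size=30)
hand = rng.normal(loc=33.0, scale=1.5, size=30)

print(compare_locations(machine, hand))
```

If the two p-values lead to different conclusions, that is usually a prompt to go back to the plots, not to pick the answer you prefer.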

4) When is data not Normal?
Sadly, that is a bit like the oxymoron of "exact uncertainties". There isn't a razor sharp cutoff between Normal and non Normal. All you get is increasingly unreliable statistical tests.

If you can, get more data. If your operator and your crimping machine are repeatable, you can combine the two sets of data to reduce the variance and get a better idea of the shape of the distributions.

5) Irrespective of any formal tests, remember the difference between statistical significance and practical significance.

Is the difference worth any investment in new equipment? This is a financial question, but very important. If you only have a slight difference in the reliability of your final product, is it worth the cost of the new equipment?

Hope this helps

NC

#### optomist1

##### A Sea of Statistics
Super Moderator
Hi NC,

In my opinion, in statistics and the assurance sciences, there is no such thing as too much discussion. Better to have and not need than need and not have; especially when given tools such as Minitab, Excel etc.

As a matter of procedure or training I have completed the following with Minitab:
1) Run charts for each set of data
2) Distribution ID plots for both sets of data (Weibull, Normal, Lognormal, Exponential, ...): one set is Normal; the other is best fit by Weibull, with Normal the second-best fit
3) Box plots of the same
4) Test for equal variances: F-test p > 0.05; Levene's test p = 0.432
5) Two-sample t-test: p = 0.038 (different means)
6) Mood's median test: p = 0.10
7) Capability analysis, one-sided (min spec 13): both processes or data sets show good Cpk and Ppk > 1.5
and so on...
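For readers following along outside Minitab, steps 4 and 5 can be sketched as below, assuming Python with SciPy. SciPy has no built-in two-sample F-test for variances, so `f_test_equal_var` is a hand-rolled helper written for this illustration. The data are simulated and are not my measurements.

```python
import numpy as np
from scipy import stats

def f_test_equal_var(a, b):
    """Two-sided F-test for equal variances (sensitive to non-Normality)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    f = a.var(ddof=1) / b.var(ddof=1)          # ratio of sample variances
    dfn, dfd = a.size - 1, b.size - 1
    p = 2.0 * min(stats.f.cdf(f, dfn, dfd), stats.f.sf(f, dfn, dfd))
    return float(f), float(min(p, 1.0))

# Simulated pull forces, illustrative only (NOT the thread's data).
rng = np.random.default_rng(3)
machine = 30.0 + 3.0 * rng.weibull(2.0, size=30)
hand = rng.normal(loc=33.0, scale=1.5, size=30)

f_stat, p_f = f_test_equal_var(machine, hand)
lev = stats.levene(machine, hand, center="median")   # robust to non-Normality
t_res = stats.ttest_ind(machine, hand, equal_var=True)

print(f"F-test p = {p_f:.3f}, Levene p = {lev.pvalue:.3f}")
print(f"Two-sample t-test p = {t_res.pvalue:.3f}")
```

With `center="median"`, Levene's test is the variant that copes best with skewed data, which is why it is the one to lean on when one sample is not Normal.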

As a side note, when graphically evaluating F-tests and one-way ANOVA (alpha set to 0.05): if one cannot connect the two confidence intervals with a vertical line (that is, the intervals do not overlap), this, like a p-value < 0.05, suggests that the data support the two means (or the two variances) being unequal. Is this graphical test used in industry?
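The graphical check can be sketched numerically as follows, assuming Python with SciPy and t-based 95% intervals for each mean. The helper names are made up for this illustration. One caveat worth stating: non-overlapping intervals generally do indicate a difference at the 0.05 level, but overlapping intervals do not by themselves show that there is no difference, so the overlap check is the more conservative of the two.

```python
import numpy as np
from scipy import stats

def mean_ci(x, confidence=0.95):
    """t-based confidence interval for the mean of one sample."""
    x = np.asarray(x, dtype=float)
    se = x.std(ddof=1) / np.sqrt(x.size)                        # standard error
    half = stats.t.ppf(0.5 + confidence / 2.0, df=x.size - 1) * se
    return float(x.mean() - half), float(x.mean() + half)

def intervals_overlap(ci_a, ci_b):
    """True if a single vertical line can pass through both intervals."""
    return bool(ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1])

low = [1, 2, 3, 4, 5]
print(intervals_overlap(mean_ci(low), mean_ci([11, 12, 13, 14, 15])))  # False
print(intervals_overlap(mean_ci(low), mean_ci([2, 3, 4, 5, 6])))       # True
```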

Regards,
Marty

#### optomist1

##### A Sea of Statistics
Super Moderator
Hi NC,

I missed one of the last parts of your response, which puts it all into perspective. Although we would like to have clear and clean lines demarcating Normal and non-Normal etc., it is a grey region. And with Minitab, like most software products, one never really uses all the capabilities (Box-Cox, Johnson transformations, etc.), at least until forced to use them.
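For reference, a Box-Cox transformation outside Minitab can be sketched like this, assuming Python with SciPy. The data are simulated, strictly positive, right-skewed values chosen only to illustrate the call; they are not my crimp data.

```python
import numpy as np
from scipy import stats

# Simulated right-skewed, strictly positive data (illustrative only).
rng = np.random.default_rng(2)
skewed = rng.lognormal(mean=3.0, sigma=0.5, size=200)

# With no lambda supplied, boxcox picks it by maximum likelihood.
transformed, lam = stats.boxcox(skewed)

print(f"estimated lambda       = {lam:.3f}")
print(f"skewness before/after  = {stats.skew(skewed):.3f} / {stats.skew(transformed):.3f}")
```

Box-Cox only works on positive data, and any spec limit (such as a minimum pull force) has to be transformed with the same lambda before a capability analysis.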

I discovered the non-Normal data help section of Minitab, which is very helpful indeed. There are no universal rules or methods (in Minitab) that can be applied to all facets of statistical analysis.

Thank you again for your insightful assistance....this site is vast and very useful...and most important, the quality of the posters and their help.

Regards,
Marty

#### Bev D

##### Heretical Statistician
Super Moderator
One other thing to ponder is that for analytic studies (which are intended to be predictive of future performance, such as you describe) replication is far more compelling than statistical tests of significance. In fact, replicated differences (across multiple changes in all other factors) make statistical tests irrelevant. Too often Six Sigma proponents substitute use of statistical software and p-values for thinking.

You may in fact find a statistically significant difference between your two processes, but a sample of the questions that will come to *my* mind is:
is the difference seen of any practical importance?
were the samples within each subgroup (manual and automatic) independent?
were other factors changed between the subgroups that could account for any difference seen rather than the factor under test?

The study design itself requires more statistical structural rigor than the mathematical rigor of the analysis. A wise man once said: if your experimental results need a statistician you need a better experiment...

#### optomist1

##### A Sea of Statistics
Super Moderator
Hi Bev,

Thanks for your response and guidance. Attached is my Excel file. The first page is the data as taken; the second page has the one outlier data point removed and replaced with the median value, 32.00. The process I am examining has very limited data history, settings, etc.

Thanks again...

Regards,
Marty

#### bobdoering

Trusted Information Resource
I am testing two verification runs of an improved process as part of a Six Sigma project. The two verification runs are 30 pieces each. The first involves machine-crimped wire/terminal connections. The second 30 pieces are the same wire/terminal combination, but hand crimped.

I am testing the Tensile Strength using a digital force meter...

My experience has been that, for critical wire crimping evaluation, tensile strength is a bad test method to begin with. We generally use sectioning to verify a "good" crimp (such as no noise, etc.). Part of the reason is that you can have a crimp that functions badly and still has good tensile strength (that really shows up upon cross sectioning). I would be surprised if you have not come across that problem. So, before making decisions based on your data, you might consider whether your measure has adequate resolution. That - with tensile - is not easy due to its destructive nature.

Most incorrect conclusions arise from statistics pristinely applied to badly sampled (or measured) data.

#### optomist1

##### A Sea of Statistics
Super Moderator
Hi Bob,

Thank you very much for your insight.....I have toyed with the prospect of sectioning and examining a sample of crimped wire/terminal connections.

My rationale (cost aside) for resisting thus far is that in subsequent operations we perform electrical continuity checks, and the combination of the two tests/checks should be sufficient. In addition, the harness end use is to support static laboratory testing: they see no vibration, temperature extremes, etc.

I fully support your assertion, as one can pass a tensile test and later on fail because of a poorly constructed crimp, insulation in the brush crimp area, etc. As I type, I will in fact strongly suggest that we perform some form of periodic sampling/sectioning, less frequently than the tensile test, which is performed at the beginning of each day/lot/batch and at the approximate midday point.

Regards,
Marty