# Finding optimum transformation equation for two sets of data


#### ilacatd

I want to do a 2-sample t-test to compare the burst pressure performance of two different products that are used in the same application. The data is not very normal: the Anderson-Darling p-value is 0.016 for one group and 0.096 for the other group. There are 30 data points in each group. The data is skewed to the right in both groups (most of the data is on the left side of the histogram, with a pretty long right tail). I would like to transform the data using Box-Cox transformation and Johnson Transformation in Minitab and then see which one has the highest resulting normality p-value and the highest correlation coefficient when plotted. I believe this will result in the most accurate t-test results. Is that right? To do the transformation correctly, it's necessary to use the same transformation equation for both groups in order for the t-test to be valid. If I use different equations I will end up with either diverging or converging data sets and means, and the result will be meaningless. Is there a way in Minitab (or any other tool) to find the optimum equation for two different data sets? You can't just stack the data in one column, because they are not from the same population; they come from two different products.

I ran a Box-Cox on Group 1 and got an optimum lambda of 0.171, and on Group 2 I got an optimum lambda of 0.231. I need to use only one lambda. So do I use the average of 0.171 and 0.231 = 0.201? The resulting A-D p-values are 0.470 and 0.262, which is a lot better than the raw data, but I'm not sure whether it's optimal.
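One possible way to get a single lambda without stacking the data is to maximize the sum of the two groups' Box-Cox log-likelihoods, so that lambda is shared while each group keeps its own mean and variance. The sketch below is an assumption about how this could be done with SciPy, not a description of what Minitab does. The data are synthetic stand-ins, since the real measurements were not posted.

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize_scalar

# Synthetic right-skewed stand-ins for the two groups (NOT the original data).
rng = np.random.default_rng(0)
group1 = rng.lognormal(mean=5.5, sigma=0.5, size=30)
group2 = rng.lognormal(mean=5.6, sigma=0.5, size=30)


def neg_joint_llf(lmbda):
    # Negative sum of the per-group Box-Cox log-likelihoods:
    # one shared lambda, separate mean and variance for each group.
    return -(stats.boxcox_llf(lmbda, group1) + stats.boxcox_llf(lmbda, group2))


# Search a bounded range of lambda values for the joint optimum.
result = minimize_scalar(neg_joint_llf, bounds=(-2.0, 2.0), method="bounded")
common_lambda = result.x

# Per-group optima, for comparison with the shared value and their average.
_, lambda1 = stats.boxcox(group1)
_, lambda2 = stats.boxcox(group2)

# Transform both groups with the same lambda, then compare means (Welch t-test).
t1 = stats.boxcox(group1, lmbda=common_lambda)
t2 = stats.boxcox(group2, lmbda=common_lambda)
t_stat, p_value = stats.ttest_ind(t1, t2, equal_var=False)

print(f"lambda group 1: {lambda1:.3f}, group 2: {lambda2:.3f}")
print(f"average: {(lambda1 + lambda2) / 2:.3f}, joint optimum: {common_lambda:.3f}")
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```

With two groups of equal size and similar spread, the joint optimum will often land close to the simple average of the two lambdas, which is one way to check whether averaging is good enough.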

With Johnson Transformation, it's much more complicated to find one optimum equation because there are so many factors in the equation. For Group 1, I got 2.79632 + 1.50848 * Log( ( X - 17.2401 ) / ( 2265.63 - X ) ), and for Group 2 I got -0.0969158 + 1.32128 * Asinh( ( X - 190.918 ) / 121.167 ). The A-D p-values are 0.534 and 0.431 respectively, which is better than the Box-Cox transformation. But I can't figure out how to find the optimum Johnson Transformation for both groups, because the equations are so different.
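To see why the two fitted equations cannot be mixed, it may help to apply both to the same raw values. The sketch below codes the two equations exactly as quoted, assuming `Log` means the natural logarithm. The pressure values are hypothetical and chosen only to fall inside the bounded range of the Group 1 equation.

```python
import numpy as np


def johnson_group1(x):
    # Bounded form quoted for Group 1 (natural log assumed);
    # only defined for 17.2401 < x < 2265.63.
    x = np.asarray(x, dtype=float)
    return 2.79632 + 1.50848 * np.log((x - 17.2401) / (2265.63 - x))


def johnson_group2(x):
    # Unbounded form quoted for Group 2; defined for any real x.
    x = np.asarray(x, dtype=float)
    return -0.0969158 + 1.32128 * np.arcsinh((x - 190.918) / 121.167)


# Hypothetical raw burst pressures, used only for illustration.
pressures = np.array([150.0, 250.0, 400.0, 800.0])
z1 = johnson_group1(pressures)
z2 = johnson_group2(pressures)

for p, a, b in zip(pressures, z1, z2):
    print(f"x = {p:6.1f}  group-1 equation: {a:7.3f}  group-2 equation: {b:7.3f}")
```

The same raw pressure maps to a different transformed value under each equation, so a difference in transformed means would partly reflect the choice of equation rather than the products.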

Does anyone have any suggestions?

I'm also aware that transformations can be tricky, and that it's best to use a single type of transformation which fits the type of data in question and then stick with it, rather than finding the "optimum" equation for every data set. For example, break strength of round cables varies with the square of the diameter of the cable, so SQRT(X) is probably the best transformation equation. Bacteria multiplying in a Petri dish would have exponential growth, so would that be log(X)? Survival time to failure might be Weibull, etc. My test is burst pressure of a vessel, so I'm not sure which transformation is best. I generally get good results with natural log, but it's not that clear. Does there need to be a further justification for doing a transformation besides simply to improve normality for purposes of doing a t-test? Any comments on transformations in general and in choosing the most appropriate transformation for an ongoing series of tests would be appreciated.

Thanks.

Ari Goldberg

#### Miner

##### Forum Moderator
The Johnson transform is the method of last resort. If Box-Cox works, use it instead. At n=30, a t-test is somewhat tolerant to non-normality, so using the average lambda should work fine.

You can also try using a nonparametric approach such as the Mann-Whitney test.
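As a rough illustration of the nonparametric route, the sketch below runs a Mann-Whitney test with SciPy on synthetic right-skewed data (not the original measurements). Because the test is based on ranks, any strictly increasing transformation such as the log leaves the result unchanged, so the choice of transformation stops mattering.

```python
import numpy as np
from scipy import stats

# Synthetic right-skewed stand-ins for the two groups (NOT the original data).
rng = np.random.default_rng(1)
group1 = rng.lognormal(mean=5.5, sigma=0.5, size=30)
group2 = rng.lognormal(mean=5.7, sigma=0.5, size=30)

# Mann-Whitney test on the raw data.
u_raw, p_raw = stats.mannwhitneyu(group1, group2, alternative="two-sided")

# Same test after a log transformation: ranks are preserved,
# so the statistic and p-value should match the raw-data result.
u_log, p_log = stats.mannwhitneyu(
    np.log(group1), np.log(group2), alternative="two-sided"
)

print(f"raw data: U = {u_raw:.1f}, p = {p_raw:.4f}")
print(f"log data: U = {u_log:.1f}, p = {p_log:.4f}")
```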


#### TWA - not the airline

Trusted Information Resource
What is the reason for performing the t-test? What kind of decision will be made when you find out that there is a significant difference in the mean burst pressure, or what happens if you do not find this difference? This should influence your decisions, e.g. using a nonparametric test would save you from suspicions that you may have fudged the results by tweaking the transformation parameters. And depending on your null hypothesis, your transformations will have an impact on the alpha- and beta-error...

#### Bev D

##### Heretical Statistician
Super Moderator
I am one of those heretics who opposes transformations. Usually you can make valid conclusions without transforming your data. I have played with transforming vs. not transforming and I've never found a situation where transforming was necessary. Too often people reflexively reach for transformation because their data doesn't appear to be Normally distributed. As Miner noted, most statistical tests are fairly robust to deviations from Normality. In addition, for data that are simply NOT Normally distributed, other techniques are simpler, easier and more compelling.

My other learning over the years is that the real issue isn't the non-Normality of the data but the inherent weakness of the study design given the nature of the process and the real question to be answered.

If you are going to resort to transformation, Miner's advice is correct (as always).

Can you post your data? Perhaps there is more to learn from the data than a simple t-test.

#### TWA - not the airline

Trusted Information Resource
Bev, you're right. One should have an understanding of what causes one's data to be non-normal before just applying a transformation and believing in the results that follow. If you do not know what actually happened you should use the data as is...

#### Miner

##### Forum Moderator
> The data is skewed to the right in both groups (most of the data is on the left side of the histogram, with a pretty long right tail).

I agree with Bev. I prefer to deal with data in its naturally occurring distribution. I see too many people trying to transform non-Normal data without understanding why it isn't Normal. The reason is usually that the process was not stable, or that they had mixed different process streams. In both cases, you should not transform the data, but address the instability and differences between streams. Transforming simply hides this information.

I suspect that you are dealing with a Largest Extreme Value distribution. While Minitab cannot model this particular distribution, it can model a Weibull distribution which can very closely match an LEV.
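Outside Minitab, one way to check this suspicion is to fit both candidate distributions by maximum likelihood and compare their log-likelihoods. The sketch below uses SciPy, where `gumbel_r` is the largest extreme value distribution and `weibull_min` with the location fixed at zero is the two-parameter Weibull. The sample is synthetic and drawn from a right-skewed Gumbel with made-up parameters, so the comparison shows the mechanics only.

```python
import numpy as np
from scipy import stats

# Synthetic burst pressures from a largest extreme value (Gumbel) distribution.
# The location and scale are hypothetical, chosen only for illustration.
rng = np.random.default_rng(2)
data = rng.gumbel(loc=200.0, scale=40.0, size=30)

# Maximum likelihood fit of the largest extreme value distribution.
lev_loc, lev_scale = stats.gumbel_r.fit(data)

# Maximum likelihood fit of a two-parameter Weibull (location fixed at 0).
wb_shape, wb_loc, wb_scale = stats.weibull_min.fit(data, floc=0)

# Both models have two free parameters, so the log-likelihoods are comparable.
lev_ll = np.sum(stats.gumbel_r.logpdf(data, lev_loc, lev_scale))
wb_ll = np.sum(stats.weibull_min.logpdf(data, wb_shape, wb_loc, wb_scale))

print(f"LEV:     loc = {lev_loc:.1f}, scale = {lev_scale:.1f}, logL = {lev_ll:.2f}")
print(f"Weibull: shape = {wb_shape:.2f}, scale = {wb_scale:.1f}, logL = {wb_ll:.2f}")
```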

#### reynald

##### Quite Involved in Discussions
> I want to do a 2-sample t-test to compare the burst pressure performance of two different products that are used in the same application. ...
>
> Ari Goldberg

I can't really put it into words, but somehow my engineering instincts tell me that I should put more weight on the left side of the distribution than on the right side/tails.
This is if I get the picture right: a burst pressure test is more of a safety test, right? So I should not be concerned with outliers on the right side (i.e. bursts at very high pressure). I will be more worried about those instances where bursts occur at low pressures.
What I'm saying is that even if I can prove that the mean burst pressures are comparable for both products, that does not tell me which one is safer. One could have 25% of instances that burst at, let's say, 50 psi while the other has only 5%. I would rather go for the second one even if the other exhibited a higher mean pressure.
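A minimal sketch of that kind of comparison: alongside the mean, report a low percentile and the fraction of units that burst below some minimum acceptable pressure. The data are synthetic and the 150 psi threshold is hypothetical. Both are placeholders for real measurements and a real requirement.

```python
import numpy as np

# Synthetic stand-ins (NOT the original data): product A has the wider spread.
rng = np.random.default_rng(3)
product_a = rng.lognormal(mean=5.6, sigma=0.6, size=30)
product_b = rng.lognormal(mean=5.5, sigma=0.25, size=30)

# Hypothetical minimum acceptable burst pressure, in psi.
threshold = 150.0


def lower_tail_summary(x, limit):
    # Mean, 5th percentile, and fraction of units bursting below the limit.
    return {
        "mean": float(np.mean(x)),
        "p05": float(np.percentile(x, 5)),
        "frac_below": float(np.mean(x < limit)),
    }


summary_a = lower_tail_summary(product_a, threshold)
summary_b = lower_tail_summary(product_b, threshold)

for name, s in (("A", summary_a), ("B", summary_b)):
    print(
        f"product {name}: mean = {s['mean']:.1f}, "
        f"5th percentile = {s['p05']:.1f}, "
        f"fraction below {threshold:.0f} psi = {s['frac_below']:.2f}"
    )
```

With only 30 points per group, the low percentile and the fraction below the limit are rough estimates, so they are better read as a prompt for more testing than as a final answer.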

#### TWA - not the airline

Trusted Information Resource
reynald, I totally agree, the mean does not say much about the safety. That's why I wanted to know what the OP actually wants to accomplish with the t-test comparison of the means...

#### Miner

##### Forum Moderator
Another reason to use the Mann-Whitney test. It tests for differences in medians. Medians are relatively insensitive to extreme values while means are highly sensitive to extreme values.
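The difference in sensitivity is easy to see with a small made-up example: adding one extreme value to five hypothetical readings moves the mean a long way but barely shifts the median.

```python
import numpy as np

# Five hypothetical burst pressures, then the same set with one extreme value added.
values = np.array([100.0, 110.0, 120.0, 130.0, 140.0])
with_extreme = np.append(values, 1000.0)

mean_before, median_before = np.mean(values), np.median(values)
mean_after, median_after = np.mean(with_extreme), np.median(with_extreme)

print(f"without extreme value: mean = {mean_before:.1f}, median = {median_before:.1f}")
print(f"with extreme value:    mean = {mean_after:.1f}, median = {median_after:.1f}")
# without extreme value: mean = 120.0, median = 120.0
# with extreme value:    mean = 266.7, median = 125.0
```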
