Correlation using t-test and correlation coefficient - What is the correct result?

jefnik3201028 · Nov 7, 2007

Hello guys,
It's been a while since I wrote in this forum, but I've been reading a lot of these threads.
Well, I need the experts' help here.
I don't usually encounter this problem, but the issue comes up time and again. I have 2 sets of data from the same sample measurement; the sample size is 20. Here are the details:
Mean 1 = 0.1146, Mean 2 = 0.1185
SD 1 = 0.011038, SD 2 = 0.0636

The t-test for dependent samples gave p > 0.05, which suggests the 2 sets are not significantly different, and I cannot accept that.
Is this result distorted by the difference in SD? I remember that an F-test should be performed to test equality of variances. Do you think that is the reason for this?

Therefore, in this case I ignored the result of the t-test and used the correlation coefficient r from a linear regression as the basis for the correlation.

I need the experts' input on whether this approach is correct.

Thanks,
Jefnik

Bev D · Nov 8, 2007

Re: Correlation using t-test and correlation coefficient, what is the correct result?

Please post the raw data; it will allow us to truly help you.
And if you can, elaborate more on what the data is and what it represents. What question are you trying to answer with this test? Why can't you accept the answer?

Geoff Withnell · Nov 8, 2007

Re: Correlation using t-test and correlation coefficient, what is the correct result?

Without seeing the data, just the summary statistics, I suspect the problem is that while the means and variances MAY be different, they are not different ENOUGH for a sample of 20 to reliably distinguish between them. I will bet that if you plot both samples as histograms and lay one over the other, you will see that they are very similar. When dealing with two populations such as we have here, with means separated by only a fraction of a standard deviation, a fairly large sample is usually required to show the separation, unless the SDs are very different. Not mathematically rigorous, but it often helps to look at the data visually to diagnose a problem.

Geoff Withnell
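To put rough numbers on "a fraction of a standard deviation", here is a minimal Python sketch that uses only the summary statistics posted above. It treats the two sets as independent (a Welch-style statistic), which is an assumption made purely for illustration and ignores any pairing; the variable names are hypothetical.

```python
import math

# Summary statistics as posted in the thread (n = 20 per set).
n = 20
mean1, sd1 = 0.1146, 0.011038
mean2, sd2 = 0.1185, 0.0636

# Difference in means, also expressed in units of each standard deviation.
diff = mean2 - mean1
diff_in_sd1 = diff / sd1
diff_in_sd2 = diff / sd2

# Welch (unequal-variance) t statistic from the summaries alone.
# Assumption: the sets are treated as independent, which ignores pairing.
se = math.sqrt(sd1**2 / n + sd2**2 / n)
t_welch = diff / se

# Ratio of the two standard deviations.
sd_ratio = sd2 / sd1

print(f"difference in means: {diff:.4f}")
print(f"as a fraction of SD 1: {diff_in_sd1:.2f}, of SD 2: {diff_in_sd2:.2f}")
print(f"Welch t statistic: {t_welch:.2f}")
print(f"SD ratio (SD 2 / SD 1): {sd_ratio:.1f}")
```

Under these assumptions the t statistic comes out far below the value of roughly 2 needed for significance at the 5% level, while the SDs differ by a factor of more than five. So the means look alike even though the spreads clearly do not.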

Dave Strouse · Nov 8, 2007

Right test?

jefnik3201028,

Please excuse me if you have already thought this through, but are you sure you are using the right test?

You state that you did the t-test for dependent samples. This is a paired test. It is a restriction on randomization. Perhaps this is appropriate for your data, maybe not.

What is your pairing mechanism?

In the independent test, the statistic is the difference between the two sample means divided by the standard error of that difference. In the dependent (paired) test, the statistic is the mean of the differences between paired observations divided by the standard error of those differences. Both are referred to a t-distribution, but the same data tested both ways may well give completely different results.
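For reference, the two statistics can be written out as below. This is a sketch in standard textbook notation, not something posted in the thread: the independent form shown is the unequal-variance (Welch) version, and the symbols ($\bar{x}$, $s$, $n$, $d_i$, $r$) are introduced here for illustration.

```latex
% Independent two-sample statistic (unequal-variance form)
t_{\text{ind}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}

% Paired statistic, with d_i = x_{1i} - x_{2i} and n pairs
t_{\text{paired}} = \frac{\bar{d}}{s_d/\sqrt{n}}, \qquad
s_d^2 = s_1^2 + s_2^2 - 2\,r\,s_1 s_2
```

With equal sample sizes the numerators are identical, since $\bar{d} = \bar{x}_1 - \bar{x}_2$. The denominators differ only through the correlation $r$ between the paired readings, so the two tests diverge most when the pairs are strongly correlated and nearly coincide when $r$ is close to zero.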

As suggested earlier, an explanation of what the data is and the data itself would allow folks here to be more helpful.

Steve Prevette · Nov 8, 2007

Re: Correlation using t-test and correlation coefficient - What is the correct result

The t-test is used for comparing two samples and checking whether their means are consistent with coming from the same population. No time sequence is maintained; each of the two samples is collapsed into its descriptive statistics.

However, linear regression is used for checking if there is a linear relationship between two variables. I assume when you did the linear regression, your x-axis was time.

The difference between the two tests is significant - the t-test samples may not even exist in two distinct time sequences, they may be intermingled (such as on days 1, 5, 7, 8, and 10 one method was used, and on the other days a different method was used). Also, even if the results are in sequence (the first 20 days are the first sample, the second 20 days the second sample), the t-test may show the two sets are different, but there may not be a linear relationship, giving a different result on the linear regression.

Personally, I'd use control charts if you have data in a time sequence.

Bev D · Nov 8, 2007

Re: Right test?

Dave Strouse said:
jefnik3201028,

You state that you did the t-test for dependent samples.

Actually, he stated that he had "2 sets of data, same sample measurement". He didn't explicitly say he had dependent data, although it is a viable interpretation. My first thought was that he was doing a measurement repeatability study (for which a paired t-test isn't appropriate, and a grouped t-test certainly isn't), but I'm only interpreting vague words.

Hence my and everyone else's request for more information and the raw data.

I guess we have to wait until he wakes up.

Dave Strouse · Nov 8, 2007

Re: Correlation using t-test and correlation coefficient - What is the correct result

Bev,

Agree that getting the data is key, but from the OP's note

"t-test for dependent sample result showed, p>0.05 which suggest the 2 sets are not significantly different which I can not accept."

can only be interpreted to mean that he ran the "dependent" or "paired" t-test analysis.

My question was exactly as you stated: "Was the data dependent (paired)?" The OP does not say how the test was run or how the data was collected, so the dependent analysis may not be right. Or it might. Without understanding what was done, we can only guess.

Dave

Bev D · Nov 8, 2007

Re: Correlation using t-test and correlation coefficient - What is the correct result

yep missed that.

jefnik3201028 · Nov 9, 2007

Re: Correlation using t-test and correlation coefficient - What is the correct result

Hello Guys,
Thanks for all your comments. Here is the data set :

Set A    Set B
0.107    0.13
0.111    0.12
0.118    0.11
0.103    0.14
0.11     0.09
0.118    0.1
0.093    0.06
0.104    0.11
0.107    0.07
0.108    0.09
0.113    0.05
0.127    0.04
0.134    0.06
0.11     0.08
0.12     0.08
0.13     0.18
0.108    0.16
0.115    0.26
0.137    0.17
0.119    0.27

When I said same sample, what I actually meant was paired sample. Each location was measured by 2 different pieces of measuring equipment. The objective was to verify whether we can use both pieces of equipment and get the same results, so we ran a simple linear regression. The r value was only 0.15, which suggests that there is poor correlation. I suspect the problem here is that the 2 sets have unequal variances.
By the way, the data is about roughness measurements.

Expecting your valuable inputs.
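As a cross-check on the figures quoted in this thread, here is a minimal standard-library Python sketch that recomputes the means, SDs, the Pearson correlation, and the paired t statistic from the data above. The helper names `pearson_r` and `paired_t` are hypothetical, and the pairing of rows is assumed to be exactly as posted.

```python
import math
import statistics

# Roughness readings copied from the post above: one row per location,
# Set A from the first instrument and Set B from the second.
set_a = [0.107, 0.111, 0.118, 0.103, 0.11, 0.118, 0.093, 0.104, 0.107, 0.108,
         0.113, 0.127, 0.134, 0.11, 0.12, 0.13, 0.108, 0.115, 0.137, 0.119]
set_b = [0.13, 0.12, 0.11, 0.14, 0.09, 0.1, 0.06, 0.11, 0.07, 0.09,
         0.05, 0.04, 0.06, 0.08, 0.08, 0.18, 0.16, 0.26, 0.17, 0.27]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_t(x, y):
    """Paired t statistic: mean difference over its standard error."""
    d = [b - a for a, b in zip(x, y)]
    return statistics.fmean(d) / (statistics.stdev(d) / math.sqrt(len(d)))


mean_a, mean_b = statistics.fmean(set_a), statistics.fmean(set_b)
sd_a, sd_b = statistics.stdev(set_a), statistics.stdev(set_b)
r = pearson_r(set_a, set_b)
t = paired_t(set_a, set_b)

print(f"mean A = {mean_a:.4f}, mean B = {mean_b:.4f}")
print(f"SD A = {sd_a:.6f}, SD B = {sd_b:.4f}")
print(f"Pearson r = {r:.2f}")
print(f"paired t = {t:.2f} (19 degrees of freedom)")
```

This should reproduce the posted means and SDs, a weak correlation close to the r quoted above, and a paired t statistic well below the two-sided 5% critical value of about 2.09 for 19 degrees of freedom. The t-test and the correlation answer different questions: the first compares average readings, while the second asks whether the instruments rank the locations the same way.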

Bev D · Nov 9, 2007

Re: Correlation using t-test and correlation coefficient - What is the correct result

You shouldn't use the paired t-test for repeated measurements of the same thing. This is a gage study. BUT this isn't the problem with your data.

As suggested, you should always plot your data; that will tell you the answer, and the statistical analysis merely confirms what you see. Doing the statistical analysis without looking at a plot of the data leaves you blind to the answer.

You cannot use your 2 instruments interchangeably: they give you very different results. The difference in the standard deviation of the 2 data sets is real and critical to understanding that the devices are NOT interchangeable. I cannot tell which one is correct.

The attached spreadsheet shows the data plotted on a square scatter plot. The X axis is the first measurement and the Y axis is the second measurement.
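For readers without the spreadsheet, the disagreement can also be seen numerically. The sketch below is not the analysis in the attachment; it swaps in a Bland-Altman style summary (mean difference and approximate 95% limits of agreement) as one hedged way to quantify how far apart the paired readings are. Variable names are hypothetical, and the data is copied from the post above.

```python
import statistics

# Paired roughness readings as posted earlier in the thread.
set_a = [0.107, 0.111, 0.118, 0.103, 0.11, 0.118, 0.093, 0.104, 0.107, 0.108,
         0.113, 0.127, 0.134, 0.11, 0.12, 0.13, 0.108, 0.115, 0.137, 0.119]
set_b = [0.13, 0.12, 0.11, 0.14, 0.09, 0.1, 0.06, 0.11, 0.07, 0.09,
         0.05, 0.04, 0.06, 0.08, 0.08, 0.18, 0.16, 0.26, 0.17, 0.27]

# Per-location difference between the two instruments (B minus A).
diffs = [b - a for a, b in zip(set_a, set_b)]

# Average disagreement (bias) and the spread of the disagreement.
bias = statistics.fmean(diffs)
sd_diff = statistics.stdev(diffs)

# Approximate 95% limits of agreement, assuming roughly normal differences.
lower = bias - 1.96 * sd_diff
upper = bias + 1.96 * sd_diff

# Typical size of a reading, for scale.
typical = statistics.fmean(set_a + set_b)
largest_gap = max(abs(d) for d in diffs)

print(f"bias (B - A): {bias:.4f}")
print(f"limits of agreement: {lower:.4f} to {upper:.4f}")
print(f"typical reading: {typical:.4f}, largest single gap: {largest_gap:.3f}")
```

The average difference is close to zero, which is why the paired t-test sees nothing. The individual differences, however, are about as large as the readings themselves, which is consistent with the conclusion that the two instruments are not interchangeable.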
