Profound Statistical Concepts

Profound Statistical Concepts 2019-10-31

Watchcat

Quite Involved in Discussions
#51
Thanks, Steve. This is quite fascinating, not at all what I was expecting. It's a little hard to read on my screen, so for others:

The P-value method of testing hypotheses has received widespread acceptance in the research community, but the editors of the journal Basic and Applied Social Psychology took a dramatic stance when they said that they would no longer publish articles that included P-values. In an editorial, David Trafimow and Michael Marks stated their belief that “the P-value bar is too easy to pass and sometimes serves as an excuse for lower quality research.” David Trafimow stated that he did not know which statistical method should replace the use of P-values.
Many reactions to the P-value ban acknowledged that although P-values can be misused and misinterpreted, their use as a valuable research tool remains.


Apparently this was in early 2015. It seems they also banned confidence intervals.
 

Bev D

Heretical Statistician
Staff member
Super Moderator
#52
So if you are trying to understand what the alternatives to p values and the null hypothesis thing are, and you believe in being the change, let’s change the conversation to the alternatives rather than continue fruitless discussions about whether or not scientific and medical journals are moving away from p values.

Deming’s “On Probability as the Basis for Action” is the best place to start your research.

What I do is explained in the resource that started this thread. But I will provide a brief explanation. As Deming proposed I use well developed study designs that are based on theoretical and empirical knowledge of the underlying physics and geometry of the system. Then I use probability to determine the actual confidence that the results and conclusions are sound.

One of the two probability tests I use is based on John Tukey’s paper: “A Quick, Compact, Two Sample Test to Duckworth’s Specifications” published in Technometrics, Vol. 1, February 1959. Basically you get 3 independent replicates (a tad redundant I know but this part is essential) created for each of 2 levels. These levels can be any type of levels: current and new, high and low, supplier 1 and supplier 2, formulation 1 and formulation 2, equipment 1 and equipment 2…Combination math says that if all 3 replicates at level 1 are higher than all 3 replicates at level 2, then there is only a 5% chance of this occurring simply by chance. Increasing the number of replicates will reduce the alpha risk. And most importantly I GRAPH the results. This provides visual evidence of the size of the difference and the probability of the difference – or ‘equivalence’. These replicates can be individual things or subgroups of things.
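
For readers who want to check the "combination math", here is a minimal sketch. It assumes no ties and that, if the two levels do not really differ, every ordering of the six results is equally likely. The function name `no_overlap_probability` is made up for illustration, and this covers only the complete-separation case described above, not the full end-count procedure in Tukey's paper.

```python
from math import comb


def no_overlap_probability(n_per_level: int, either_direction: bool = False) -> float:
    """Chance that every replicate at one level beats every replicate at the
    other, if there is no real difference between the levels.

    Assumption: no ties, so each way of choosing which n of the 2n results
    are the "top" ones is equally likely.
    """
    if n_per_level < 1:
        raise ValueError("need at least one replicate per level")
    arrangements = comb(2 * n_per_level, n_per_level)  # e.g. C(6, 3) = 20
    favourable = 2 if either_direction else 1
    return favourable / arrangements


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, no_overlap_probability(n), no_overlap_probability(n, True))
```

With 3 replicates per level there are C(6, 3) = 20 equally likely arrangements, so a complete separation in the direction predicted beforehand has a 1 in 20 (5%) chance. If a separation in either direction would have counted, the chance doubles to 10%. Going to 4 replicates per level gives 1 in 70.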

The other test I use is also based on probability. It is used on a matched pair test and is explained in the resource.

I hope this helps you further your understanding of alternatives.
 

Watchcat

Quite Involved in Discussions
#53
It does seem that everything that can be said about p-values has been said. It also seems that a discussion of the alternatives would be helpful to anyone who is planning to use statistical tools.

As for me, as I said in my reply to Steve, if I were to have the opportunity to use something better than a p-value (i.e. be the change) I'd be very happy to do so, but that strikes me as unlikely. That's because I don't design studies, I read about them in journals. So far, in the journals I read, p-values still rule. As long as this is the case, not much point in me personally understanding alternatives. I think it would be great if the people who design these studies and those who review them understood the alternatives; I gather that's what the statistical community is working on, and more power (no pun intended) to them.

I'm always interested in gaining a better understanding of (and refreshing my memory on) the pros and cons of p-values, since that's what I see in the journals I read. I'm also always interested in understanding trends (statistical and otherwise) in scientific/medical journals, because those are the journals I read.
 

Ronen E

Problem Solver
Staff member
Super Moderator
#54
I don't know if this is considered "an alternative" but I think it's something worth looking at.

It seems they also banned confidence intervals.
Confidence intervals for what?... If someone "rejects" the concept of confidence intervals as a whole, they'd be rejecting a whole big chunk of statistical inference, I think.
 

Watchcat

Quite Involved in Discussions
#55
Confidence intervals for what?
Don't know. Maybe Bev D knows. There are discussions of this "dramatic" move around the internet, if you are interested in learning more.

As for me, based on Steve's post, now I understand...a lot, so now I'm ready to move on. Thanks to everyone for contributing their considerable knowledge and insights to this interesting and enlightening discussion.
 

Bev D

Heretical Statistician
Staff member
Super Moderator
#56
Confidence intervals for what?... If someone "rejects" the concept of confidence intervals as a whole, they'd be rejecting a whole big chunk of statistical inference, I think.
For a while some were promoting confidence intervals instead of p values. Confidence intervals are marginally better - if you graph them - but they still fall victim to the same misuses that the p value does. Yes, it does indict much of inferential statistics, which is exactly Deming's point. Probability has use, inferential statistics not so much...

I do use confidence intervals for categorical data in my graphs, but since I always have replication in my studies the overlap of the intervals is irrelevant - I use basic probability theory for determining if a difference exists or it doesn't. Since there is really only 1 data point per subgroup for categorical data, the confidence intervals serve to display the relative sample sizes of the different subgroups....
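
A rough sketch of that last point, that for categorical data the interval width mostly reflects the subgroup sample size. The post does not say which interval is used, so the Wilson score interval here is an assumption, and the function name and counts are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for a proportion (assumed method, not from the post)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Same observed proportion (10%), very different subgroup sizes.
    for successes, n in ((2, 20), (20, 200)):
        low, high = wilson_interval(successes, n)
        print(f"{successes}/{n}: {low:.3f} to {high:.3f}")
```

Both subgroups show the same observed 10%, but the interval for the subgroup of 20 is much wider than the one for the subgroup of 200, which is what makes the relative sample sizes visible on a graph.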
 

Ronen E

Problem Solver
Staff member
Super Moderator
#57
For a while some were promoting confidence intervals instead of p values. Confidence intervals are marginally better - if you graph them - but they still fall victim to the same misuses that the p value does. Yes, it does indict much of inferential statistics, which is exactly Deming's point. Probability has use, inferential statistics not so much...

I do use confidence intervals for categorical data in my graphs, but since I always have replication in my studies the overlap of the intervals is irrelevant - I use basic probability theory for determining if a difference exists or it doesn't. Since there is really only 1 data point per subgroup for categorical data, the confidence intervals serve to display the relative sample sizes of the different subgroups....
I start to feel the sea floor dropping beneath my feet, that is, I feel I'm quickly getting out of my depth... I'm not a professional statistician, just an engineer with a keen (practical) interest in statistics/probability & maths.

I'll just ask a practical question: In their book Statistical Intervals (2nd Ed. 2017), Meeker, Hahn & Escobar argue that Confidence Intervals for quantiles are "similar but better" (my words) than Tolerance Intervals for the same Coverage. Given that you (seemingly) reject the use of Confidence Intervals in general, does that mean you also think Tolerance Intervals are seriously flawed (i.e. similar to p-values)?
 

Bev D

Heretical Statistician
Staff member
Super Moderator
#58
Miner can provide a better explanation than I can, but I’ll give it a start. Confidence intervals (not to be confused with confidence LEVELS) display the amount of precision of the AVERAGE of a data set. Tolerance intervals display the likely range of some portion of the INDIVIDUAL VALUES of the population. Two very different things. I don’t use tolerance intervals.
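
In symbols, as a sketch that assumes normally distributed data, with sample mean \bar{x}, sample standard deviation s and sample size n (the notation is mine, not from any reference in this thread):

```latex
% Confidence interval for the AVERAGE: the width shrinks like 1/sqrt(n)
\bar{x} \;\pm\; t_{1-\alpha/2,\;n-1}\,\frac{s}{\sqrt{n}}

% Two-sided tolerance interval for INDIVIDUAL VALUES:
% covers at least a proportion p of the population with confidence 1-\alpha
\bar{x} \;\pm\; k(n,\,p,\,1-\alpha)\; s
```

As n grows, the first interval collapses toward the true average, while the second settles at roughly the spread that contains the proportion p of individual values. The factor k comes from tables or software and depends on n, p and the confidence level.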
 

Miner

Forum Moderator
Staff member
Admin
#59
Miner can provide a better explanation than I can, but I’ll give it a start. Confidence intervals (not to be confused with confidence LEVELS) display the amount of precision of the AVERAGE of a data set. Tolerance intervals display the likely range of some portion of the INDIVIDUAL VALUES of the population. Two very different things. I don’t use tolerance intervals.
Without going into painful detail, you gave a good summary of the differences. I do not use tolerance intervals either. In industries that have stringent requirements, they have very limited application.
 

Miner

Forum Moderator
Staff member
Admin
#60
I have stayed out of the p-value discussion till now, but finally couldn't resist jumping in. I completely agree that p-values have a high potential for misunderstanding and misuse by many people. I believe that much of the misuse arises from the lack of planning and poor experimental design. A key element is to determine in advance the correct sample size to detect a practically significant effect size. When this is not done, we see ridiculous claims of significance for things that have such a small effect size that no one cares (e.g., stop eating "x" and it will add two days to your life expectancy). This casts frequentist statistics in a bad light. However, we must also remember the many successes this approach has had since its inception. Those cannot be discounted simply because people have misused it. Remember that people have also misused Bayesian methods to the point that many courts have barred their use, not because the method doesn't work, but because it is so easily misused by inexperienced practitioners. Both methods have merit when used by people that know what they are doing, and have a great potential for misuse when used by people that do not. This is true of all tools. We frequently see SPC misunderstood and misused in this forum. No statistical method can stand alone, but must be supported by the scientific method and an understanding of the processes involved.
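
To make the sample-size point concrete, here is a minimal sketch. It assumes a two-sided comparison of two group means with equal group sizes and uses the normal approximation (a t-based calculation gives slightly larger numbers for small samples). The function name and the default alpha and power values are illustrative choices, not anything prescribed above.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate replicates needed per group to detect a standardized
    difference `effect_size` (difference in means divided by the standard
    deviation), using the normal approximation."""
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided alpha risk
    z_beta = std_normal.inv_cdf(power)           # power = 1 - beta risk
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


if __name__ == "__main__":
    for d in (1.0, 0.5, 0.1):
        print(f"effect size {d}: about {n_per_group(d)} per group")
```

The required sample size grows with the inverse square of the effect size. Run in reverse, the same arithmetic shows why a very large study can flag as "significant" an effect far too small to matter in practice.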
 