Profound Statistical Concepts

2019-10-31

Watchcat

Trusted Information Resource
#51
Thanks, Steve. This is quite fascinating, not at all what I was expecting. It's a little hard to read on my screen, so for others:

The P-value method of testing hypotheses has received widespread acceptance in the research community, but the editors of the journal Basic and Applied Social Psychology took a dramatic stance when they said that they would no longer publish articles that included P-values. In an editorial, David Trafimow and Michael Marks stated their belief that “the P-value bar is too easy to pass and sometimes serves as an excuse for lower quality research.” David Trafimow stated that he did not know which statistical method should replace the use of P-values.
Many reactions to the P-value ban acknowledged that although P-values can be misused and misinterpreted, their use as a valuable research tool remains.


Apparently this was in early 2015. It seems they also banned confidence intervals.
 

Bev D

Heretical Statistician
Staff member
Super Moderator
#52
So if you are trying to understand what the alternatives to p values and the null hypothesis thing are, and you believe in being the change, let’s change the conversation to the alternatives rather than to fruitless discussions about whether or not scientific and medical journals are moving away from p values.

Deming’s “On Probability as the Basis for Action” is the best place to start your research.

What I do is explained in the resource that started this thread. But I will provide a brief explanation. As Deming proposed, I use well-developed study designs that are based on theoretical and empirical knowledge of the underlying physics and geometry of the system. Then I use probability to determine the actual confidence that the results and conclusions are sound.

One of the two probability tests I use is based on John Tukey’s paper: “A Quick, Compact, Two Sample Test to Duckworth’s Specifications” published in Technometrics, Vol. 1, February 1959. Basically you get 3 independent replicates (a tad redundant I know but this part is essential) created for each of 2 levels. These levels can be any type of levels: current and new, high and low, supplier 1 and supplier 2, formulation 1 and formulation 2, equipment 1 and equipment 2… Combination math says that if all 3 replicates at level 1 are higher than all 3 replicates at level 2, then there is only a 5% chance of this occurring simply by chance (it is 1 of the 20 equally likely ways to split 6 results into two groups of 3). Increasing the number of replicates will reduce the alpha risk. And most importantly I GRAPH the results. This provides visual evidence of the size of the difference and the probability of the difference – or ‘equivalence’. These replicates can be individual things or subgroups of things.
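For anyone who wants to check the combination math, here is a minimal Python sketch. It covers only the complete-separation case described above, not Tukey's full end-count procedure, and the function names are my own. It assumes the direction of the difference is named in advance; counting complete separation in either direction would double the figure.

```python
from itertools import combinations
from math import comb


def complete_separation_probability(n_per_level: int) -> float:
    """Chance that every replicate at a pre-specified level ranks above every
    replicate at the other level when the two levels do not really differ.
    Under that assumption all C(2n, n) splits of the ranks are equally likely,
    and exactly one of them puts the chosen level entirely on top."""
    return 1 / comb(2 * n_per_level, n_per_level)


def brute_force_check(n_per_level: int) -> float:
    """Same quantity, obtained by enumerating every possible split of ranks."""
    ranks = range(2 * n_per_level)
    top_ranks = set(range(n_per_level, 2 * n_per_level))
    splits = list(combinations(ranks, n_per_level))
    hits = sum(1 for split in splits if set(split) == top_ranks)
    return hits / len(splits)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, complete_separation_probability(n), brute_force_check(n))
```

With 3 replicates per level this is 1 in 20; with 4 it is 1 in 70, which is the sense in which more replicates reduce the alpha risk.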

The other test I use is also based on probability. It is used on a matched pair test and is explained in the resource.

I hope this helps you further your understanding of alternatives.
 

Watchcat

Trusted Information Resource
#53
It does seem that everything that can be said about p-values has been said. It also seems that a discussion of the alternatives would be helpful to anyone who is planning to use statistical tools.

As for me, as I said in my reply to Steve, if I were to have the opportunity to use something better than a p-value (i.e. be the change) I'd be very happy to do so, but that strikes me as unlikely. That's because I don't design studies, I read about them in journals. So far, in the journals I read, p-values still rule. As long as this is the case, not much point in me personally understanding alternatives. I think it would be great if the people who design these studies and those who review them understood the alternatives; I gather that's what the statistical community is working on, and more power (no pun intended) to them.

I'm always interested in gaining a better understanding of (and refreshing my memory on) the pros and cons of p-values, since that's what I see in the journals I read. I'm also always interested in understanding trends (statistical and otherwise) in scientific/medical journals, because those are the journals I read.
 

Ronen E

Problem Solver
Staff member
Moderator
#54
I don't know if this is considered "an alternative" but I think it's something worth looking at.

Watchcat said: "It seems they also banned confidence intervals."
Confidence intervals for what?... If someone "rejects" the concept of confidence intervals as a whole, they'd be rejecting a whole big chunk of statistical inference, I think.
 

Watchcat

Trusted Information Resource
#55
Ronen E said: "Confidence intervals for what?"
Don't know. Maybe Bev D knows. There are discussions of this "dramatic" move around the internet, if you are interested in learning more.

As for me, based on Steve's post, now I understand...a lot, so now I'm ready to move on. Thanks to everyone for contributing their considerable knowledge and insights to this interesting and enlightening discussion.
 

Bev D

Heretical Statistician
Staff member
Super Moderator
#56
Ronen E said: "Confidence intervals for what?... If someone "rejects" the concept of confidence intervals as a whole, they'd be rejecting a whole big chunk of statistical inference, I think."
For a while some were promoting confidence intervals instead of p values. Confidence intervals are marginally better - if you graph them - but they still fall victim to the same misuses that the p value does. Yes, it does indict much of inferential statistics, which is exactly Deming's point. Probability has use, inferential statistics not so much...

I do use confidence intervals for categorical data in my graphs, but since I always have replication in my studies the overlap of the intervals is irrelevant - I use basic probability theory for determining if a difference exists or it doesn't. Since there is really only 1 data point per subgroup for categorical data, the confidence intervals serve to display the relative sample sizes of the different subgroups....
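To illustrate how an interval on categorical data reflects subgroup size, here is a rough Python sketch. The choice of the Wilson score interval is an assumption for illustration, since the post does not say which interval is used, and the counts are made-up examples.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, n: int, confidence: float = 0.95):
    """Wilson score interval for a proportion (one common choice; assumed here)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Hypothetical subgroups with the same observed 10% rate but different sizes.
    for successes, n in ((2, 20), (20, 200), (200, 2000)):
        low, high = wilson_interval(successes, n)
        print(f"n={n:5d}  interval=({low:.3f}, {high:.3f})  width={high - low:.3f}")
```

The observed rate is identical in each row, so the only thing the interval width is telling the reader is how big the subgroup was.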
 

Ronen E

Problem Solver
Staff member
Moderator
#57
I start to feel the sea floor dropping beneath my feet, that is, I feel I'm quickly getting out of my depth... I'm not a professional statistician, just an engineer with a keen (practical) interest in statistics/probability & maths.

I'll just ask a practical question: In their book Statistical Intervals (2nd Ed. 2017), Meeker, Hahn & Escobar argue that Confidence Intervals for quantiles are "similar but better" (my words) than Tolerance Intervals for the same Coverage. Given that you (seemingly) reject the use of Confidence Intervals in general, does that mean you also think Tolerance Intervals are seriously flawed (i.e. similar to p-values)?
 

Bev D

Heretical Statistician
Staff member
Super Moderator
#58
Miner can provide a better explanation than I can, but I’ll give it a start. Confidence intervals (not to be confused with confidence LEVELS) display the amount of precision of the AVERAGE of a data set. Tolerance intervals display the likely range of some portion of the INDIVIDUAL VALUES of the population. Two very different things. I don’t use tolerance intervals.
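A small Python sketch of that distinction, under stated assumptions: the confidence interval uses a normal (z) approximation rather than the t distribution, the tolerance interval is the distribution-free one formed by the sample minimum and maximum, and the data are simulated rather than real.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(data, confidence: float = 0.95):
    """Approximate (z-based) confidence interval for the AVERAGE."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    centre = mean(data)
    half_width = z * stdev(data) / sqrt(len(data))
    return centre - half_width, centre + half_width


def minmax_coverage_confidence(n: int, coverage: float) -> float:
    """Confidence that the sample (min, max) contains at least `coverage` of
    the INDIVIDUAL VALUES of a continuous population (distribution-free)."""
    return 1 - n * coverage ** (n - 1) + (n - 1) * coverage**n


if __name__ == "__main__":
    rng = random.Random(0)  # simulated data, for illustration only
    for n in (10, 100, 1000):
        data = [rng.gauss(50.0, 5.0) for _ in range(n)]
        low, high = mean_confidence_interval(data)
        print(f"n={n:4d}  CI width for the mean={high - low:6.3f}  "
              f"min..max range={max(data) - min(data):6.3f}  "
              f"confidence min..max covers 95%={minmax_coverage_confidence(n, 0.95):.3f}")
```

The interval for the average keeps shrinking as the sample grows, while the range of individual values does not shrink; a larger sample only raises the confidence that the range covers the stated portion of the population.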
 

Miner

Forum Moderator
Staff member
Admin
#59
Without going into painful detail, you gave a good summary of the differences. I do not use tolerance intervals either. In industries that have stringent requirements, they have very limited application.
 

Miner

Forum Moderator
Staff member
Admin
#60
I have stayed out of the p-value discussion till now, but finally couldn't resist jumping in. I completely agree that p-values have a high potential for misunderstanding and misuse by many people. I believe that much of the misuse arises from a lack of planning and poor experimental design. A key element is to determine in advance the correct sample size to detect a practically significant effect size. When this is not done, we see ridiculous claims of significance for things that have such a small effect size that no one cares (e.g., stop eating "x" and it will add two days to your life expectancy). This casts frequentist statistics in a bad light. However, we must also remember the many successes this approach has had since its inception. Those cannot be discounted simply because people have misused it. Remember that people have also misused Bayesian methods to the point that many courts have barred their use, not because the methods don't work, but because they are so easily misused by inexperienced practitioners. Both methods have merit when used by people that know what they are doing, and have a great potential for misuse when used by people that do not. This is true of all tools. We frequently see SPC misunderstood and misused in this forum. No statistical method can stand alone, but must be supported by the scientific method and an understanding of the processes involved.
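As a rough illustration of determining the sample size in advance, here is a Python sketch using the textbook normal-approximation formula for comparing two means, n per group = 2 * ((z_alpha + z_beta) / d)^2. The function name, the defaults of alpha = 0.05 and 80% power, and the use of a standardized effect size d are assumptions for the example, not anything prescribed in this thread; a t-based calculation gives slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size: float,
                          alpha: float = 0.05,
                          power: float = 0.80) -> int:
    """Approximate sample size per group for a two-sided, two-sample
    comparison of means (normal approximation, equal group sizes).
    `effect_size` is the smallest difference worth detecting, expressed
    in standard deviations."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


if __name__ == "__main__":
    # Illustrative effect sizes only: large, medium, small, tiny.
    for d in (1.0, 0.5, 0.2, 0.05):
        print(f"effect size {d:4.2f} sd -> about {sample_size_per_group(d)} per group")
```

Run the other way around, the same arithmetic shows why an enormous data set will flag a tiny effect as significant: the sample size, not the practical importance, is doing the work.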
 