|
This thread is carried over and continued in the Current Elsmar Cove Forums
|
The New Elsmar Cove Forums
Statistical Techniques and 6 Sigma
Deming vs Statistical Hypothesis Testing
|
| Author | Topic: Deming vs Statistical Hypothesis Testing |
|
Marc Smith Cheech Wizard Posts: 4119 |
Newsgroups: misc.industry.quality
Subject: Re: Statistical Hypothesis Testing
Date: Mon, 17 Apr 2000 01:33:56 GMT
Organization: Deja.com - Before you buy.

Greetings John,

You said: Deming said: "The student should avoid passages in books that treat confidence intervals and tests of significance, as such calculations have no application in analytic problems in science and industry." (W. Edwards Deming, Out of the Crisis, page 639.)

Deming was advocating a doctrine, still current among statistical/management gurus, that distinguishes between analytic methods and enumerative methods. I can't make sense of it myself, even though I have tried. Advocates of that doctrine classify statistical hypothesis testing as an enumerative method - the kiss of death. I consider the analytic versus enumerative distinction to be a false dichotomy.

Also in "Out of the Crisis", Deming said: "Analysis of variance, t-test, confidence intervals, and other statistical techniques taught in the books, however interesting, are inappropriate because they bury the information contained in the order of production." (W. Edwards Deming, Out of the Crisis, page 132.)

Lastly, in "Out of the Crisis", Deming said: "... But a confidence interval has no operational meaning for prediction, hence provides no degree of belief in planning." (W. Edwards Deming, Out of the Crisis, page 132.)

I can't refrain from writing one last quote, this one from Ernest Hemingway: "In order to be a great writer a person must have a built-in, shockproof crap detector."

Sincerely, Stan Hilliard ============= IP: Logged |
|
Marc Smith Cheech Wizard Posts: 4119 |
Newsgroups: misc.industry.quality
Subject: Re: Statistical Hypothesis Testing
Date: Mon, 24 Apr 2000 04:01:31 GMT
Organization: Deja.com - Before you buy.

Greetings John,

Here is a snippet from page 132 of "Out of the Crisis" that might clarify what Deming meant by analytic techniques: "...analytic problems - planning for improvement of tomorrow's run, next year's crop"

Later, on the same page, he continues his attack on hypothesis tests by criticizing the concept of statistical significance: "Degree of belief cannot be quantified as 0.8, 0.9, 0.95, 0.99. So-called probability levels of significance between method 1 and method 2 do not provide any measure of degree of belief for planning -- i.e., for prediction."

MY ANALYSIS -- The level of significance (alpha) is the complement of the numbers Deming lists: (1-0.8=0.2), (1-0.9=0.1), (1-0.95=0.05), (1-0.99=0.01). More importantly, I believe that Deming's comment about the concept of significance is a red herring. His "degree of belief" is vague whereas "statistical significance" is a precise scientific concept. The significance level (alpha) is the probability of a type 1 error -- the rejection of a null hypothesis (H0) when it is true. I think that the most precise way to describe statistical significance is with an "if-then" statement. That is, IF the null hypothesis is true, and you were to perform the hypothesis test repeatedly (using a new sampling of data from the same population each time), THEN H0 would be rejected in a proportion alpha of those tests.

MY CONCLUSION -- This is not a trivial issue because Deming and his disciples have influenced the management of many corporations in this matter, who in turn influence what training is available to their engineers. Deming apparently invented his own theory of enumerative versus analytic studies and used it to explain what was wrong with hypothesis testing. I don't think he provided any data to support his claim. I don't think he demonstrated his point.

Sincerely, Stan Hilliard =========
IP: Logged |
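The "if-then" statement in the post above can be checked by simulation. The following is only a rough sketch, not anything from the original thread: it swaps in a two-sided z-test with a known standard deviation (rather than a t-test) so that it runs on the Python standard library alone, and the function names, sample size, trial count and seed are all illustrative assumptions.

```python
import random
from statistics import NormalDist, mean

def z_test_rejects(sample, mu0, sigma, alpha):
    """Two-sided z-test of H0: population mean == mu0, with sigma assumed known."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical

def rejection_rate(alpha, trials=20000, n=10, mu0=0.0, sigma=1.0, seed=1):
    """Fraction of repeated tests that reject H0 when H0 is actually true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # A fresh sample from the same population each time; H0 holds by construction.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        if z_test_rejects(sample, mu0, sigma, alpha):
            rejections += 1
    return rejections / trials

# The alpha values are the complements of the numbers quoted from Deming.
for alpha in (0.2, 0.1, 0.05, 0.01):
    print(alpha, round(rejection_rate(alpha), 3))
```

Each printed rejection rate should land close to its alpha, which is all the if-then statement claims. It says nothing about how well the result predicts tomorrow's run, which is the point Deming was arguing.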
|
Don Winton Forum Contributor Posts: 498 |
Personally, I find it hard to believe Dr. Deming actually objected to significance tests. He was, after all, a trained statistician. He also advocated having at least one trained statistician on the staff of organizations (I read that somewhere; do not remember where). But, just as any tool used incorrectly can ruin the effort (a screwdriver as a chisel, for example), significance tests can ruin information (t-testing on a 10 second cycle mold machine in a production environment).

During the days when the modern quality movement was born, virtually everyone saw statistical quality control as a savior. Everyone and their brother (sister) wanted everyone else trained in SPC. The problem was, while everyone knew it worked, few had any idea WHY or HOW it worked. Thus, charts and graphs were everywhere. People began applying these concepts to everything from production runs to 1st article manufacturing. The right tool for the right job?

I guess what I am saying is this: Significance tests could be used just as effectively as the ...

An example may be this: While qualifying a machine for production, a first run sample of 20 parts was produced and determined to have an acceptable process capability, say for example 1.0. This particular machine was an injection molding machine using a four-cavity mold. All was declared well and good with the world. But, shortly after entering production, the charts were all over the place. Cycle time was adjusted, material changed, temperatures adjusted, but to no avail. The original 20 parts were measured again. Sure enough, the data were the same. The staff quality engineer happened to be in the lab and asked, "Is there a difference between the cavities and/or the cycle data?" This question was treated with all the enthusiasm of Oliver Twist asking for more soup. "A little, but not much," was the reply. The data were presented to the quality engineer and examined. While the differences between cycles were not significant, the differences between cavities were!

The above story has flaws, but it presents a point. Without significance testing, used correctly, SPC may not be adequate. Deming was trying to deliver his message to the masses, not just a few specialists. Significance testing was for the specialists, thus not the audience of his broad message.

One other thing I would like to comment on. There are those that take Deming's words as the absolute truth. I am a Deming disciple, and even I know this is not so.

Regards, IP: Logged |
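To make the cavity story concrete, here is a small sketch of a one-way analysis of variance run on the same 20 parts, grouped two ways. The measurements are invented for illustration only, and the layout of 5 cycles by 4 cavities is an assumption (the post gives only the totals: 20 parts, four cavities). The function name `f_statistic` is likewise hypothetical.

```python
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F ratio: between-group mean square over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Made-up data: each row is one molding cycle, each column one cavity.
by_cycle = [
    [10.01, 10.05, 9.98, 10.10],
    [10.00, 10.06, 9.97, 10.11],
    [10.02, 10.04, 9.99, 10.09],
    [9.99, 10.05, 9.98, 10.10],
    [10.01, 10.06, 9.97, 10.12],
]
by_cavity = [list(column) for column in zip(*by_cycle)]  # transpose rows to columns

f_cycle = f_statistic(by_cycle)    # are the cycles different from one another?
f_cavity = f_statistic(by_cavity)  # are the cavities different from one another?
print("F between cycles:  ", round(f_cycle, 3))
print("F between cavities:", round(f_cavity, 1))
```

With these numbers the cycle-to-cycle F ratio is far below 1 while the cavity-to-cavity ratio is very large; each would still be judged against an F table for its own degrees of freedom. Pooling all 20 parts into one capability figure hides exactly this structure.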
|
Kevin Mader Forum Wizard Posts: 575 |
Don, I believe that Deming made his comment about having a statistician on board a number of times; I can recall that it was in both Out of the Crisis and The New Economics. As you are well aware, he always promoted learning from a Master and not a hack.

I have also wondered why Deming, as a Doctor of Statistics, made his comments rejecting Tests of Hypothesis. I must admit that when I studied advanced statistics, the concepts of using One-tail/two-tail tests seemed pretty reasonable. I still think that there is plenty of value in these types of tests. However, after reading a paper by John Kitteridge on the differences of Analytical Problems and Enumerative Studies, I realized what was missing from the examples provided by Juran in his books. They did not speak about the system, the method, or predictability.

In general, I believe that the masses are exposed to the superficial 5% of anything and begin to believe it, regardless of what the other 95% of the information speaks of. We are eager to jump to conclusions and Deming was aware of this folly. With this in mind, when I read the areas of Out of the Crisis noted in Marc's post, I keep in mind that Deming was probably making these statements realizing that the masses do not have the knowledge or understanding of the essence behind the curves and tails. I know I have a limited understanding. I also imagine that a PhD in Statistics is probably in a better position than me to make the assessment. I am a Hack by Deming's standards (but happily a work-in-progress)!

Here is something that always gets a nerve: baseball statistician. Could there be a more severe bastardization of the term statistician? Baseball statistics are enumerative. The furthest I would go is to say that they marginally satisfy the Law of Averages. Yet folks trust these numbers beyond their worth. Baseball managers make decisions based on the numbers. What they are doing is tampering.

Some folks make desk top calendars such as the one I have. Filled with tidbits of information, none of which has predictive powers. Fun: yes. Useful: not very. But errors such as this type of interpretation are rampant in most walks of life. People do not really know much about a claim in the news stating that crime is down 10%. What changed? What test would one use to validate the results? Not enough is known in most cases, and we hastily jump to conclusions. It may be safe now to leave your house unlocked. Do you suppose? Who will win the election? What does the data support? How about the latest Census? I believe Deming might be disappointed if he were still living.

Your advice about balance and caution is, as usual, sound. Balance in most everything I suppose is a good rule of thumb. Well, enough of my ramble, plus it is time to go to bed.

Regards, Kevin IP: Logged |
|
bevdaniels unregistered |
Well, Deming was not particularly well known for his ability to articulate his thoughts very clearly in written form. Only Taguchi was worse - and he had the whole English-Japanese thing going against him : ) I'm not much better, but I'll take a shot at it.

Disclaimer/Qualifier: While I have not studied Deming in great depth, I have a lot of experience in hypothesis testing in various manufacturing environments.

I think there are several points to be made about "significance" and "confidence" levels as used in t-tests, F-tests, etc. Also, too many people don't understand the true statistical meaning of significance and confidence: they apply a layman's (dictionary) interpretation to them...this is not correct. These terms have very specific, narrow meanings. The user should understand them and use them correctly.

As for the use of independent samples - I've seen that most people have little clue what this means. Many people assume that piece to piece variation is the largest component of variation and therefore assume that sequential pieces are independent. However, if lot to lot or set-up to set-up variation is the largest component, then sequential pieces are definitely NOT independent and the test has given a wrong answer regardless of the statistical levels chosen (garbage in, garbage out)...

Deming has some good advice hidden in his statement about run order: this is the best way to determine what constitutes independent samples. Looking at the run order of any hypothesis test also protects against spurious associations (The real root cause "lines" up with your test samples...) Any hypothesis test is particularly exposed to this if the tests are not randomized and/or the test levels are confounded with other factors. Way too common in my experience.

The last issue is that the classical t-test et al. is based on summary data and can be influenced by a single extreme data point that skews the average of one of the test levels, providing another false answer... One must still look at the individual points for validation that reality matches the "summary" data conclusion of the statistical test.

Bottom line: Always plot your data in time sequence AND look at the individual data points for nonrandom or strange patterns! (Unfortunately, most stats packages don't automatically do this - you have to deliberately do it yourself. I use Excel.)

Unfortunately too many "trainers" and authors spend far too little time on the above topics (there's no math in there and it's tough to get articles published that don't have lots of formulas!). They spend most of their time on the mechanics of the statistics and not on how to design the test properly and collect the appropriate data... t-tests and F-tests will do the same job IF you plot the data properly and use your logic to interpret the results - don't just rely on the p value! IP: Logged |
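One simple numerical companion to "plot your data in time sequence" is to count runs above and below the median and compare the count with what a random ordering would give. This is only a sketch of that idea under stated assumptions: the post does not name a runs test, the readings are invented, and `runs_about_median` is a hypothetical helper. It does not replace looking at the plot.

```python
from statistics import median

def runs_about_median(values):
    """Count runs above/below the median in time order; return (observed, expected)."""
    med = median(values)
    signs = [v > med for v in values if v != med]  # drop points exactly on the median
    n_above = sum(signs)
    n_below = len(signs) - n_above
    if n_above == 0 or n_below == 0:
        raise ValueError("need points on both sides of the median")
    observed = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    # Expected number of runs if the order were random.
    expected = 2 * n_above * n_below / (n_above + n_below) + 1
    return observed, expected

# Made-up readings: the same twelve values in two different production orders.
shifted = [5.1, 5.0, 5.2, 5.1, 5.0, 5.2, 5.6, 5.7, 5.5, 5.6, 5.7, 5.5]  # set-up change halfway
mixed = [5.1, 5.6, 5.0, 5.7, 5.2, 5.5, 5.6, 5.1, 5.7, 5.0, 5.5, 5.2]    # no visible pattern

print("shifted:", runs_about_median(shifted))
print("mixed:  ", runs_about_median(mixed))
```

Both lists have identical averages and standard deviations, so any test built only on summary data treats them the same. The run count is what separates them: far fewer runs than expected points to a shift between set-ups, which means sequential pieces were not independent.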
|
John C Forum Contributor Posts: 134 |
The last three submissions seem to be on the right lines despite quotations listed earlier. Those ones seem to contradict others that I have seen. Below are quotes from notes I made, from a book about Deming (I have lost the source so I can't stand over them absolutely). It seems to me that Deming is not against hypothesis testing but is concerned about statisticians having insufficient knowledge of the process and, consequently, applying hypothesis tests to unstable processes or to data which has no statistical relevance.

Here's a sample:

'Formulating a hypothesis and comparing against practise, is fundamental' 'understand differences between enumerative and analytical problems. Statistical Theory is vital for tests and experiments which is an analytical problem'. I'm guessing that these relate to the fact that enumerative data can not predict. I don't claim to understand it.

'It is only in the state of statistical control that statistical theory aids prediction. An experiment in an unstable system yields data that can only be interpreted by knowledge of the subject matter.' So, first you need to establish the process is stable. Can you apply hypothesis to an unstable system? Can you make a system stable by applying a hypothesis?

'Statisticians must understand the system and how statistical theory can help optimise it.'

All this implies that Deming is concerned about misuse of the statistics, as he is in many other areas, to the point where he 'abolishes' things. He abolishes: mass inspection, lowest tender contracts, fear, barriers, exhortations, numerical targets, appraisal, short term profits, etc. It seems to me that we should think of his 'abolition' of hypothesis testing in the same way: it's not the literal interpretation, but the idea behind it that he is getting at. My best shot at what he means is: 'Too many of you are running around with a solution, trying to find a problem to fit it. Stop doing that. Consider the problem in detail, understand it and then apply the right solution' which, in many cases, is not hypothesis but the search for special causes.

rgds, John C [This message has been edited by John C (edited 06 July 2000).] IP: Logged |
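"First you need to establish the process is stable" is usually done with a control chart. As a hedged illustration, the sketch below computes limits for an individuals (XmR) chart, one common choice that the post itself does not name. The readings are invented and the helper names are hypothetical; 2.66 is the usual scaling constant for a moving range of two consecutive points.

```python
from statistics import mean

def xmr_limits(values):
    """Natural process limits for an individuals (XmR) chart, values in time order."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    spread = 2.66 * mean(moving_ranges)  # 2.66 = 3 / d2, with d2 = 1.128 for ranges of two
    return center - spread, center + spread

def out_of_limits(values):
    """Indices (in time order) of points that fall outside the limits."""
    lower, upper = xmr_limits(values)
    return [i for i, v in enumerate(values) if v < lower or v > upper]

# Made-up readings in production order.
stable = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.1, 10.0, 9.9, 10.1]
upset = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.1, 12.0, 10.0, 9.9]

print("stable:", out_of_limits(stable))
print("upset: ", out_of_limits(upset))
```

A point outside the limits is a signal to go looking for a special cause before running any hypothesis test on the data, which is the order of work the quoted notes seem to recommend.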
|
dnorthcutt Lurker (<10 Posts) Posts: 1 |
Deming gives a very detailed explanation of the difference between Analytic and Enumerative in his book, Some Theory of Sampling, (c 1950 Wiley, currently available through Dover). The entire 7th chapter deals with this topic. The following quote from the opening paragraph summarizes the distinction:

"In the enumerative problem something is to be done to some portion of the contents of the bowl regardless of the reasons why that portion is so large or so small. In the analytic problem, on the other hand, something is to be done to regulate and predict the cause system that has produced the universe (city, market, lot of industrial product, crop of wheat) in the past and will continue to produce it in the future."

Deming goes on to show that the two types of studies have different sampling errors. In an enumerative study, the sampling error can in fact be reduced to zero by conducting a complete census. In an analytic study, even a census still leaves us with sampling error, because the point of the study is to make a statement about the cause system, not the particular lot in question.

To make statements about analytic studies, one needs multiple observations over time. Additionally, as has been mentioned already, the time order is critical, and failure to account for ordering can result in faulty analyses. I think these issues are what led Deming to be so down on simple hypothesis testing. IP: Logged |
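The claim that a census drives the enumerative sampling error to zero can be seen in the textbook standard error for a mean sampled without replacement from a finite lot. This sketch uses that standard formula, not anything taken from Deming's chapter; the function name and the lot size are illustrative assumptions.

```python
import math

def standard_error_of_mean(population_sd, n, population_size):
    """Standard error of the sample mean for n units drawn without replacement
    from a finite lot of population_size units.

    population_sd is the standard deviation of the whole lot (N divisor).
    """
    fpc = math.sqrt((population_size - n) / (population_size - 1))  # finite population correction
    return population_sd / math.sqrt(n) * fpc

lot_size = 500  # hypothetical lot
for n in (50, 250, 500):
    print(n, round(standard_error_of_mean(2.0, n, lot_size), 4))
```

When n equals the lot size the correction factor is zero, so the error about this particular lot vanishes. Nothing in the formula speaks to the cause system that will produce the next lot, which is why the analytic question needs observations over time instead.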
|
johno unregistered |
A lot of hypothesis testing that I run into seems to have conclusions of 'accept' or 'don't accept', which seems strange if one is willing to say 'don't accept' at 94% and 'accept' at 95%. A p value seems to make more sense, where one can say 'it would be significant at 94%' and let the customer decide if that is ok. IP: Logged |
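The difference between a fixed accept/reject call and a reported p value is easy to show. The sketch below assumes a two-sided z-test and a made-up test statistic of 1.90; the function names are hypothetical and only the standard library is used.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Two-sided p value for a standard normal test statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def decision(p, alpha):
    """The binary call made at a fixed significance level."""
    return "reject H0" if p < alpha else "fail to reject H0"

z = 1.90  # hypothetical test statistic
p = two_sided_p_value(z)
print("p value:", round(p, 4))
for alpha in (0.06, 0.05):
    print("confidence", f"{1 - alpha:.0%}:", decision(p, alpha))
```

The same data give opposite calls at 94% and 95%, while the p value itself carries the information and leaves the threshold to whoever has to act on the result.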
All times are Eastern Standard Time (USA)
