Apply control limits to a non-normal distribution

Nuno Sardo

Registered
Hello

I have a process parameter that is controlled this way:

If the equipment signal response is below a certain level, we know for sure that the result will be lower than 10% and thus reported as <10.

However, if the signal response is a little higher, then I have to actually quantify the value. In these cases, I usually get results around 5-6, which are still below 10.

My data distribution is something like this:

Sample size - 126
Analysis result | Number of events | Cumulative Count | Cumulative %
<10 | 97 | 97 | 77%
5 | 16 | 113 | 90%
6 | 9 | 122 | 97%
7 | 3 | 125 | 99%
8 | 1 | 126 | 100%

What would be the best distribution to fit this data? Exponential? Weibull? Gamma?

The purpose is to define an upper control limit, like the ones used for normal distributions in Six Sigma (average+3sigma). How should I proceed in this case?

Thanks in advance for your support
 

Mike S.

Happy to be Alive
Trusted Information Resource
Who cares what the distribution is? A process behavior chart will work regardless.
 

Nuno Sardo

Registered
Yes, but since it isn't a normal distribution, how should I define my control limit?

My data has a significant positive skew (77% of the occurrences are below the equipment signal response threshold), so I don't think that simply applying average+3sigma as a control limit would be correct in this case.
 

Bev D

Heretical Statistician
Leader
Super Moderator
Are you creating a control chart or looking for the upper limit of the distribution? These are different tasks and require different approaches.
 

Steve Prevette

Deming Disciple
Leader
Super Moderator
(@Nuno Sardo) What would be the best distribution to fit this data? Exponential? Weibull? Gamma? The purpose is to define an upper control limit, like the ones used for normal distributions in Six Sigma (average+3sigma). How should I proceed in this case?

It is a common misconception in Six Sigma that SPC only works on normal distributions. The original work by Dr. Shewhart in the 1930s, and more recent publications by Dr. Don Wheeler, provide proofs that SPC is distribution-free. So don't worry so much about the distribution of the data; plot the dots and evaluate using an X-moving range or Xbar-R chart.
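For anyone who wants to see the arithmetic behind an X-moving range chart, here is a minimal Python sketch. It uses the usual XmR scaling constants (2.66 for the individuals limits, 3.268 for the moving range limit). The function name `xmr_limits` and the readings are made up for illustration; they are not the poster's data, whose time order is not given in the thread.

```python
from statistics import mean

def xmr_limits(values):
    """Centre line and limits for an individuals / moving range (XmR) chart."""
    if len(values) < 2:
        raise ValueError("need at least two points to form a moving range")
    # Moving ranges: absolute differences between consecutive points in time order.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = mean(values)
    mr_bar = mean(moving_ranges)
    return {
        "center": x_bar,
        "ucl_x": x_bar + 2.66 * mr_bar,  # 2.66 = 3 / d2, with d2 = 1.128 for n = 2
        "lcl_x": x_bar - 2.66 * mr_bar,
        "mr_bar": mr_bar,
        "ucl_mr": 3.268 * mr_bar,        # some tables list this constant as 3.267
    }

# Hypothetical readings in time order, for illustration only.
readings = [5, 6, 5, 5, 7, 5, 6, 5, 8, 5]
limits = xmr_limits(readings)
for name, value in limits.items():
    print(f"{name}: {value:.2f}")
```

Note that the limits come from the average moving range, not from the overall standard deviation, so no distributional fit is needed to compute them.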
 

Nuno Sardo

Registered
Hey guys, thank you for the discussion and help so far.

I tried to keep my question simple (I am not good at that) and I probably omitted some important information.

This parameter is always analyzed first against reference Standard A. If the response signal is lower than 5% of that reference, then the value must be automatically disregarded, and the parameter should be reported as “<10” (specification limit of this parameter) – end of analysis.

However, if the response signal is greater than 5% of Standard A, then it must actually be quantified. This quantification requires the use of Standard B as a reference. Standard B is a different compound from Standard A, not simply a different concentration of the same standard.

This procedure is established by guidelines and is not an in-house method, so there is no option of doing things differently, even if they sound kind of weird.

(@supadrai) Is your signal response (assuming it is any value below "5") due to reaching the detection limit?
There is always signal above noise level, so it is above the detection limit. I guess a signal response of 5% of the Standard A reference can be considered the quantification limit of the method.

However, there is no clear relationship between this and a quantified value using Standard B. In other words, it is not clear what exactly the result quantified using Standard B as reference would be if the signal response of the sample was exactly 5% of Standard A. For example, this year the lowest value quantified was “5”, but last year I had some samples with this parameter quantified as “4” and I could even find a “3” in older data.

(@Bev D) Are you creating a control chart or looking for the upper limit of the distribution?
Kind of both. The main objective is to determine the upper control limit for this parameter, based on the recorded values, and evaluate the process capability. I have the control chart, still lacking average and control limits:

Control Chart.png

I know that the green dots are actually higher than “0”, since the signal response is always above noise level. They should also be lower than “5”, since I have several samples that needed to be quantified using Standard B as reference and gave that value. However, I also have records of “4” and even one “3” in previous years.

As a result, I know that the average would lie somewhere between 0 and 5, but I am not sure where. How should I calculate the average in this case? Is it even correct to do so, since two distinct reference standards are used?

And how about the control limit? Is average+3sigma still applicable since my data is positively skewed?

Thanks for your help,
 

Bev D

Heretical Statistician
Leader
Super Moderator
Try an I-MR chart and exclude the values that are below 5% of Standard A. Treat these as null values, not zeroes.
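To show why null versus zero matters, here is a rough Python sketch using the counts from the table in the first post. The "<10" results are stored as `None` and dropped before averaging; the zero treatment is included only for contrast. The variable names are illustrative. Because the table does not give the time order of the results, the sketch stops at the average and does not attempt moving ranges or limits.

```python
from statistics import mean

# Counts of quantified results, taken from the table in the first post.
quantified_counts = {5: 16, 6: 9, 7: 3, 8: 1}
n_below_threshold = 97  # results reported as "<10" (below 5% of Standard A)

# Expand the counts into individual results; "<10" results become None (null).
results = [None] * n_below_threshold
for value, count in quantified_counts.items():
    results.extend([value] * count)

# Null treatment: drop the unquantified results before averaging.
quantified = [r for r in results if r is not None]
mean_nulls_excluded = mean(quantified)

# Zero treatment, for contrast only: every "<10" result counted as 0.
mean_zeros = mean([0 if r is None else r for r in results])

print(f"points charted with nulls excluded: {len(quantified)} of {len(results)}")
print(f"average with nulls excluded: {mean_nulls_excluded:.2f}")
print(f"average with '<10' as zero:  {mean_zeros:.2f}")
```

Counting the 97 unquantified results as zeroes drags the centre line well below any value that was ever actually measured, which is the reason for treating them as missing instead.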
 

supadrai

Lawyer
@Nuno Sardo - So you have no way to determine the values represented by <10, yet you know they are, without exception, non-zero values and <5.

Would it be reasonable to assume that if you had the rest of the 97 data points, they would be normally distributed about some mean? I keep looking at what appears to be some nice tail data and it would be a shame not to put that to use.

EDIT: I guess there's a name for what I am thinking and a regression for modeling it. Left censored data. Tobit regression. Section 3.3 of the attached seems to be what I'm thinking. But far far out of my depth now. :)

EDIT: and thanks to the kind folks at /r/askstatistics - a nice corner of reddit
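
A rough sketch of the left-censored idea, in plain Python: with no predictor variables, a Tobit model reduces to fitting a normal distribution by maximum likelihood, where each censored point contributes the probability of falling below the censoring limit. Everything here rests on assumptions that the thread does not confirm: that the underlying values are normal, that every "<10" result lies below 5 (the older "4" and "3" results suggest the cutoff is not that sharp), and that rounding to whole numbers can be ignored. The optimiser is a simple compass search chosen to avoid extra libraries, and all names are made up for illustration.

```python
import math

CENSOR_LIMIT = 5.0                      # ASSUMPTION: every "<10" result is below 5
N_CENSORED = 97                         # results reported as "<10"
QUANTIFIED = {5: 16, 6: 9, 7: 3, 8: 1}  # value -> count, from the first post

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def log_likelihood(mu, sigma):
    """Log-likelihood of a normal model left-censored at CENSOR_LIMIT."""
    # Censored points: only "value < limit" is known.
    p_censored = max(normal_cdf((CENSOR_LIMIT - mu) / sigma), 1e-300)
    ll = N_CENSORED * math.log(p_censored)
    # Quantified points: ordinary normal log-density.
    for value, count in QUANTIFIED.items():
        z = (value - mu) / sigma
        ll += count * (-math.log(sigma) - 0.5 * math.log(2.0 * math.pi) - 0.5 * z * z)
    return ll

def fit_censored_normal(mu=4.0, sigma=2.0, step=1.0, tol=1e-6):
    """Maximise the log-likelihood with a basic compass (pattern) search."""
    best = log_likelihood(mu, sigma)
    while step > tol:
        improved = False
        for d_mu, d_sigma in ((step, 0), (-step, 0), (0, step), (0, -step)):
            cand_mu, cand_sigma = mu + d_mu, sigma + d_sigma
            if cand_sigma <= 0:
                continue  # sigma must stay positive
            cand = log_likelihood(cand_mu, cand_sigma)
            if cand > best:
                mu, sigma, best, improved = cand_mu, cand_sigma, cand, True
        if not improved:
            step /= 2.0  # no better neighbour: refine the search
    return mu, sigma

mu_hat, sigma_hat = fit_censored_normal()
upper_limit = mu_hat + 3.0 * sigma_hat
print(f"mu = {mu_hat:.2f}, sigma = {sigma_hat:.2f}, mu + 3 sigma = {upper_limit:.2f}")
```

The resulting mu + 3 sigma is an estimate of the upper end of the fitted distribution, which is a different thing from a control chart limit, and it is only as good as the normality and cutoff assumptions above.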
 

Attachments

  • TOBIT-2.pdf