First calculation of control limits

Wujinn

Starting to get Involved
Let's say that I have a process where SPC was not used in the past, so I collect 100 measurements and I calculate LCL and UCL. My charts look as follows:
[Chart: First calculation of control limits]

And now I'm wondering what is a reasonable approach to decide which points are outliers that should not be considered when calculating control limits.
In this example, point 47 is equal to 41, which is just above the LCL. It affects the final LCL and UCL, yet falls within the control limits.

How do we decide which points should be treated as a special cause and excluded, so that we don't have too wide a tolerance at the beginning of data collection when proven and tested control limits are not available?

Is an outlier test a good idea?
Do you check histograms?
 

Bev D

Heretical Statistician
Leader
Super Moderator
First, start reading the works of Donald Wheeler at SPC.com. A great first article is “The Right and Wrong Ways of Computing Limits”.

To directly answer your first question: DO NOT perform an outlier test. That is why the control limits are there. Even though point 47 is close to the lower limit, it is not below it. Removing it from the calculation will not substantially change your limits.
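
To make the limit calculation concrete, here is a minimal Python sketch. It assumes an individuals (XmR) chart, which the thread does not confirm, and computes the limits from the average moving range with the usual 2.66 scaling factor. The function name `xmr_limits` and the data are made up for illustration; they are not the poster's measurements.

```python
from statistics import mean

def xmr_limits(values):
    """Limits for an individuals (XmR) chart.

    Uses the average moving range and the scaling factor 2.66
    (3 / d2, with d2 = 1.128 for moving ranges of two points).
    Returns (lower limit, center line, upper limit).
    """
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    # Absolute differences between consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    mr_bar = mean(moving_ranges)
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar

# Made-up data with one low point (42) that stays inside the limits.
data = [50, 52, 49, 51, 50, 48, 42, 50, 51, 49, 52, 50]
lcl, cl, ucl = xmr_limits(data)
print(f"LCL={lcl:.2f}  CL={cl:.2f}  UCL={ucl:.2f}")
```

Because the limits come from point-to-point variation rather than from a test applied to individual values, a single low point such as the 42 here widens them only modestly, and it still has to be judged against the limits it helped to set.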

There is really no reason to check a histogram of the data. SPC does not depend on any particular distribution; it doesn’t need a Normal distribution. Also read Wheeler’s “Myths about Shewhart’s Control Charts”.

Assignable or special causes should be validated by physics and knowledge of what happened, not just by whether or not a point violates one of the control rules. A rule violation may instead be a signal that you have chosen the incorrect control chart or subgrouping scheme. In the case you show, it appears that you have chosen the correct control chart.

This may be a simple mis-wording, but it is worth reminding everyone that control limits are not tolerances. A point that violates any of the control rules does not mean that parts are out of spec. Out-of-spec parts can be in control, and out-of-control parts can be in spec.
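
A small Python sketch of that distinction, with hypothetical numbers: the control limits and specification limits below are invented for illustration, and `describe_point` is not from any SPC library. The point is only that the two checks are independent of each other.

```python
def describe_point(value, lcl, ucl, lsl, usl):
    """Report separately whether a value lies inside the control limits
    and whether it lies inside the specification limits."""
    inside_control = lcl <= value <= ucl
    inside_spec = lsl <= value <= usl
    return inside_control, inside_spec

# Hypothetical limits, chosen so the two ranges only partly overlap.
LCL, UCL = 41.0, 59.0   # calculated from process data
LSL, USL = 45.0, 70.0   # set by the drawing or the customer

print(describe_point(43.0, LCL, UCL, LSL, USL))  # (True, False)
print(describe_point(62.0, LCL, UCL, LSL, USL))  # (False, True)
```

The first value is inside the control limits but out of spec; the second is outside the control limits but in spec.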

SPC is neither simple nor easy. It is not a ‘cut and paste’ thing. You need to study it, understand it and learn, learn, learn…it is as complex as any other technical concept.
 

Miner

Forum Moderator
Leader
Admin
Fully agree with @Bev D. The only question that could not be answered is whether you have definitively chosen rational subgroups. Your chart appears to be appropriate, but as Bev stated, you have to consider the physics of the process as well. Read Dr. Wheeler's articles on rational subgroups:
 

Semoi

Involved In Discussions
In this example, point 47 is equal to 41, which is just above the LCL. It affects the final LCL and UCL, yet falls within the control limits.
Although the standard deviation is not robust in the mathematical sense -- i.e. bounded no matter how large the outlier becomes -- it is rather robust in the practical sense: it is not greatly affected as long as the outlier is not "too" large. Just recalculate without the data point and convince yourself that this is the case.
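
That check can be sketched in a few lines of Python. The data are made up (100 values cycling around 50, with point 47 set to 41 to mimic the question), and `sd_with_and_without` is a hypothetical helper. The sample standard deviation is used here only to illustrate the robustness argument, not as a recommendation for how to compute limits.

```python
from statistics import stdev

def sd_with_and_without(values, index):
    """Return (sd of all values, sd with values[index] left out)."""
    trimmed = values[:index] + values[index + 1:]
    return stdev(values), stdev(trimmed)

# Made-up data: 100 points cycling around 50, then point 47 set to 41.
data = [46, 50, 54, 48, 52] * 20
data[46] = 41  # index 46 is point 47 in 1-based numbering

sd_all, sd_trimmed = sd_with_and_without(data, 46)
change = (sd_all - sd_trimmed) / sd_all
print(f"sd with point 47: {sd_all:.2f}, without: {sd_trimmed:.2f}, "
      f"relative change: {change:.1%}")
```

On this made-up data the standard deviation drops by roughly 4% when the point is left out. With far fewer than 100 points, a single value would carry much more weight.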

How do we decide which points should be treated as a special cause and excluded
Special causes are not excluded; they are investigated. This is the key idea behind SPC charts. If the responsible person is not willing to investigate and thus optimise the process, there is no point in generating SPC charts.
In my experience it is helpful to use a point such as 47 for the first investigation. Talk to the people who perform the measurement and to the operators who change the input parameters on the manufacturing machines. You might learn that the real process differs from the one you expected. You might need to optimise your sampling scheme.

Is an outlier test a good idea?
No. SPC already performs this test by using the 3 sigma control limits. Using a second method will just provide a second result, and you end up discussing which result is "correct". You then spend your time performing calculations and discussing them, when you should be spending it on optimising the processes.
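
As a rough sketch of what "the chart performs the test" means, the hypothetical helper below lists the points that fall outside given control limits so that they can be investigated. The data and limits are invented, and only the basic points-beyond-limits rule is covered, not run rules.

```python
def points_to_investigate(values, lcl, ucl):
    """Return the 1-based numbers of points outside the control limits."""
    return [i for i, v in enumerate(values, start=1) if v < lcl or v > ucl]

# Made-up measurements and limits; 42 is low but inside the limits.
data = [50, 52, 49, 61, 50, 48, 42, 50]
print(points_to_investigate(data, lcl=41.0, ucl=59.0))  # [4]
```

The result is a list of points to ask questions about, not a list of points to delete.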

Do you check histograms?
SPC charts work without histograms, so it is not necessary to plot the dataset in a histogram. However, a histogram might help you understand the underlying process. Thus, my advice is:
a) Plot the dataset in several different ways (histograms, boxplot, multiple regression and qq plots of the residuals etc.) as each method highlights different components/aspects of the dataset.
b) Don't get lost in performing the statistical analysis; focus your attention on the practical conclusion and/or countermeasure.
 