Control Chart and Limits For Large Sample Size



Hello guys,

How do you build a control chart when the subgroup sample size is large, say 150, without disregarding the within-subgroup (or within-sample) variation? What type of control chart should I use, and how do I compute the control limits? Thanks.

Tim Folkerts

Super Moderator
The fact that the subgroup size is large does not invalidate the various control charts. In fact, everything else being equal, more data is almost always better than less data!

With large subgroups, S charts give a much better estimate of the standard deviation than R charts would, so I agree with Benjamin's suggestion for Xbar-S charts.

With such a large subgroup, you will get very narrow control limits on the X-bar chart. However, when you are averaging so many points, the average should be quite close to the true mean. If the X-bar chart shows many out-of-control (OOC) points, that means the subgroup averages really do differ from one another by more than the random within-subgroup variation alone would predict. With big subgroups, you can detect small shifts. Keep in mind that small shifts that are statistically significant may not be economically significant.
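To put rough numbers on how narrow the limits get, here is a minimal sketch. It assumes the usual 3-sigma X-bar limits and a within-subgroup standard deviation of 1.0 (a made-up value for illustration); the function name `xbar_half_width` is hypothetical.

```python
import math

def xbar_half_width(sigma_within, n):
    """Half-width of 3-sigma X-bar limits for subgroups of size n."""
    return 3.0 * sigma_within / math.sqrt(n)

# Assumed within-subgroup sigma of 1.0, purely for illustration.
for n in (5, 25, 150):
    print(n, round(xbar_half_width(1.0, n), 3))
```

With n = 150 the limits sit at roughly a quarter of a standard deviation from the center line, so even a shift that small in the process mean will eventually show up as OOC.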

If you expect some variations between subgroups in addition to the variations within a subgroup, then you might calculate the average for each subgroup and plot that set of points on an I-MR chart. Basically you are admitting that there is long-term variation that is not accounted for within the subgroups, and you are just looking at the long-term trends. An OOC point on the I chart tells you that a subgroup's mean shows an unusual amount of variation (and the usual variation is already larger than predicted within the subgroups).
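A minimal sketch of that approach, assuming the standard I-MR constants for a moving range of two points (2.66 for the individuals chart, 3.267 for the MR chart upper limit). The function names and the dictionary layout are hypothetical, not taken from any particular SPC package.

```python
import statistics

def imr_limits(values):
    """I-MR chart limits from the average moving range of consecutive points."""
    if len(values) < 2:
        raise ValueError("need at least two points to form a moving range")
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = statistics.fmean(values)
    mr_bar = statistics.fmean(moving_ranges)
    return {
        "i_cl": center,
        "i_lcl": center - 2.66 * mr_bar,   # 2.66 = 3 / d2, with d2 = 1.128
        "i_ucl": center + 2.66 * mr_bar,
        "mr_cl": mr_bar,
        "mr_ucl": 3.267 * mr_bar,          # D4 for a moving range of 2
    }

def chart_subgroup_means(subgroups):
    """Reduce each subgroup to its mean, then treat the means as individuals."""
    means = [statistics.fmean(g) for g in subgroups]
    return means, imr_limits(means)
```

Here the limits come from how much the subgroup means move from one subgroup to the next, not from the spread inside each subgroup, so it still makes sense to keep an S chart alongside to watch the within-subgroup variation.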

Tim F


If we have large data sets of continuous individual data, e.g. more than 500 values, should we still use the estimated standard deviation, or can we use the ordinary standard deviation calculated over the whole data set to compute the UCL and LCL?

In our case, the UCL/LCL based on the estimated standard deviation (CL +/- 2.66 x MR-bar) are more conservative than limits based on the actual standard deviation of the whole population.
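The two estimates being compared can be sketched as below. This assumes individual values in time order and uses made-up data with a step shift halfway through; the function names are hypothetical. Note that 2.66 is just 3 / 1.128, where 1.128 is d2 for a moving range of two points.

```python
import statistics

def sigma_from_moving_range(values):
    """Short-term sigma estimate: average moving range divided by d2 = 1.128."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    return statistics.fmean(moving_ranges) / 1.128

def sigma_overall(values):
    """Ordinary sample standard deviation over the whole data set."""
    return statistics.stdev(values)

# Hypothetical data: 250 points around 10, then a step shift to around 12.
stable = [10.0 + 0.1 * (-1) ** i for i in range(250)]
shifted = [12.0 + 0.1 * (-1) ** i for i in range(250)]
values = stable + shifted

print("MR-based sigma:", round(sigma_from_moving_range(values), 3))
print("overall sigma: ", round(sigma_overall(values), 3))
```

In this made-up case the overall standard deviation absorbs the shift and comes out several times larger than the moving-range estimate, so limits built from it would be wide enough to hide the very shift the chart is meant to flag.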

Bev D

Heretical Statistician
Staff member
Super Moderator
The phrase 'estimated standard deviation' is a misnomer. You must still calculate the control limits using the within-subgroup standard deviation. This is how SPC works.

If the process is homogeneous, the within-subgroup standard deviation adjusted by d2 or c4 will be a good estimate of the population standard deviation (hence the term 'estimate', but using that term instead of 'within-subgroup SD' leads to many misinterpretations). If the process is homogeneous, the subgroup averages will vary within the grand average +/- 3(within SD/c4)/sqrt(n). If the process is not homogeneous, the averages will not comply with the calculated limits. That is how SPC works. Trying to calculate the population SD directly sidesteps this critical part of the process.
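For anyone who wants to see that calculation spelled out, here is a minimal sketch for equal-size subgroups. It assumes the textbook c4 formula and the usual B3/B4-style limits for the S chart; the function names and the returned dictionary keys are hypothetical.

```python
import math
import statistics

def c4(n):
    """Bias-correction constant c4 for the sample standard deviation."""
    # lgamma keeps the gamma-function ratio numerically safe for large n.
    return math.sqrt(2.0 / (n - 1)) * math.exp(
        math.lgamma(n / 2.0) - math.lgamma((n - 1) / 2.0)
    )

def xbar_s_limits(subgroups):
    """X-bar and S chart limits from the average within-subgroup SD."""
    n = len(subgroups[0])
    if any(len(g) != n for g in subgroups):
        raise ValueError("all subgroups must have the same size")
    means = [statistics.fmean(g) for g in subgroups]
    sds = [statistics.stdev(g) for g in subgroups]
    grand_mean = statistics.fmean(means)
    s_bar = statistics.fmean(sds)
    c = c4(n)
    # grand average +/- 3 * (within SD / c4) / sqrt(n)
    half_width = 3.0 * (s_bar / c) / math.sqrt(n)
    # Relative 3-sigma spread of S around s_bar (equivalent to B3 and B4).
    s_spread = 3.0 * math.sqrt(1.0 - c * c) / c
    return {
        "xbar_cl": grand_mean,
        "xbar_lcl": grand_mean - half_width,
        "xbar_ucl": grand_mean + half_width,
        "s_cl": s_bar,
        "s_lcl": max(0.0, 1.0 - s_spread) * s_bar,
        "s_ucl": (1.0 + s_spread) * s_bar,
    }
```

For n = 150, c4 is about 0.998, so the correction barely matters; what drives the narrow X-bar limits is the division by sqrt(n).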

I am curious, though. What do you mean by having large data sets with n > 500? Are the data sets your 'subgroups'? Or are they for different processes? Can you describe what you are trying to do?
