How to justify Widened Control Limits - No Assignable Cause scenario
wchuey 28th March 2007, 10:49 AM Hi fellow quality practitioners,
I am not sure if any one of you has encountered this problem before.
Based on the AIAG manual, control limits for SPC charts are set at +/- 3 sigma. However, as the process improves, the sigma decreases and the control limits start to get closer and closer to the centre line with each review.
I am currently in a situation where points fall outside the control limits with no assignable cause (the only logical explanation is that the control limits are too tight). The Cpk of the process is greater than 4.
Is there a way in which I can justify widening the control limits?
My colleague suggested increasing the control limits to +/- 6 sigma, since the Cpk obtained at 6 sigma is greater than 2. I have always known the formula for Cpk as (USL-mean)/3sigma and was surprised to hear that when the control limits are set at +/- 6 sigma, the Cpk for the process changes to (USL-mean)/6sigma. Is this a statistically sound method?
Darius 28th March 2007, 11:37 AM My colleague suggested increasing the control limits to +/- 6 sigma, since the Cpk obtained at 6 sigma is greater than 2. I have always known the formula for Cpk as (USL-mean)/3sigma and was surprised to hear that when the control limits are set at +/- 6 sigma, the Cpk for the process changes to (USL-mean)/6sigma. Is this a statistically sound method?
IMHO, a big NO to control limits at +/- 6 sigma, but I agree with the /6sigma change if you are going to use them. The reason is that Cpk divides the distance from the mean to the nearest spec limit by half of the process spread (3 sigma in the original equation; if you dare to redefine the half-spread as 6 sigma, the divisor has to change with it).
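For anyone who wants to see the two divisors side by side, here is a minimal sketch. The conventional Cpk uses 3 sigma regardless of where the control limits are drawn; the /6sigma version is simply half of it and is not a standard index. The function name `cpk`, the parameter `k` and all the numbers below are assumptions for illustration, not data from this thread.

```python
def cpk(mean, sigma, lsl=None, usl=None, k=3.0):
    """Distance from the mean to the nearest spec limit, divided by k * sigma.

    k=3 gives the conventional Cpk. Any other k is a non-standard rescaling,
    included here only to compare against the "/6sigma" suggestion.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    distances = []
    if usl is not None:
        distances.append(usl - mean)
    if lsl is not None:
        distances.append(mean - lsl)
    if not distances:
        raise ValueError("at least one spec limit is required")
    return min(distances) / (k * sigma)


# Hypothetical wire pull process with a lower spec limit only (grams).
mean, sigma, lsl = 12.0, 0.5, 5.0
cpk_3 = cpk(mean, sigma, lsl=lsl)          # conventional definition
cpk_6 = cpk(mean, sigma, lsl=lsl, k=6.0)   # the "/6sigma" variant
print(round(cpk_3, 2), round(cpk_6, 2))
```

Note that moving the control limits does not enter the calculation at all: the index only changes because the divisor was redefined.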
Your problem sounds like a chemical batch process or something similar, where the autocorrelation is so high that it makes the control limits too tight. I once saw a chemical batch with a concentration measured every 15 minutes. The variation between consecutive points was very small, but because the specs were wide, no adjustment was needed until the value reached some point (the LSL), so the chart looked like a sawtooth and the control limits were almost equal to the mean, leaving too many points (almost all of them) outside the control limits. There are 3 ways (4 if you like your first approach) to skin the fish.
1- Take more time between measurements.
2- Add autocorrelation to your chart.
3- Take the trend into account.
Options 1, 2 and 3 are in order of complexity. My favorite is number 2. Although some people may suggest a bigger sample size (call it option 1.5), IMHO that just hides the behaviour.
Look up the autocorrelation factor; Donald Wheeler wrote about it in his SPC books.
The autocorrelation factor affects the estimate of variation in an IXMR chart (individuals with moving range).
x    y
__   __
s1   s2
s2   s3
s3   s4
r^2 = (n Sxy - Sx Sy)^2 / ((n Sx^2 - (Sx)^2) * (n Sy^2 - (Sy)^2))
(S denotes a sum over the n pairs.) The factor is
Factor = (1 - r^2)^-0.5
SD within sample with autocorrelation = SD within sample without autocorrelation * Factor
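The recipe above can be sketched in a few lines of Python. This is only my reading of the post: each sample is paired with its successor, r^2 is the squared correlation of those pairs, and the factor is (1 - r^2)^-0.5. The function names are made up for illustration.

```python
def lag1_r_squared(samples):
    """Squared correlation between each value (x) and the value after it (y)."""
    x = samples[:-1]
    y = samples[1:]
    n = len(x)
    if n < 2:
        raise ValueError("need at least three samples")
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    numerator = (n * sxy - sx * sy) ** 2
    denominator = (n * sxx - sx ** 2) * (n * syy - sy ** 2)
    if denominator == 0:
        raise ValueError("constant data: correlation is undefined")
    return numerator / denominator


def autocorrelation_factor(samples):
    """Factor = (1 - r^2)^-0.5; undefined when r^2 equals 1 exactly."""
    return (1.0 - lag1_r_squared(samples)) ** -0.5


# Hypothetical readings, not data from this thread.
readings = [0, 1, 0, 0, 1]
factor = autocorrelation_factor(readings)
print(round(factor, 4))
```

The corrected sigma would then be the uncorrected within-sample sigma multiplied by `factor`, which is always at least 1, so the limits can only widen.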
wchuey 28th March 2007, 09:00 PM Hi Darius,
We are using the Xbar-s chart with a subgroup size of 15 with wirepull strength data as inputs.
Steve Prevette 28th March 2007, 09:09 PM Is there a way in which I can justify widening the control limits?
NO.
Are you sure you have calculated the current limits properly?
Now, there is always a chance for a false alarm when control charting. It shouldn't happen very often, but it can and will eventually happen.
Miner 28th March 2007, 09:49 PM Hi Darius,
We are using the Xbar-s chart with a subgroup size of 15 with wirepull strength data as inputs.
Have you tried reducing the subgroup size? You could take an existing chart and model new limits using a subgroup size of 10 and 5 until you get the level of responsiveness that you want.
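To get a feel for how the subgroup size moves the limits, here is a rough sketch using the standard Xbar-s constants (c4, A3, B3, B4). It holds the average subgroup standard deviation fixed while n changes, which is a simplification; with real data you would re-subgroup the readings and recompute s-bar for each n. The grand mean and s-bar values are made up.

```python
import math


def c4(n):
    """Bias-correction constant for the sample standard deviation."""
    if n < 2:
        raise ValueError("subgroup size must be at least 2")
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2.0) / math.gamma((n - 1) / 2.0)


def xbar_s_limits(grand_mean, s_bar, n):
    """3-sigma limits for the Xbar chart and the s chart."""
    c = c4(n)
    a3 = 3.0 / (c * math.sqrt(n))            # Xbar chart constant A3
    spread = 3.0 * math.sqrt(1.0 - c * c) / c
    b3 = max(0.0, 1.0 - spread)              # s chart lower constant B3
    b4 = 1.0 + spread                        # s chart upper constant B4
    return {
        "xbar_lcl": grand_mean - a3 * s_bar,
        "xbar_ucl": grand_mean + a3 * s_bar,
        "s_lcl": b3 * s_bar,
        "s_ucl": b4 * s_bar,
    }


# Hypothetical grand mean and average subgroup standard deviation.
grand_mean, s_bar = 12.0, 0.4
for n in (15, 10, 5):
    limits = xbar_s_limits(grand_mean, s_bar, n)
    print(n, {name: round(value, 3) for name, value in limits.items()})
```

With s-bar held constant, the Xbar limits widen as n shrinks, which is the responsiveness trade-off being described.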
wchuey 28th March 2007, 10:25 PM Hi,
There are 15 wires in a unit, which is why the subgroup size is set to 15. I am looking into the autocorrelation factor as suggested by Darius. This is something new to me and I would appreciate it if you could direct me to any relevant websites. Thanks!
Tim Folkerts 28th March 2007, 11:19 PM You might want to look at this recent thread. It discusses a situation that was quite similar... http://elsmar.com/Forums/showthread.php?t=20694&highlight=150
You might consider a couple different control charts that focus on the variations of most interest to you.
* To check that the wires within each assembly are similar to each other, the S chart for the 15 values would be valuable. (The X-bar chart would have less meaning, since you might well expect it to be out of control due to the variations you see between subgroups.)
* To check that entire assemblies are similar to each other, an I-MR chart of the average value for the 15 wires might be appropriate.
As far as the capability calculation goes, Ppk might be more appropriate than Cpk. In this case, it is pretty clear that there is variation between assemblies beyond the variation seen within each assembly. Since Cpk is based only on the tighter variation within each assembly, it will give an inflated value for the overall capability of the process. Ppk is based on the overall standard deviation, so it gives a better estimate of the overall capability.
Tim F
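A small sketch of the Cpk versus Ppk point, under stated assumptions: the within-subgroup sigma is taken as the square root of the average subgroup variance (valid for equal subgroup sizes; statistical packages may use a different estimator), and the three units of four wires each are invented numbers with tight within-unit spread and large unit-to-unit shifts.

```python
import math
import statistics


def within_and_overall_sigma(subgroups):
    """Return (pooled within-subgroup sigma, overall sigma of all values)."""
    within_var = statistics.mean([statistics.variance(g) for g in subgroups])
    all_values = [v for g in subgroups for v in g]
    return math.sqrt(within_var), statistics.stdev(all_values)


def lower_capability(mean, sigma, lsl):
    """One-sided capability index against a lower spec limit."""
    return (mean - lsl) / (3.0 * sigma)


# Hypothetical data: 3 units, 4 wires each, lower spec limit of 5.
units = [
    [10.0, 10.1, 9.9, 10.0],
    [11.0, 11.1, 10.9, 11.0],
    [9.0, 9.1, 8.9, 9.0],
]
lsl = 5.0
sigma_within, sigma_overall = within_and_overall_sigma(units)
grand_mean = statistics.mean([v for g in units for v in g])
cpk = lower_capability(grand_mean, sigma_within, lsl)
ppk = lower_capability(grand_mean, sigma_overall, lsl)
print(round(cpk, 2), round(ppk, 2))
```

Because the within-unit sigma ignores the shifts between units, the Cpk figure comes out far larger than the Ppk figure for the same data.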
Statistical Steven 29th March 2007, 08:13 AM A simple solution. Use an X/mR chart by plotting the means of the 15 units. The problem seems to be very tight within batch variability. By switching to a X/mR chart you can evaluate batch to batch variability.
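A minimal sketch of that idea: reduce each unit to its mean, then put individuals and moving range limits on the means. The constants 2.66 and 3.267 are the usual ones for a moving range of span 2; the function name and the data are hypothetical.

```python
import statistics


def imr_limits(values):
    """Individuals and moving range chart limits (moving range of span 2)."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = statistics.mean(moving_ranges)
    center = statistics.mean(values)
    return {
        "center": center,
        "lcl": center - 2.66 * mr_bar,   # 2.66 = 3 / d2, with d2 = 1.128
        "ucl": center + 2.66 * mr_bar,
        "mr_ucl": 3.267 * mr_bar,        # D4 for a span of 2
    }


# Hypothetical units (shortened to 3 wires each to keep the example small).
units = [[10.1, 9.9, 10.0], [12.2, 11.9, 11.9], [11.0, 11.1, 10.9], [13.0, 12.9, 13.1]]
unit_means = [statistics.mean(u) for u in units]
limits = imr_limits(unit_means)
print({name: round(value, 3) for name, value in limits.items()})
```

The limits now reflect unit-to-unit variation, since the moving range is taken between consecutive unit means rather than within a unit.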
jeffrey_Chang 29th March 2007, 09:07 AM I am currently in a situation where points fall outside the control limits with no assignable cause (the only logical explanation is that the control limits are too tight). The Cpk of the process is greater than 4.
Hi wchuey,
IMO, with a Cpk > 4, if it is correct, why do you still want to continue monitoring the wire pull? With such good process capability, I would suggest that you stop monitoring the product output (in this case, the wire pull) and consider monitoring the process parameters, i.e. the inputs that affect the wire pull, instead. That would give you more meaningful insight into your process and should still maintain your wire pull capability.
thks.
jeffrey.
Darius 29th March 2007, 01:51 PM Agree with Statistical Steven :agree1:
A simple solution. Use an X/mR chart by plotting the means of the 15 units. The problem seems to be very tight within batch variability. By switching to a X/mR chart you can evaluate batch to batch variability.
Agree with Jeffrey that you need to decide whether such charts are needed at all, although they are needed to determine the capability index correctly. I must say that your friend's comment about the capability index has some good points; most practitioners just use the formulas without thinking.
There is not much on the net about autocorrelation and control limits, I found this free article (thanks to asq).
http://www.asq.org/pub/jqt/past/vol32_issue4/qtec-395.pdf
But I tried to write down how to do it in my last post.
With sample 1 = s1, sample 2 = s2, etc.:
x    y
__   __
s1   s2
s2   s3
...
Obtain r^2 (the squared correlation coefficient) and apply it as I said before. For me it worked like clockwork. And although Wheeler didn't go as far as control limits, IMHO it applies to the capability index too.
Or post some of your data, so that we have something to play with.
Dave Strouse 29th March 2007, 02:27 PM Wchuey,
Take a look at MINITAB Stat > Control Charts > Variables Charts for Subgroups > I-MR-R/S (Between/Within). They implement the solution Tim suggested. You can download a trial version if you don't have the program, and you might be induced to buy it. These charts are designed for exactly this situation.
I have used it in a wafer fab to see across-chip variation and variation between wafers. Worked pretty well.
If you don't have a gauge study on your wire pull, I strongly urge you to do that also. Same job, different process, in which I used the within and between charts: I was quite surprised at the results from the GR&R.
jeffrey_Chang 29th March 2007, 10:25 PM Hi,
This is something new to me and would appreciate if you guys could direct me to any relevant websites. Thanks!
Here's a pretty good article about autocorrelation.
http://www.qualityamerica.com/knowledgecente/articles/PAKautocorrelation.htm
thks.
wchuey 30th March 2007, 02:24 AM Hi everyone,
:thanx: for all the help!!!
The question I have right now is how do I prove that the process is autocorrelated?
I used Stat>TimeSeries>Autocorrelation in Minitab and was stumped by the term "Lag", I just clicked the default value which is 1. What is the significance of this input?
I also followed Minitab's help file to compute the cumulative probability function, once again, I am not sure what to input for the lag and the LBQ value as it appears that I have 148 values! Correspondingly, I also have 148 p-values!:biglaugh:
As extracted from the help file...
Step 1: Compute the cumulative probability function
1 Choose Calc > Probability Distributions > Chi-Square.
2 Choose Cumulative Probability.
3 In Degrees of freedom, enter 6 (the lag of your test).
4 Choose Input constant and enter 56.03 (the LBQ value).
5 In Optional storage, enter Cumprob. This stores the cumulative probability function in a constant named Cumprob. Click OK.
Step 2: Compute the p-value
1 Choose Calc > Calculator.
2 In Store result in variable enter pvalue.
3 In Expression, enter 1 - 'Cumprob'. Click OK.
I have attached the Minitab file as well as the control charts generated.
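The two help-file steps above boil down to p = 1 - CDF of a chi-square distribution evaluated at the LBQ value, with degrees of freedom equal to the lag. As far as I understand it, that is also why there are as many LBQ values and p-values as lags: each lag k gets its own statistic covering lags 1 to k. Below is a rough standard-library sketch; the closed form used only works for an even number of degrees of freedom, which is an assumption made to avoid extra dependencies, and the function name is made up.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with a positive, even df.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!,
    which holds for even degrees of freedom only.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch handles positive even df only")
    if x < 0:
        raise ValueError("x must be non-negative")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Values from the help-file example: lag 6 and LBQ = 56.03.
p_value = chi2_upper_tail_even_df(56.03, 6)
print(p_value)
```

A p-value this small says that the autocorrelations up to lag 6, taken together, are very unlikely to have come from a random series.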
jeffrey_Chang 30th March 2007, 03:22 AM Hi everyone,
:thanx: for all the help!!!
The question I have right now is how do I prove that the process is autocorrelated?
I used Stat>TimeSeries>Autocorrelation in Minitab and was stumped by the term "Lag", I just clicked the default value which is 1. What is the significance of this input?
Lag = Period.
It is the separation between an observation and the one it is compared with. For example, if the sampling interval is 1 min, lag = 1 compares observations 1 min apart, lag = 2 compares observations 2 mins apart, and so on.
From your attached ppt, lag 1 shows signs of statistical significance, as it is outside the 95% confidence bands (the red lines), indicating evidence of non-randomness. ACF values fall between -1 and 1, with values closer to these limits signifying strong autocorrelation. The lag 1 ACF value for your plot is approximately 0.3, which tells me that although it is statistically significant, the departure from randomness (autocorrelation) is relatively mild.
The plot, however, did show other lags outside the confidence limits, but I suspect that some of these are due to noise.
You can try what steven has mentioned to evaluate the batch to batch variation by doing IMR.
thks.
jeffrey.:)
wchuey 30th March 2007, 04:44 AM Hi Jeffrey,
Do I need to re-arrange the data when performing the autocorrelation study?
Right now, the 1st 15 data points belong to the 1st unit, the next 15 belong to the 2nd unit and so on.
The lag between the 1st and 2nd data point is definitely different from the lag between the 15th and 16th data point as the 2nd unit is pulled 2 hours later. :truce:
Rgds
CH
Tim Folkerts 30th March 2007, 10:14 AM Lag refers to how far back in the list of data you are looking. A lag of 1 means you are seeing if you can predict the next value based on the value immediately before it. A lag of 2 means you are trying to predict the next value based on the one two before it, etc.
For example, suppose you were trying to predict the # of people visiting a beach.
I would expect a strong autocorrelation for a lag = 1. If very few people visited the beach one day (in the middle of winter, perhaps), then I would feel pretty confident predicting that the next day would also have low attendance. Any day with a larger than average attendance (most likely in the summer) will most likely be followed by another day with above average attendance.
I would also expect a strong autocorrelation with a lag = 7. If I know how many people visited last Saturday, I have a pretty good guess how many will visit this Saturday.
Finally, I would also expect a strong autocorrelation with a lag of 365. If I know the attendance on July 14 last year, I could make an educated guess about the attendance on July 14 this year. None of these predictions will be perfect, but they will be much better than just picking a random number. If you took all the attendance figures from the previous year, threw them in a bowl and picked them at random to make your predictions, you would do a whole lot worse than assuming that the attendance will be the same as 1 day ago, or 7 days ago, or 365 days ago.
Note that there can be negative correlations, too. If you have a row of people lined up male-female-male ..., then if one person is taller than average, I would predict that the next person is shorter than average.
For a truly random process, knowledge of the previous measurement will not help at all to predict the next value, so no autocorrelations should be significant. If I flip a coin 5 times and it comes up heads all five times, that knowledge does nothing to help me predict the next flip.
For your wires, if you know the properties of one wire, then you will have a good guess about the next wire, so a correlation with a lag of 1 (or 2 or 3 or 4) may well be highly significant. This prediction will fail when you go from wire 15 (or 30, or 45... ) to wire 16 (or 31 ...), but 14 out of 15 times it will be a pretty reasonable prediction.
If you rearrange the data randomly, then the autocorrelation should disappear, because there is now no way to guess about the next value (at least nothing better than drawing numbers from a bowl).
Tim F
P.S. As I understand it, the lag doesn't specifically depend on time, just on the order within the list. You are comparing each item to the one before, whether the time between the measurements was 7 minutes or 6 hr or 1 year. Even if the first 15 were measured at about the same time and the next one was 2 hr later, the calculations treat them as being "1 unit" apart, because they are listed consecutively in the data.
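Here is a small sketch of the points above: the lag is just an offset in the list, wires from the same unit make the lag 1 autocorrelation large, and shuffling the list wipes it out. The data are simulated (200 hypothetical units of 15 wires, each unit with its own offset), and the band of +/- 1.96/sqrt(n) is the simple white-noise approximation, which may not match the bands a given package draws.

```python
import math
import random


def acf(values, lag):
    """Sample autocorrelation at the given lag (an offset in list position)."""
    n = len(values)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(values) - 1")
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    numer = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag))
    return numer / denom


rng = random.Random(0)  # fixed seed so the example is repeatable

# Simulated wire pulls: each unit has its own offset, plus wire-to-wire noise.
data = []
for _ in range(200):
    unit_offset = rng.gauss(0.0, 1.0)
    data.extend(unit_offset + rng.gauss(0.0, 0.5) for _ in range(15))

band = 1.96 / math.sqrt(len(data))   # approximate 95% band for a random series
r1_ordered = acf(data, 1)

shuffled = data[:]
rng.shuffle(shuffled)                # destroys the ordering, keeps the values
r1_shuffled = acf(shuffled, 1)

print(round(band, 3), round(r1_ordered, 3), round(r1_shuffled, 3))
```

In the ordered data, 14 out of every 15 neighbouring pairs share a unit, so the lag 1 value sits far outside the band; after shuffling it should fall back to roughly zero.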
jeffrey_Chang 3rd April 2007, 12:32 AM Hi Jeffrey,
Do I need to re-arrange the data when performing the autocorrelation study?
Right now, the 1st 15 data points belong to the 1st unit, the next 15 belong to the 2nd unit and so on.
This has already been answered by Tim. You don't rearrange the data, just leave it in its ordered sequence.
P.S. As I understand it, the lag doesn't specifically depend on time, just on the order within the list.
Tim, I agreed.
Lag in the ACF should be in its ordered sequence. The explanation using time series is just an illustration and it could be time based and or due to homogeneous batches etc.
Wchuey,
For typical wirebonding, it would take approx 1 sec for 4 to 10 wires interconnection. At such a fast rate of bonding, I do not foresee a large difference in bond pull between the 15 wires in the same IC. However, I'm not sure what's the rational subgroup that you have devised. Based on the limited information, you averaged the 15 wires bond pull and plot it on a Xbar and S chart every 2 hours. Then you plotted the ACF for individual wirebond pull; i.e. lag 1 is 1st wire, lag 2 is 2nd wire and so on.
A simple solution. Use an X/mR chart by plotting the means of the 15 units. The problem seems to be very tight within batch variability. By switching to a X/mR chart you can evaluate batch to batch variability.
You might want to consider what Statistical Steven has advised, which is to plot the average of the 15 wirebond pulls as a single data point and monitor it using the IMR chart. This is the batch means chart.
Now you have only 15 wires; what happens if you are monitoring high I/O counts (for wirebonding, IO counts can go as high as < 500; i.e. < 500 wire bonds)? Then plotting individual values will be a real challenge :-)
Sorry, I didn't realise until now that you have already done the above. Let me try to get hold of a copy of Minitab and work through the data you posted.
Then calculate and plot the ACF based on the average of the 15 wirebond pull to check for autocorrelation if needed.
One question I have is are you seeing a lot of OC at your 100% final test and or Rel test over time for that wirebonder? If No, then perhaps your CPK is really as what you have mentioned to be > 4 because there is a strong correlation between the final test results and the CPK value.
Otherwise, if yes there's a lot of OC fails, then your CPK values for that wirebonder does not make sense.
thks.
jeffrey.:thanx:
jeffrey_Chang 3rd April 2007, 06:42 AM Hi Jeffrey,
Right now, the 1st 15 data points belong to the 1st unit, the next 15 belong to the 2nd unit and so on.
Rgds
CH
Hi CH,
I'm baffled, please advise. Are the MC6_AM values in your attachment individual wirebond pull readings or the average wirebond pull?
Your post above suggests that the MC6_AM values are individual wirebond pull readings.
So which is which? :confused:
thks.
jeffrey :thanx:
wchuey 3rd April 2007, 10:00 AM Hi Jeffrey,
The data in the Minitab file are the individual wirepull readings.
I am also attaching the control charts & conclusions I have deduced thus far.
To test for autocorrelation, I have set the lag as 1. Rationale for this is that the die and substrate used within a unit is a constant.
Rgds
CH
jeffrey_Chang 12th April 2007, 06:54 AM Hi wchuey,
The homogeneous die and substrate, the fast bonding time for the 15 wires in a unit and the relatively short sampling interval will cause the autocorrelation.
I'm now trying to fit a suitable time series model to remove the autocorrelation structure from the data and then produce a residual plot. I'll get back on this when it is done. Actually, the problem you are facing is a good example of autocorrelation in IC wirebonding for me to show the participants of my company's SPC training.
Would there be an issue if you only sampled 1 unit per lot instead of once every 2 hours? How long will it take to complete an average lot?
thks.
jeffrey. :)
wchuey 13th April 2007, 04:31 AM Would there be an issue if you only sampled 1 unit per lot instead of once every 2 hours? How long will it take to complete an average lot?
thks.
jeffrey. :)
The lot size is not constant and varies from few hundred to few thousand. We are not comfortable with sampling 1 unit per bonder per lot. Based on the UPH, 456 units are produced every 2 hrs.
jeffrey_Chang 17th April 2007, 09:02 AM Hi,
Using Statgraphics, I've tried to create an ARIMA model to remove the autocorrelation. Attached is the ARIMA model selected on Statgraphics' recommendation using AIC. The ARIMA chart plotted with the model now shows only 9 OOC points.
If time allows, I'll try to fit the model and plot the residuals using EWMA. I wonder what the outcome will be.
As I'm pretty new to time series forecasting, please do comment and advise.
thks.
jeffrey.:)
Tim Folkerts 17th April 2007, 11:03 AM The strong autocorrelation at lag = 1 indicates to me that wires next to each other in a bundle tend to be similar. So wire 2 tends to be like wire 1 and wire 3 tends to be like wire 2, etc. Given this trend, it would be expected that wire 3 would be sort of like wire 1, and wire 4 would be sort of like wire 2, etc. Thus a slightly weaker autocorrelation would be expected (and is indeed observed) for lag = 2 than for lag = 1 (and the autocorrelation for lag = 3 would be slightly weaker still).
The strong autocorrelation at lag = 15 indicates that bundle 1 is similar to bundle 2, etc. Using the same logic as above, we would expect (and indeed we see) that bundle 3 is similar to bundle 1, etc - ie the autocorrelation at 30 is significant, but weaker than at 15.
The logic can be extended to other lags. For example, if neighboring wires are similar and neighboring bundles are similar, then wire 1 of bundle 1 should be similar to wire 2 of bundle 2. This explains the significant autocorrelation at lag = 16.
Tim F
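The pattern described above (strong at lag 1, still significant at lags 15, 16 and 30) can be reproduced with a toy simulation. This is only an illustrative model and not a fit to the posted data: each unit's mean is assumed to drift slowly from the previous unit's mean, and the 15 wires in a unit scatter around it. All parameters are invented.

```python
import random


def acf(values, lag):
    """Sample autocorrelation at the given lag (an offset in list position)."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    numer = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag))
    return numer / denom


WIRES_PER_UNIT = 15
rng = random.Random(1)  # fixed seed so the example is repeatable

# Hypothetical process: each unit mean carries over 70% of the previous
# unit's deviation; wires within a unit vary around the unit mean.
data = []
unit_mean = 0.0
for _ in range(400):
    unit_mean = 0.7 * unit_mean + rng.gauss(0.0, 1.0)
    data.extend(unit_mean + rng.gauss(0.0, 0.5) for _ in range(WIRES_PER_UNIT))

r = {lag: acf(data, lag) for lag in (1, 2, 15, 16, 30)}
for lag, value in r.items():
    print(lag, round(value, 2))
```

In this toy model the lag 15 and lag 16 values come almost entirely from neighbouring units being alike, and lag 30 is weaker because it compares units two apart.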
jeffrey_Chang 18th April 2007, 01:09 AM The strong autocorrelation at lag =1 indicates to me that wires next to each other in a bundle tend to be similar. So wire 2 tends to be like wire 1 and wire 3 tends to be like wire 2, etc.
The strong autocorrelation at lag = 15 indicates that bundle 1 is similar to bundle 2, etc.
Tim F
Hi Tim,
You are right, the bonding time for all 15 wires would only take a few seconds. With such a short bonding time and coupled with the homogeneity of the die and substrate in one unit, wire 1 to 15 would definitely be strongly autocorrelated.
I also suspect that the sampling interval of 2 hours between units is too short, which will likely result in autocorrelation as well.
However, I do have one question here: since bundle 1, bundle 2 and subsequent bundles show autocorrelation, is this signalling some form of seasonal pattern?
thks.
jeffrey :thanx:
jeffrey_Chang 27th April 2007, 06:27 AM Hi CH,
I've done a simple writeup on the introduction of autocorrelation. However, it is just some basic information on the topic. There are many other areas that have not been covered esp on how to fit the correct model to the correlated data and on MCEWMA etc. But do enjoy.
Welcome all comments.
thks.
jeffrey.:)
wchuey 9th May 2007, 09:42 AM Hi Jeff,
Thanks for the help! I downloaded a trial version of Statgraphics and followed the step by step instructions and managed to get the same values as you did. I tried to understand the maths behind ARIMA but :nope:
In Statgraphics, under analysis options, there is an option to choose either "data with long-term limits" or "data with one-step limits". Both generate different limits, what is the difference?
Also, on the shopfloor we are using an SPC software where there is no provision for ARIMA analysis. The engineers set the control limits while operators input the data. Is it possible for me to set the control limits for the X-bar chart using control limits generated by ARIMA analysis - data with long-term limits? What then happens to the s chart?
Rgds
CH
jeffrey_Chang 10th May 2007, 11:57 PM CH,
In short, “data with long term limits” monitors the long term behavior of the process. The centerline is at the estimated process mean with control limits at +/- the estimated process sigma.
“Data with one-step limits” has moving control limits that monitor the noise at each time period, given the data up to t-1.
ARIMA analysis models away the autocorrelation within the data; the residuals are then plotted and monitored on any conventional control chart.
Once you have identified the time series model using Minitab or Statgraphics, you can use an Excel spreadsheet or your SPC software to compute the fitted values and residuals from the model equation, and then plot the residuals on the Xbar-S chart in this case.
No, you cannot use the long term limits from the ARIMA chart for your Xbar – S chart.
The most difficult part of the ARIMA analysis is to fit the most appropriate time series model to the correlated data. Note that for your set of data, the model we fitted is not good enough as there is still some autocorrelation that our model has not accounted for, probably due to the seasonal pattern.
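To make the fit, residual and chart sequence concrete, here is a rough sketch that swaps in a plain AR(1) model fitted by least squares in place of a full ARIMA model, and puts individuals-chart limits on the residuals of the unit means. It is not the model selected by Statgraphics; the function names, the simulated unit means and the AR(1) choice are all assumptions.

```python
import random
import statistics


def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] + e[t]; returns (c, phi)."""
    x_prev, x_next = series[:-1], series[1:]
    mean_prev = statistics.mean(x_prev)
    mean_next = statistics.mean(x_next)
    sxx = sum((a - mean_prev) ** 2 for a in x_prev)
    sxy = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    phi = sxy / sxx
    return mean_next - phi * mean_prev, phi


def ar1_residuals(series, c, phi):
    """One-step-ahead residuals: actual value minus fitted value."""
    return [b - (c + phi * a) for a, b in zip(series, series[1:])]


def individuals_limits(values):
    """(LCL, centre, UCL) from the average moving range of span 2."""
    center = statistics.mean(values)
    mr_bar = statistics.mean([abs(b - a) for a, b in zip(values, values[1:])])
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar


# Simulated unit means with carry-over from one unit to the next.
rng = random.Random(2)
unit_means = [12.0]
for _ in range(199):
    unit_means.append(12.0 + 0.6 * (unit_means[-1] - 12.0) + rng.gauss(0.0, 0.3))

c, phi = fit_ar1(unit_means)
residuals = ar1_residuals(unit_means, c, phi)
lcl, center, ucl = individuals_limits(residuals)
out_of_control = [i for i, e in enumerate(residuals) if e < lcl or e > ucl]
print(round(c, 3), round(phi, 3), round(lcl, 3), round(ucl, 3), len(out_of_control))
```

On a shop-floor system without time series support, the same arithmetic (fitted value from the previous point, residual, fixed limits on the residuals) could be done in a spreadsheet once `c` and `phi` are known.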
Thks.
Jeffrey.:thanks: