Correlation between 2 Variables - Potential X's that might affect yield percentage

manojgeorgethomas

Hi,
I am a Green Belt currently working on a project to improve the yield of a process that fills medical vials. I have collected yield data for the past year and am now trying to figure out which potential X's might affect the yield percentage. I have data on:
1) Density of powder being filled (microns, continuous)
2) Supplier (discrete)
3) Campaign Number (discrete)
4) Line Number (discrete)

What should my next step be to analyse the data, and what tools should I be using?

Any help is appreciated.
Rgds,
Manoj
 
AdamP

Hello Manoj!

I think I understand your question, but I may need to make some assumptions and ask some questions as well to help you most effectively.

First, I'll assume your GB project is following the DMAIC flow and since you're asking about correlation between 2 variables, you are most likely nearing the end of Measure or perhaps working on Analyze at this point. Hopefully not still in Define based on what you're asking. :)

If I'm close, you should have asked and answered a few questions by now - 'answered' via tool use. Specifically, you might have a detailed "as is" process flow map so you can "see" how these vials actually get filled. Given the whole 'Y = f(x)' concept, I assume yield is your primary metric and therefore the "big Y". A very common Measure phase exercise is to assess cause and effect via a cause and effect diagram (aka Ishikawa or fishbone diagram). By asking your team "Why do we see variable yield?", you should be able to populate (or may already have populated) a C&E diagram with many potential causes (x's).

Since you are already collecting data on several, look to see if any of these show up as likely root causes on your C&E chart. You should be including subject matter experts - people who run this process and have a fair amount of knowledge to complete the C&E.

The point here is that based on the way you've asked your question, we have no way of knowing if you have used any process-based tools to populate the list of probable root causes - hopefully you have and the list of data elements is not just what you could collect easily.

Once you are confident that you have the most likely subset of X's, you can run simple scatter plots (aka x-y charts) to see graphically whether any apparent correlation exists between your X's and the Y. I would also suggest that you get help from a BB or MBB to determine whether the X's are correlated with each other, as that can pose a problem for your subsequent analysis.

I hope you follow up with this - happy to help you work it.

Cheers,

Adam
 
Tim Folkerts

Trusted Information Resource

I always advocate a visual approach to start. I would plot graphs of yield as a function of each of the variables to see what effect they might have. Many times you can see something that is not immediately obvious from statistical tests, like a few poor data points or a relationship that is quadratic rather than linear.

If you have other data handy, you might make plots for those too. For instance, if the time of day or the day of the week is recorded, you might try plotting that as well - it only takes about 20 seconds to generate a graph if the data is in Excel, and it might tell you something interesting.

Once you have taken a quick look at the data, ANOVA would be a good place to start for the discrete variables (or a simple t-test if there are only two categories). For the continuous variable, a regression analysis would be appropriate.
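
As a rough sketch of what the one-way ANOVA computes for a discrete variable such as supplier, the F statistic can be worked out by hand in Python. The supplier names and yield values are invented for illustration, and `one_way_anova_f` is a helper written for this example, not a library function.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Made-up yield percentages per supplier, NOT data from this thread.
yield_by_supplier = {
    "Supplier A": [95.0, 96.0, 97.0],
    "Supplier B": [92.0, 93.0, 94.0],
    "Supplier C": [95.0, 95.0, 98.0],
}

f_stat, df_b, df_w = one_way_anova_f(list(yield_by_supplier.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The F value still has to be compared against the F distribution to get a p-value. A statistics package will do that for you, for example `scipy.stats.f_oneway` if SciPy is available.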


Tim F
 
manojgeorgethomas

Thanks for the responses.

The vials are filled under sterile conditions. Two fill quantities are currently run (1 g and 2 g), and the process is much the same for both. Visually, the losses seem to happen at multiple stages of the process:
A) There is loss when the powder is dumped into the machine.
B) Vials are lost during the manufacturing process (damage, crimping issues).
C) We lose a small percentage at visual inspection.

I am in the process of collecting more data on these. In the meantime, since I already have some data, I was trying to ascertain whether some of the losses could be due to:
1) The density of the powder
2) The supplier of the material
3) The quantity being filled in the vial
4) The campaigns being run

Hence I wanted help in figuring out which tools I could use to identify potential issues for each of these four factors, or a combination of them.

Rgds,
Manoj
 

Miner

Forum Moderator
Leader
Admin

I agree with Tim. Start out with exploratory data analysis. There are numerous approaches. I recommend trying a multi-vari chart or box plots for the discrete variables and a scatter plot for the continuous variable, but there are other approaches that work equally well.
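
If no charting tool is at hand, a quick text stand-in for the box plot is to group yield by a discrete variable and compare the spread of each group. This is only a sketch: the line names and values are invented, and a real box plot would also show the quartiles.

```python
from collections import defaultdict
from statistics import median

# Made-up (line, yield %) records, NOT data from this thread.
records = [
    ("Line 1", 94.0), ("Line 1", 95.0), ("Line 1", 96.0),
    ("Line 2", 90.0), ("Line 2", 93.0), ("Line 2", 97.0),
]

# Group the yield values by line number.
by_line = defaultdict(list)
for line, yield_pct in records:
    by_line[line].append(yield_pct)

# Minimum, median and maximum per group: the backbone of a box plot.
summary = {
    line: (min(values), median(values), max(values))
    for line, values in sorted(by_line.items())
}

for line, (low, mid, high) in summary.items():
    print(f"{line}: min={low} median={mid} max={high} range={high - low}")
```

Differences in the medians hint at a shift between groups, and differences in the ranges hint at one group being less consistent than another.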

If you visually detect a potentially significant factor, you may verify it using tools such as a t-Test, ANOVA, ANOM or DOE to name a few for the discrete variables. The continuous variable may be evaluated using correlation and regression.
 
AdamP

manojgeorgethomas said:
Visually the losses seem to happen at multiple stages during the process ..
A) There is loss when the powder is dumped into the machine
B) Vials are lost as part of the manufacturing process..damages/crimping issues
C) As part of visual inspection we lose a small percentage.

Based on this observation, your situation sounds less like a "What statistical correlation test do I use?" issue and more like an opportunity to pay close attention to the sub-processes you've noted above. Keep in mind, we gather data via process observation, not by pulling a report from SAP. :)

If you visually see losses due to dumping raw powder into the machine, then you should intimately understand that part of the process to determine which input variables are controllable - but apparently not being controlled now.

If you see losses due to crimping, that's likely an issue with the set-up and feeding process, so again, have you fully studied that sub-process to identify via root cause analysis tools (cause & effect again) which are the likely drivers there?

If you lose a small percentage due to a visual inspection, then your inspection process is open for study.

Study your process - running the stats will only confirm your observations once you know what data to collect and use.

Cheers!
 