Advice needed: multi‑response optimization with poor model fit

Rowab

Hi all,

Minitab and DoE novice here.

I'm working on an electrochemical system and trying to optimize two outputs simultaneously:

1. Increase signal suppression, SS (%)
2. Decrease the peak current (ipc)

I started with a 2^4 factorial design. For SS, the initial model looked overfit, so I reduced it hierarchically, which improved things. The reduced model suggested curvature, so I followed up with a CCD using only the significant factors. However, the CCD gave a poor model summary and showed no significant factors:
      S    R-sq   R-sq(adj)   R-sq(pred)
12.5364  27.65%       0.00%        0.00%
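(For anyone unfamiliar with these columns: R-sq(pred) is based on the leave-one-out PRESS statistic, and Minitab clamps it to 0% whenever the model predicts held-out points worse than just using the response mean. A quick sketch with invented data, not the data from this experiment:)

```python
import numpy as np

# Leave-one-out PRESS and R-sq(pred), clamped at 0 the way Minitab reports it.
# All data below are invented, purely to illustrate the computation.
def r_sq_pred(X, y):
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        coef, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        press += (y[i] - X[i] @ coef) ** 2   # squared error at the held-out run
    ss_tot = np.sum((y - y.mean()) ** 2)
    return max(0.0, 1.0 - press / ss_tot)    # clamp: worse-than-mean -> 0%

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 12)
X = np.column_stack([np.ones_like(x), x])    # intercept + one coded factor
noisy_y = rng.normal(0, 1, 12)               # pure noise response
print(round(r_sq_pred(X, noisy_y), 3))       # typically clamps at or near 0
```

A model with real predictive value (e.g. `y = 2 + 3x`) returns an R-sq(pred) near 1; a pure-noise response usually clamps to 0, which is exactly the pattern in the summary above.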

Interestingly, my system seems to be quite variable (there is probably some complex chemistry going on that I'm not aware of), and the range of the response is smaller in the CCD than in the initial factorial. ipc, on the other hand, fits a linear model reasonably well.

I would very much appreciate any advice on next steps or best practice for this kind of outcome, as I'm unsure of how to make practical use of the optimization results in this case.

Happy to share any extra details if that may help.

Many thanks in advance!
 
I think I need more explanation before I can offer helpful comments. @Rowab identified two outputs to optimize: SS and ipc. What are your inputs? Which has 4 levels and which has 2, and what are those levels (to give us a sense of scale)? How many runs? What is your ideal function? Is signal suppression the function you seek to optimize? Can you copy and paste your P-diagram?

As an admitted DOE novice, usually one picks one output to optimize and treats the other, secondary output as a constraint. How do you measure the peak current that may occur?
A Central Composite Design is typically not tackled by DOE novices. How did you conclude that your initial model was overfit?

The D in DOE refers to Design. That means upfront consideration of alternatives and tradeoffs, and thoughtfully making experimental design decisions to best capitalize on known science and circumstances. The statistical significance may not be valid if you take someone else's data and drop it into multiple iterations of Minitab, fishing for a combination that looks good.

I'd advise measuring all quantities in fundamental engineering units. Electrical current is a fundamental engineering quantity, measured in amperes (or milliamps). A ratio (e.g., percentage SS) is the division of two other quantities: if the numerator increases, the ratio increases; if the denominator decreases, the ratio also increases. When you gather all your data, you may confound increases and decreases, which muddies the waters in what you hope to understand. Better to measure the signal in engineering units (perhaps millivolts) while holding the input signal constant, or else make preselected input signal levels an input factor.
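The confounding John describes is easy to demonstrate with made-up numbers: the same ratio change can come from the numerator rising or from the denominator falling, and the ratio alone cannot tell you which physical change occurred.

```python
# Invented numbers, purely to illustrate the numerator/denominator confound.
num, den = 50.0, 100.0
baseline_ratio = num / den                    # 50/100  -> 0.5

ratio_numerator_up   = (num + 10.0) / den     # 60/100   -> 0.6 (signal rose)
ratio_denominator_dn = num / (den - 100.0/6)  # 50/83.33 -> 0.6 (input fell)

# Two physically different events, one indistinguishable ratio:
print(ratio_numerator_up, ratio_denominator_dn)
```

If SS (%) were the only recorded response, both scenarios would land on the same row of the worksheet, which is exactly why raw signals in engineering units are safer.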
 
@John Predmore 2^4 means 4 input variables at 2 levels each. It also means that this is a full factorial with 16 total runs. Minitab's Response Optimizer theoretically allows you to simultaneously optimize multiple variables, but my experience has been that it is highly sensitive to the quality of the model and the optimization parameters used.
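For readers who haven't looked under the hood, the Response Optimizer works on Derringer-Suich-style desirability functions. A minimal sketch of the idea, with invented targets and predicted responses (none of these numbers come from @Rowab's data):

```python
# Derringer-Suich desirability sketch; all targets/limits/predictions invented.
def d_maximize(y, low, high):
    """Desirability for a larger-is-better response (e.g. SS)."""
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return (y - low) / (high - low)

def d_minimize(y, low, high):
    """Desirability for a smaller-is-better response (e.g. ipc)."""
    if y <= low:
        return 1.0
    if y >= high:
        return 0.0
    return (high - y) / (high - low)

# Suppose the fitted models predict SS = 72% and ipc = 3.1 at some setting:
d_ss  = d_maximize(72.0, low=50.0, high=90.0)   # 0.55
d_ipc = d_minimize(3.1, low=1.0, high=5.0)      # 0.475
composite = (d_ss * d_ipc) ** 0.5               # geometric mean, ~0.51
```

Because the composite is a geometric mean, one unacceptable response drives the whole thing to zero, which is why a poor-fitting SS model drags down the entire multi-response optimization regardless of how well ipc fits.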

@Rowab My suspicion is that you have either a lurking variable, an overlooked interaction, or a confounding noise variable causing your contradictory results. I also agree with John's recommendation to stick with fundamental units instead of ratios whenever possible. Ratios are sensitive to variation in both the numerator and the denominator. One last recommendation: assess the repeatability and reproducibility (GRR) of your measurement device. An inadequate measurement device has been the downfall of many experiments.
 
@Rowab When you ran your CCD, did you add axial points to the original 2^4 design, and, most importantly, did you use center points in both experiments? If you did use center points in both, you should now have a block term in the model. If this block term is significant, it means that something changed between the two experiments. If you ran the entire CCD as a separate experiment, try combining the two data sets as two different blocks and analyzing them with GLM or regression analysis.
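The pooled-blocks analysis Miner suggests amounts to adding a 0/1 block indicator to the design matrix and fitting by ordinary least squares. A sketch on synthetic data with a deliberately planted block shift (all numbers invented; one coded factor shown for brevity):

```python
import numpy as np

# Synthetic pooled data set: 20 "factorial" runs and 20 "CCD" runs with a
# known +5.0 shift between them, to show how a block term recovers it.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=40)            # one coded factor
block = np.repeat([0.0, 1.0], 20)          # 0 = factorial runs, 1 = CCD runs
y = 10.0 + 2.0 * x + 5.0 * block + rng.normal(0, 0.1, size=40)

# Design matrix: intercept, factor, block dummy
X = np.column_stack([np.ones_like(x), x, block])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope, block_effect = coef
print(f"estimated block effect: {block_effect:.2f}")   # near the true +5.0
```

A large, significant block effect like this would say something drifted between the two experiments, which could also explain why the response range shrank in the CCD.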
 
@Miner and @John Predmore both have excellent points.

You should always perform your MSA first.
Also, never use ratios, CVs, or any other type of transformed data; use the raw response values.
Learn how to GRAPH your raw data (the individual response data points) in a variability chart and LOOK at it before looking at any statistical output summaries.
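A variability chart is just every raw point plotted by factor setting; even a bare-bones text version makes shifts, spread, and outliers jump out before any model is fit. A sketch with invented response values (a dedicated tool like Minitab's variability chart does this properly):

```python
# Bare-bones text "variability chart": one '*' per raw data point,
# positioned by value, grouped by factor setting.  Data invented.
def variability_chart(runs, width=40):
    """Return one text line per raw point, grouped by factor setting."""
    all_vals = [v for vals in runs.values() for v in vals]
    lo, hi = min(all_vals), max(all_vals)
    lines = []
    for group, vals in runs.items():
        for v in sorted(vals):
            pos = int((v - lo) / (hi - lo) * (width - 1))
            lines.append(f"{group:>7} |{' ' * pos}* ({v})")
    return lines

runs = {"A-low":  [61.2, 63.5, 59.8, 62.1],
        "A-high": [74.0, 70.3, 88.9, 72.5]}   # 88.9 stands out immediately
print("\n".join(variability_chart(runs)))
```

One glance at the raw points would reveal the kind of run-to-run variability @Rowab describes, long before a model summary table does.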

I work for a global leader in veterinary diagnostic devices, covering everything from single-use rapid tests to complex analyzers, and I know the above things are absolutely true.

I had a ‘student’ once: a very intelligent and well-respected scientist who was tasked with solving a long-term problem of excess variation in a blood characteristic measured by a complex hematology instrument. After a year of performing every experimental structure he could find, I finally got to the root of the problem (when he finally listened to me). I asked to see his MSA. Yeah. Hadn’t done one. We constructed a study and, lo and behold, his problem was that the measurement of the characteristic by the instrument itself was the source of the variability. After that it was a simple downhill slide to the causal mechanism and resolution of the problem.


DoE training, and especially statistical software, often sells itself (or is interpreted as such, given how the training is usually done) as a shortcut to knowledge. It isn’t. While it is important to understand the various experimental structures and how to crunch them in statistical software, it is more important to understand that there is an iterative process to knowledge, one that often requires several types of study designs to fully understand a system or narrow it down to 1-3 controlling factors.
 