We have a separation process with two output streams. One stream is the final product (Slurry) and the other is a stream containing impurities. I have done two things, (A) and (B):
(A) Correlation
I correlated the quality parameter of the final product with the impure stream's parameters using MiniTab (Stat -> Basic Statistics -> Correlation). Some parameters came out with significant P-values and some with insignificant P-values, as expected.
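For anyone who does not use MiniTab, the correlation step is roughly equivalent to the Python sketch below. The data and the names `impurity`, `unrelated` and `correlation_table` are made up for illustration, and `scipy.stats.pearsonr` is only my assumed stand-in for MiniTab's correlation command.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical data: one impure-stream parameter that is related to the
# product quality parameter, and one that is not.
rng = np.random.default_rng(0)
n = 50
impurity = rng.normal(size=n)
unrelated = rng.normal(size=n)
product_quality = 0.6 * impurity + rng.normal(size=n)

def correlation_table(target, candidates):
    """Return {name: (Pearson r, two-sided p-value)} against the target."""
    table = {}
    for name, values in candidates.items():
        r, p = pearsonr(target, values)
        table[name] = (float(r), float(p))
    return table

table = correlation_table(
    product_quality, {"impurity": impurity, "unrelated": unrelated}
)
for name, (r, p) in table.items():
    print(f"{name}: r = {r:.3f}, p = {p:.4f}")
```

Each row is a pairwise test: it looks at one parameter at a time and ignores the others.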
(B) Regression
Next, I regressed a parameter of the impure stream (say Y) on other parameters of the final product (say x1, x2, ..., xn). I used the following strategy:
I) Run the regression (Stat -> Regression).
II) Check whether MiniTab flags any points as not following the regression equation.
III) Remove those points, i.e. the points not following the regression equation.
IV) Go to (I).
The result is a P-value of 0.000 for the equation, with all remaining points conforming to the equation.
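To make steps I) to IV) concrete, here is a minimal Python sketch of the loop as I understand it. It is not what MiniTab does internally: the cutoff of 2 on the standardized residual is my assumption for "points not following the regression equation", and the function names and demo data are hypothetical.

```python
import numpy as np

def standardized_residuals(X, y):
    """Fit least squares with an intercept; return standardized residuals."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    n, k = A.shape
    # Leverage = diagonal of the hat matrix A (A'A)^-1 A'
    leverage = np.einsum("ij,ji->i", A, np.linalg.pinv(A))
    s2 = resid @ resid / (n - k)
    return resid / np.sqrt(s2 * (1.0 - leverage))

def trim_until_clean(X, y, cutoff=2.0, max_rounds=50):
    """Repeat: fit, drop points with |standardized residual| > cutoff."""
    keep = np.ones(len(y), dtype=bool)
    min_points = X.shape[1] + 3  # stop before the fit runs out of data
    for _ in range(max_rounds):
        if keep.sum() <= min_points:
            break
        z = standardized_residuals(X[keep], y[keep])  # steps I and II
        flagged = np.abs(z) > cutoff
        if not flagged.any():
            break
        keep[np.flatnonzero(keep)[flagged]] = False   # step III, then IV
    return keep

# Hypothetical demo data: Y is pure noise, unrelated to x1..x3.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 3))
y = rng.normal(size=60)
keep = trim_until_clean(X, y)
print(f"points kept: {keep.sum()} of {len(y)}")
```

The demo data is pure noise on purpose, so the printout shows how many points the loop discards even when there is no real relationship between Y and the x's.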
But surprisingly, a few of the parameters that showed no correlation in Stat -> Correlation, i.e. in (A), now have significant P-values after following the regression steps in (B).
Can anyone answer the following?
1. Am I doing the regression the right way? If not, please tell me where I am going wrong and how to improve.
2. Should I consider the final P-value of a parameter to be significant after regression even if Stat -> Correlation does not identify it as significant initially?
3. Does regression on a parameter involve a normality condition for Y in Y = f(x)?
I would be highly obliged.