**Re: validation of predictive models**
They ran a correlation analysis and from that arrived at the regression models. Actually, we did a lot of iterations, because in the first model there were certain factors that the model said would have a positive impact, which violates human logic (it should be negative, not positive). The most acceptable model is the one we are currently using.

There are some assumptions that have to be made when using regression; if the assumptions are not met, the model can lead to weird conclusions. One of the most common mistakes is to treat the model as linear when the data isn't. If you think a factor defies logic, it could be noise (not significant for the model), and almost certainly its coefficient is a value near 0.

If you are worried that the effect of a factor is not being taken into account the way it should, you can chart the factor against the difference between the real and the forecasted value (the residual). If some effect was not captured, you will see a trend (a curve) in that chart. IMHO graphical methods show things that the numbers can't: you may have a nonlinear trend but a low R squared because of the lack of linearity.
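To make the residual check concrete, here is a minimal sketch using only the Python standard library. The data is synthetic and purely an assumption for illustration (a yield that depends on the square of a factor), and the helper names `fit_line`, `r_squared`, and `binned_means` are hypothetical. It fits a straight line, then averages the residuals over low, middle, and high values of the factor, which is a numeric stand-in for eyeballing the chart.

```python
import statistics

def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def r_squared(y, res):
    """R squared computed from the actual values and the residuals."""
    my = statistics.fmean(y)
    ss_tot = sum((yi - my) ** 2 for yi in y)
    ss_res = sum(r ** 2 for r in res)
    return 1 - ss_res / ss_tot

def binned_means(x, res, n_bins=3):
    """Mean residual within equal-count bins of the factor (sorted by x)."""
    pairs = sorted(zip(x, res))
    size = len(pairs) // n_bins
    means = []
    for i in range(n_bins):
        # The last bin absorbs any leftover points.
        end = (i + 1) * size if i < n_bins - 1 else len(pairs)
        means.append(statistics.fmean(r for _, r in pairs[i * size:end]))
    return means

# Synthetic example (assumption): the true relationship is a curve.
factor = [i / 10 for i in range(-30, 31)]
actual = [xi ** 2 for xi in factor]

a, b = fit_line(factor, actual)
resid = [yi - (a + b * xi) for xi, yi in zip(factor, actual)]
r2 = r_squared(actual, resid)
trend = binned_means(factor, resid)

print("R squared of the straight-line fit:", r2)
print("mean residual (low, mid, high factor):", trend)
```

In this made-up case the straight line explains essentially nothing, yet the mean residual is positive at both ends and negative in the middle. That is the curve you would see in the chart, and it says the factor matters but not linearly.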

From among the 50 factors, we need the model to tell us which ones really have the biggest impact on the yield, so we can focus our action plans on how to control those factors.

As you may know, the bigger the absolute value of a factor's coefficient, the more impact it will have on yield (as long as the factors are on comparable scales), but the relationship will change if the variables that cause the variation are controlled. You can have a strong relationship between one variable and another, but if the variable is controlled the observed impact will be negligible. It's like looking at a line from a micron away: you will only see the noise (if you are looking at the center of the line, only black).
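A small simulation can illustrate the point about controlled variables. This is only a sketch under assumed numbers: the true slope of 2, the noise level, the two ranges, and the helper names `pearson` and `simulate` are all invented for the example. The same underlying relationship is sampled twice, once with the factor free to vary over a wide range and once with it held in a tight band.

```python
import random
import statistics

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5

def simulate(lo, hi, n=500, seed=0):
    """Hypothetical process: y = 2*x + noise, with x drawn uniformly from [lo, hi]."""
    rng = random.Random(seed)
    x = [rng.uniform(lo, hi) for _ in range(n)]
    y = [2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
    return x, y

r_free = pearson(*simulate(0.0, 10.0))        # factor varies freely
r_controlled = pearson(*simulate(4.9, 5.1))   # factor tightly controlled

print("correlation, factor free to vary:", round(r_free, 3))
print("correlation, factor controlled:  ", round(r_controlled, 3))
```

The true effect is identical in both runs, but with the factor held in a narrow band the noise swamps it, so a model fitted on that data would rank the factor as unimportant.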

You ask about using R squared to determine which model is right. If you are comparing one model's results against another's, IMHO it is better to use the error measures used in forecasting to determine which gives better results (RMSE, MAPE, MdAPE, GMRAE, MdRAE). IMHO the Median Relative Absolute Error (MdRAE) is well protected against outliers.
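For reference, a rough sketch of those measures in plain Python. The relative measures (MdRAE, GMRAE) compare each absolute error with the error of a benchmark forecast; here I assume a mean-only baseline as the benchmark, which is my choice for the example and not something from the thread. The numbers are made up, and GMRAE as written needs every error to be nonzero.

```python
import math
import statistics

def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(statistics.fmean((a - f) ** 2 for a, f in zip(actual, forecast)))

def ape(actual, forecast):
    """Absolute percentage errors (actual values must be nonzero)."""
    return [abs(a - f) / abs(a) * 100 for a, f in zip(actual, forecast)]

def mape(actual, forecast):
    return statistics.fmean(ape(actual, forecast))

def mdape(actual, forecast):
    return statistics.median(ape(actual, forecast))

def rae(actual, forecast, benchmark):
    """Absolute error of the model relative to the benchmark's absolute error."""
    return [abs(a - f) / abs(a - b) for a, f, b in zip(actual, forecast, benchmark)]

def mdrae(actual, forecast, benchmark):
    return statistics.median(rae(actual, forecast, benchmark))

def gmrae(actual, forecast, benchmark):
    """Geometric mean of the relative absolute errors (all must be > 0)."""
    return math.exp(statistics.fmean(math.log(r) for r in rae(actual, forecast, benchmark)))

# Made-up yields and two forecasts; `outlier` differs from `clean` in one point only.
actual = [100, 103, 98, 105, 110, 95, 103]
clean = [101, 102, 99, 104, 108, 96, 104]
outlier = [101, 140, 99, 104, 108, 96, 104]
baseline = [statistics.fmean(actual)] * len(actual)  # assumed benchmark: mean-only model

for name, fc in (("clean", clean), ("outlier", outlier)):
    print(name,
          "RMSE", round(rmse(actual, fc), 3),
          "MAPE", round(mape(actual, fc), 3),
          "MdAPE", round(mdape(actual, fc), 3),
          "GMRAE", round(gmrae(actual, fc, baseline), 3),
          "MdRAE", round(mdrae(actual, fc, baseline), 3))
```

In this toy data a single bad point inflates RMSE by more than a factor of ten while MdRAE does not move at all, which is what I mean by it being protected against outliers. A value below 1 means the model beats the benchmark on the typical point.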