Ridge Regression and PLS Regression
Agnar Höskuldsson

Abstract
A brief review of Ridge Regression (RR) and PLS Regression (PLS) is presented. Process data and spectral data are used in the analysis; both are low-rank, as is common in chemometric work. The ridge constant k is determined by minimizing the size of the residuals in leave-one-out RR. When RR and PLS are applied to the data, we find that RR gives almost the same fit and cross-validation as PLS. However, in applications to test sets the picture is unclear; the results depend both on the test sets and on how the analysis is carried out. When comparing RR to PLS, we use the average of 20 cross-validations. Efficient variable deletion/selection procedures based on average cross-validation are presented. When RR and PLS are applied, we find only a very small and insignificant difference between them. It is shown that RR amounts to adding small ‘noise’ values to X. The Ordinary Least Squares (OLS) solution for the modified X gives the same solution (regression coefficients) as RR. The theory of OLS shows that the RR estimates of the variance of the regression coefficients are too small. This means that the theory of RR cannot be applied to, for instance, the analysis of parameter estimates. RR can be carried out by the same algorithm as PLS. This shows that the graphical methods popular in chemometrics can also be used with RR. The algorithm can also be used to estimate the appropriate dimension for RR, in a similar way as for PLS. In conclusion, we do not find a significant difference between RR and PLS when they are applied to the two types of data. However, the theory of RR does not apply to the analysis of parameter estimates.
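
Two of the statements above can be illustrated with a small numerical sketch. It is not the paper's own construction: the equivalence between RR and OLS is shown here with the standard textbook device of appending sqrt(k) times the identity matrix as extra rows of X (and zeros to y), and the leave-one-out residuals are obtained from the hat-matrix shortcut, assuming centered data that are not re-centered in each fold. The function names, the simulated low-rank data, and the grid of k values are all assumptions made for illustration.

```python
import numpy as np

def ridge_coefficients(X, y, k):
    """Closed-form ridge solution b = (X'X + k I)^(-1) X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

def ridge_via_augmented_ols(X, y, k):
    """OLS on a modified data set: X gets sqrt(k)*I appended, y gets zeros."""
    p = X.shape[1]
    X_aug = np.vstack([X, np.sqrt(k) * np.eye(p)])
    y_aug = np.concatenate([y, np.zeros(p)])
    b, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return b

def loo_press(X, y, k):
    """Sum of squared leave-one-out residuals for ridge with fixed k.

    Uses e_i / (1 - h_ii), where h_ii is the diagonal of the ridge hat matrix.
    """
    p = X.shape[1]
    H = X @ np.linalg.solve(X.T @ X + k * np.eye(p), X.T)
    residuals = y - H @ y
    loo_residuals = residuals / (1.0 - np.diag(H))
    return float(loo_residuals @ loo_residuals)

# Simulated low-rank data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n, p, rank = 30, 10, 3
scores = rng.normal(size=(n, rank))
loadings = rng.normal(size=(rank, p))
X = scores @ loadings + 0.01 * rng.normal(size=(n, p))
y = scores @ rng.normal(size=rank) + 0.1 * rng.normal(size=n)

# Pick k from an assumed grid by minimizing the leave-one-out criterion.
k_grid = np.logspace(-4, 2, 25)
press_values = [loo_press(X, y, k) for k in k_grid]
best_k = k_grid[int(np.argmin(press_values))]

b_ridge = ridge_coefficients(X, y, best_k)
b_ols_modified = ridge_via_augmented_ols(X, y, best_k)
print("selected k:", best_k)
print("max coefficient difference:", np.max(np.abs(b_ridge - b_ols_modified)))
```

The two coefficient vectors agree to numerical precision, which is the sense in which an OLS fit on a modified X reproduces the RR solution. The OLS variance formulas applied to the modified data are then the natural reference point for judging the RR variance estimates.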

DOI: 10.15640/arms.v11n1a1