Algorithms for partial least squares (PLS) modelling are placed into a sound theoretical context focusing on numerical precision and computational efficiency. NIPALS and other PLS algorithms that perform deflation steps of the predictors (X) may be slow or even computationally infeasible for sparse and/or large-scale data sets. As alternatives, we develop new versions of the Bidiag1 and Bidiag2 algorithms. These include full reorthogonalization of both score and loading vectors, which we consider to be both necessary and sufficient for numerical precision. Using a collection of benchmark data sets, these 2 new algorithms are compared to the NIPALS PLS and 4 other PLS algorithms acknowledged in the chemometrics literature. The provably stable Householder algorithm for PLS regression is taken as the reference method for numerical precision. Our conclusion is that our new Bidiag1 and Bidiag2 algorithms are themethods of choice for problems where both efficiency and numerical precision are important. When efficiency is not urgent, the NIPALS PLS and the Householder PLS are also good choices. The benchmark study shows that SIMPLS gives poor numerical precision even for a small number of factors. Further, the nonorthogonal scores PLS, direct scores PLS, and the improved kernel PLS are demonstrated to be numerically less stable than the best algorithms. PrototypeMATLAB codes are included for the 5 PLS algorithms concluded to be numerically stable on our benchmark data sets. Other aspects of PLS modelling, such as the evaluation of the regression coefficients, are also analyzed using techniques from numerical linear algebra.
Funding Agencies|Research Council of Norway [239070]