Ensemble partial least squares regression for descriptor selection, outlier detection, applicability domain assessment, and ensemble modeling in QSAR/QSPR modeling

EarlyView Article

  • Published: Jul 18, 2017
  • Authors: Dong‐Sheng Cao, Zhen‐Ke Deng, Min‐Feng Zhu, Zhi‐Jiang Yao, Jie Dong, Rui‐Gang Zhao

In QSAR/QSPR modeling, building an accurate partial least squares (PLS) model usually requires addressing descriptor selection, outlier detection, applicability domain assessment, nonlinear relationships, and model stability. In the present study, we present an ensemble PLS (EnPLS) method for solving these modeling tasks within a unified methodological framework. EnPLS aims to provide a consistent algorithmic framework based on ensemble learning and statistical distributions. The approach exploits the fact that the distribution of PLS model coefficients provides a mechanism for ranking and interpreting the effects of variables, whereas the distribution of prediction errors provides a mechanism for differentiating outliers from normal samples and for assessing the applicability domain of models. The statistics of these distributions, namely the mean or median and the standard deviation, provide a feasible way to describe the information contained in the original samples. Furthermore, ensemble modeling and prediction based on several cross‐predictive PLS models can improve prediction performance and increase model stability to a certain extent. Aqueous solubility data are used to demonstrate the ability of the proposed EnPLS method to handle modeling tasks such as descriptor selection, outlier detection, applicability domain assessment, performance improvement, and model stability. Finally, a freely available R package implementing EnPLS has been developed to facilitate its use by chemists and pharmacologists. The R package is freely available at https://github.com/wind22zhu/enpls1.2.
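
The idea of reading descriptor importance from the coefficient distribution and outliers from the prediction-error distribution can be illustrated with a minimal sketch. It is not the API of the enpls R package. The function names `pls1_one_component` and `ensemble_pls` are hypothetical, and two simplifications are assumed: each sub-model is a one-component PLS1 fit, and the sub-models are built by random subsampling (the abstract does not specify the resampling scheme).

```python
import random
import statistics


def pls1_one_component(X, y):
    """Fit a one-component PLS1 model; return (coefficients, intercept).

    Assumption: a single latent variable, chosen only to keep the sketch short.
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximal covariance between X and y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5 or 1.0
    w = [v / norm for v in w]
    # Scores t = Xc w, then regress the centered response on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / (sum(ti * ti for ti in t) or 1.0)
    coef = [q * wj for wj in w]
    intercept = y_mean - sum(c * m for c, m in zip(coef, x_mean))
    return coef, intercept


def ensemble_pls(X, y, n_models=200, train_frac=0.8, seed=0):
    """Build many PLS sub-models on random subsamples (assumed scheme).

    Returns per-descriptor coefficient mean/sd and per-sample mean/sd of the
    prediction errors collected whenever a sample was left out of training.
    """
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    k = int(train_frac * n)
    coefs = []
    errors = [[] for _ in range(n)]
    for _ in range(n_models):
        idx = list(range(n))
        rng.shuffle(idx)
        train, test = idx[:k], idx[k:]
        coef, b0 = pls1_one_component([X[i] for i in train], [y[i] for i in train])
        coefs.append(coef)
        for i in test:
            pred = b0 + sum(c * x for c, x in zip(coef, X[i]))
            errors[i].append(y[i] - pred)
    nan = float("nan")
    return {
        "coef_mean": [statistics.mean(c[j] for c in coefs) for j in range(p)],
        "coef_sd": [statistics.stdev(c[j] for c in coefs) for j in range(p)],
        "err_mean": [statistics.mean(e) if e else nan for e in errors],
        "err_sd": [statistics.stdev(e) if len(e) > 1 else nan for e in errors],
    }
```

Under these assumptions, a descriptor whose coefficient mean is large relative to its standard deviation would be ranked as influential. A sample whose left-out prediction errors have an unusually large mean or spread would be a candidate outlier, or would lie outside the applicability domain. Averaging the sub-model predictions gives the ensemble prediction.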

Copyright © 2018 John Wiley & Sons, Inc. All Rights Reserved