Combating small molecule aggregation with machine learning

Kuan Lee; Ann Yang; Yen-Chu Lin; Daniel Reker; Goncalo J. L. Bernardes; Tiago Rodrigues

Quantitative Methods 2021-05-04 v1 Machine Learning

Authors: Kuan Lee, Ann Yang, Yen-Chu Lin, Daniel Reker, Goncalo J. L. Bernardes, Tiago Rodrigues

Abstract

Biological screens are plagued by false-positive hits resulting from aggregation. Thus, methods to triage small colloidally aggregating molecules (SCAMs) are in high demand. Herein, we disclose a bespoke machine-learning tool to confidently and intelligibly flag such entities. Our data demonstrate an unprecedented utility of machine learning for predicting SCAMs, achieving 80% correct predictions in a challenging out-of-sample validation. The tool outperformed a panel of expert chemists, who correctly predicted 61 ± 7% of the same test molecules in a Turing-like test. Further, the computational routine provided insight into molecular features governing aggregation that had remained hidden to expert intuition. Leveraging our tool, we estimate that up to 15-20% of ligands in publicly available chemogenomic databases have a high potential to aggregate at typical screening concentrations, which calls for caution in systems biology and drug design programs. Our approach provides a means to augment human intuition and mitigate attrition, and a pathway to accelerate future molecular medicine.

Keywords

machine learning in genomics; computational drug discovery; gene expression analysis

Cite

@article{arxiv.2105.00267,
  title  = {Combating small molecule aggregation with machine learning},
  author = {Kuan Lee and Ann Yang and Yen-Chu Lin and Daniel Reker and Goncalo J. L. Bernardes and Tiago Rodrigues},
  journal= {arXiv preprint arXiv:2105.00267},
  year   = {2021}
}
