Related papers: Split Optimization for Protein/Ligand Binding Mode…
Undetected overfitting can occur when there are significant redundancies between training and validation data. We describe AVE, a new measure of training-validation redundancy for ligand-based classification problems that accounts for the…
Accurate prediction of protein-ligand binding affinities is crucial for drug development. Recent advances in machine learning show promising results on this task. However, these methods typically rely heavily on labeled data, which can be…
Protein-ligand complex structures have been utilised to design benchmark machine learning methods that perform important tasks related to drug design such as receptor binding site detection, small molecule docking and binding affinity…
Accurate prediction of protein-ligand binding affinity is crucial for rapid and efficient drug development. Recently, the importance of predicting binding affinity has led to increased attention on research that models the three-dimensional…
The accurate screening of candidate drug ligands against target proteins through computational approaches is of prime interest to drug development efforts. Such virtual screening depends in part on methods to predict the binding affinity…
Drug-target binding affinity prediction is a fundamental task for drug discovery. It has been extensively explored in literature and promising results are reported. However, in this paper, we demonstrate that the results may be misleading…
Algorithmic bias is of increasing concern, both to the research community, and society at large. Bias in AI is more abstract and unintuitive than traditional forms of discrimination and can be more difficult to detect and mitigate. A clear…
Protein-ligand binding prediction is a fundamental problem in AI-driven drug discovery. Prior work focused on supervised learning methods using a large set of binding affinity data for small molecules, but it is hard to apply the same…
Structure based ligand discovery is one of the most successful approaches for augmenting the drug discovery process. Currently, there is a notable shift towards machine learning (ML) methodologies to aid such procedures. Deep learning has…
Protein-ligand binding is the process by which a small molecule (drug or inhibitor) attaches to a target protein. Binding affinity, which characterizes the strength of biomolecular interactions, is essential for tackling diverse challenges…
Predicting how a drug-like molecule binds to a specific protein target is a core problem in drug discovery. An extremely fast computational binding method would enable key applications such as fast virtual screening or drug engineering.…
Active learning is a powerful tool when labelling data is expensive, but it introduces a bias because the training data no longer follows the population distribution. We formalize this bias and investigate the situations in which it can be…
Classifiers are biased when trained on biased datasets. As a remedy, we propose Learning to Split (ls), an algorithm for automatic bias detection. Given a dataset with input-label pairs, ls learns to split this dataset so that predictors…
Over the last years, word and sentence embeddings have established as text preprocessing for all kinds of NLP tasks and improved the performances significantly. Unfortunately, it has also been shown that these embeddings inherit various…
We investigate the finite sample performance of sample splitting, cross-fitting and averaging for the estimation of the conditional average treatment effect. Recently proposed methods, so-called meta-learners, make use of machine learning…
Body shape plays an important role in determining what garments will best suit a given person, yet today's clothing recommendation methods take a "one shape fits all" approach. These body-agnostic vision methods and datasets are a barrier…
Automatic Differentiation Variational Inference (ADVI) is a useful tool for efficiently learning probabilistic models in machine learning. Generally approximate posteriors learned by ADVI are forced to be unimodal in order to facilitate use…
Given a supervised machine learning problem where the training set has been subject to a known sampling bias, how can a model be trained to fit the original dataset? We achieve this through the Bayesian inference framework by altering the…
Deep metric learning has been demonstrated to be highly effective in learning semantic representation and encoding information that can be used to measure data similarity, by relying on the embedding learned from metric learning. At the…
Deep neural networks have achieved state of the art accuracy at classifying molecules with respect to whether they bind to specific protein targets. A key breakthrough would occur if these models could reveal the fragment pharmacophores…