Adaptive Sampling for Discovery
Abstract
In this paper, we study a sequential decision-making problem, called Adaptive Sampling for Discovery (ASD). Starting with a large unlabeled dataset, algorithms for ASD adaptively label the points with the goal to maximize the sum of responses. This problem has wide applications to real-world discovery problems, for example drug discovery with the help of machine learning models. ASD algorithms face the well-known exploration-exploitation dilemma. The algorithm needs to choose points that yield information to improve model estimates but it also needs to exploit the model. We rigorously formulate the problem and propose a general information-directed sampling (IDS) algorithm. We provide theoretical guarantees for the performance of IDS in linear, graph and low-rank models. The benefits of IDS are shown in both simulation experiments and real-data experiments for discovering chemical reaction conditions.
Cite
@article{arxiv.2205.14829,
title = {Adaptive Sampling for Discovery},
author = {Ziping Xu and Eunjae Shim and Ambuj Tewari and Paul Zimmerman},
journal= {arXiv preprint arXiv:2205.14829},
year = {2023}
}