Bandit Data-Driven Optimization

Machine Learning | 2022-01-19 | v2 | Artificial Intelligence, Computers and Society, Machine Learning

Abstract

Applications of machine learning in the non-profit and public sectors often feature an iterative workflow of data acquisition, prediction, and optimization of interventions. There are four major pain points that a machine learning pipeline must overcome in order to be useful in these settings: small data, data collected only under the default intervention, unmodeled objectives due to a communication gap, and unforeseen consequences of the intervention. In this paper, we introduce bandit data-driven optimization, the first iterative prediction-prescription framework to address these pain points. Bandit data-driven optimization combines the advantages of online bandit learning and offline predictive analytics in an integrated framework. We propose PROOF, a novel algorithm for this framework, and formally prove that it is no-regret. Using numerical simulations, we show that PROOF achieves better performance than the existing baseline. We also apply PROOF in a detailed case study of food rescue volunteer recommendation, and show that PROOF as a framework works well with the intricacies of ML models in real-world AI applications for the non-profit and public sectors.

Cite

@article{arxiv.2008.11707,
  title  = {Bandit Data-Driven Optimization},
  author = {Zheyuan Ryan Shi and Zhiwei Steven Wu and Rayid Ghani and Fei Fang},
  journal= {arXiv preprint arXiv:2008.11707},
  year   = {2022}
}

Comments

This is the complete version of the paper. A version of this paper is also published at AAAI-22.
