Related papers: Note on Thompson sampling for large decision probl…

A Tutorial on Thompson Sampling

Thompson sampling is an algorithm for online decision problems where actions are taken sequentially in a manner that must balance between exploiting what is known to maximize immediate performance and investing to accumulate new information…

Machine Learning · Computer Science 2020-07-16 Daniel Russo, Benjamin Van Roy, Abbas Kazerouni, Ian Osband, Zheng Wen

Ensemble Sampling

Thompson sampling has emerged as an effective heuristic for a broad range of online decision problems. In its basic form, the algorithm requires computing and sampling from a posterior distribution over models, which is tractable only for…

Machine Learning · Statistics 2023-04-26 Xiuyuan Lu, Benjamin Van Roy

Thompson Sampling with Virtual Helping Agents

We address the problem of online sequential decision making, i.e., balancing the trade-off between exploiting the current knowledge to maximize immediate performance and exploring the new information to gain long-term benefits using the…

Machine Learning · Computer Science 2022-09-20 Kartik Anand Pant, Amod Hegde, K. V. Srinivas

An Information-Theoretic Analysis of Thompson Sampling

We provide an information-theoretic analysis of Thompson sampling that applies across a broad range of online optimization problems in which a decision-maker must learn from partial feedback. This analysis inherits the simplicity and…

Machine Learning · Computer Science 2015-06-09 Daniel Russo, Benjamin Van Roy

Generalized Thompson Sampling for Sequential Decision-Making and Causal Inference

Recently, it has been shown how sampling actions from the predictive distribution over the optimal action (sometimes called Thompson sampling) can be applied to solve sequential adaptive control problems, when the optimal policy is known for…

Artificial Intelligence · Computer Science 2014-09-24 Pedro A. Ortega, Daniel A. Braun

Buying Data Over Time: Approximately Optimal Strategies for Dynamic Data-Driven Decisions

We consider a model where an agent has a repeated decision to make and wishes to maximize their total payoff. Payoffs are influenced by an action taken by the agent, but also an unknown state of the world that evolves over time. Before…

Computer Science and Game Theory · Computer Science 2021-01-20 Nicole Immorlica, Ian Kash, Brendan Lucier

Parallelizing Thompson Sampling

How can we make use of information parallelism in online decision making problems while efficiently balancing the exploration-exploitation trade-off? In this paper, we introduce a batch Thompson Sampling framework for two canonical online…

Machine Learning · Computer Science 2021-06-04 Amin Karbasi, Vahab Mirrokni, Mohammad Shadravan

Adaptive Model Selection Framework: An Application to Airline Pricing

Multiple machine learning and prediction models are often used for the same prediction or recommendation task. In our recent work, where we develop and deploy airline ancillary pricing models in an online setting, we found that among…

Machine Learning · Computer Science 2019-05-23 Naman Shukla, Arinbjörn Kolbeinsson, Lavanya Marla, Kartik Yellepeddi

Data assimilation and online optimization with performance guarantees

This paper considers a class of real-time stochastic optimization problems dependent on an unknown probability distribution. In the considered scenario, data is streaming frequently while trying to reach a decision. Thus, we aim to devise a…

Optimization and Control · Mathematics 2020-09-08 Dan Li, Sonia Martinez

Online data assimilation in distributionally robust optimization

This paper considers a class of real-time decision making problems to minimize the expected value of a function that depends on a random variable $\xi$ under an unknown distribution $\mathbb{P}$. In this process, samples of $\xi$ are…

Optimization and Control · Mathematics 2020-09-08 Dan Li, Sonia Martinez

Online Learning of Decision Trees with Thompson Sampling

Decision Trees are prominent prediction models for interpretable Machine Learning. They have been thoroughly researched, mostly in the batch setting with a fixed labelled dataset, leading to popular algorithms such as C4.5, ID3 and CART.…

Machine Learning · Computer Science 2024-06-24 Ayman Chaouki, Jesse Read, Albert Bifet

Stochastic Choice and Optimal Sequential Sampling

We model the joint distribution of choice probabilities and decision times in binary choice tasks as the solution to a problem of optimal sequential sampling, where the agent is uncertain of the utility of each action and pays a constant…

Neurons and Cognition · Quantitative Biology 2015-05-14 Drew Fudenberg, Philipp Strack, Tomasz Strzalecki

Top-Two Thompson Sampling for Contextual Top-$m_c$ Selection Problems

We aim to efficiently allocate a fixed simulation budget to identify the top-$m_c$ designs for each context among a finite number of contexts. The performance of each design under a context is measured by an identifiable statistical…

Methodology · Statistics 2023-07-03 Xinbo Shi, Yijie Peng, Gongbo Zhang

Analysis of Thompson Sampling for Controlling Unknown Linear Diffusion Processes

Linear diffusion processes serve as canonical continuous-time models for dynamic decision-making under uncertainty. These systems evolve according to drift matrices that specify the instantaneous rates of change in the expected system…

Machine Learning · Computer Science 2025-06-10 Mohamad Kazem Shirani Faradonbeh, Sadegh Shirani, Mohsen Bayati

Addressing Missing Data Issue for Diffusion-based Recommendation

Diffusion models have shown significant potential in generating oracle items that best match user preference with guidance from user historical interaction sequences. However, the quality of guidance is often compromised by unpredictable…

Information Retrieval · Computer Science 2025-05-20 Wenyu Mao, Zhengyi Yang, Jiancan Wu, Haozhe Liu, Yancheng Yuan, Xiang Wang, Xiangnan He

Thompson Sampling for a Fatigue-aware Online Recommendation System

In this paper we consider an online recommendation setting, where a platform recommends a sequence of items to its users at every time period. The users respond by selecting one of the items recommended or abandon the platform due to…

Machine Learning · Computer Science 2019-04-16 Yunjuan Wang, Theja Tulabandhula

Ranking and Selection with Simultaneous Input Data Collection

In this paper, we propose a general and novel formulation of ranking and selection with the existence of streaming input data. The collection of multiple streams of such data may consume different types of resources, and hence can be…

Machine Learning · Statistics 2025-03-18 Yuhao Wang, Enlu Zhou

Thompson sampling for improved exploration in GFlowNets

Generative flow networks (GFlowNets) are amortized variational inference algorithms that treat sampling from a distribution over compositional objects as a sequential decision-making problem with a learnable action policy. Unlike other…

Machine Learning · Computer Science 2023-07-03 Jarrid Rector-Brooks, Kanika Madan, Moksh Jain, Maksym Korablyov, Cheng-Hao Liu, Sarath Chandar, Nikolay Malkin, Yoshua Bengio

Thompson Sampling for Repeated Newsvendor

In this paper, we investigate the performance of Thompson Sampling (TS) for online learning with censored feedback, focusing primarily on the classic repeated newsvendor model, a foundational framework in inventory management, and…

Machine Learning · Computer Science 2026-01-19 Li Chen, Hanzhang Qin, Yunbei Xu, Ruihao Zhu, Weizhou Zhang

Scalable Policy Maximization Under Network Interference

Many interventions, such as vaccines in clinical trials or coupons in online marketplaces, must be assigned sequentially without full knowledge of their effects. Multi-armed bandit algorithms have proven successful in such settings.…

Machine Learning · Statistics 2026-05-07 Aidan Gleich, Eric Laber, Alexander Volfovsky