Related papers: Linear Bandit algorithms using the Bootstrap

Residual Bootstrap Exploration for Stochastic Linear Bandit

We propose a new bootstrap-based online algorithm for stochastic linear bandit problems. The key idea is to adopt residual bootstrap exploration, in which the agent estimates the next step reward by re-sampling the residuals of mean reward…

Machine Learning · Statistics 2022-06-20 Shuang Wu, Chi-Hua Wang, Yuantong Li, Guang Cheng

Information Directed Sampling and Bandits with Heteroscedastic Noise

In the stochastic bandit problem, the goal is to maximize an unknown function via a sequence of noisy evaluations. Typically, the observation noise is assumed to be independent of the evaluation point and to satisfy a tail bound uniformly…

Machine Learning · Statistics 2018-04-20 Johannes Kirschner, Andreas Krause

Linear Bandits with Non-i.i.d. Noise

We study the linear stochastic bandit problem, relaxing the standard i.i.d. assumption on the observation noise. As an alternative to this restrictive assumption, we allow the noise terms across rounds to be sub-Gaussian but interdependent,…

Machine Learning · Statistics 2025-05-28 Baptiste Abélès, Eugenio Clerico, Hamish Flynn, Gergely Neu

Bandit algorithms to emulate human decision making using probabilistic distortions

Motivated by models of human decision making proposed to explain commonly observed deviations from conventional expected value preferences, we formulate two stochastic multi-armed bandit problems with distorted probabilities on the reward…

Machine Learning · Computer Science 2023-11-01 Ravi Kumar Kolla, Prashanth L. A., Aditya Gopalan, Krishna Jagannathan, Michael Fu, Steve Marcus

Bayesian Bandit Algorithms with Approximate Inference in Stochastic Linear Bandits

Bayesian bandit algorithms with approximate Bayesian inference have been widely used in real-world applications. Despite the superior practical performance, their theoretical justification is less investigated in the literature, especially…

Machine Learning · Statistics 2025-05-23 Ziyi Huang, Henry Lam, Haofeng Zhang

Improved Algorithms for Stochastic Linear Bandits Using Tail Bounds for Martingale Mixtures

We present improved algorithms with worst-case regret guarantees for the stochastic linear bandit problem. The widely used "optimism in the face of uncertainty" principle reduces a stochastic bandit problem to the construction of a…

Machine Learning · Statistics 2024-09-06 Hamish Flynn, David Reeb, Melih Kandemir, Jan Peters

Misspecified Linear Bandits

We consider the problem of online learning in misspecified linear stochastic multi-armed bandit problems. Regret guarantees for state-of-the-art linear bandit algorithms such as Optimism in the Face of Uncertainty Linear bandit (OFUL) hold…

Machine Learning · Computer Science 2017-04-25 Avishek Ghosh, Sayak Ray Chowdhury, Aditya Gopalan

Bootstrapping Upper Confidence Bound

The Upper Confidence Bound (UCB) method is arguably the most celebrated one used in online decision making with partial information feedback. Existing techniques for constructing confidence bounds are typically built upon various concentration…

Machine Learning · Statistics 2019-11-01 Botao Hao, Yasin Abbasi-Yadkori, Zheng Wen, Guang Cheng

Model Selection in Contextual Stochastic Bandit Problems

We study bandit model selection in stochastic environments. Our approach relies on a meta-algorithm that selects between candidate base algorithms. We develop a meta-algorithm-base algorithm abstraction that can work with general classes of…

Machine Learning · Computer Science 2022-12-06 Aldo Pacchiano, My Phan, Yasin Abbasi-Yadkori, Anup Rao, Julian Zimmert, Tor Lattimore, Csaba Szepesvari

Upper Confidence Bounds for Combining Stochastic Bandits

We provide a simple method to combine stochastic bandit algorithms. Our approach is based on a "meta-UCB" procedure that treats each of $N$ individual bandit algorithms as arms in a higher-level $N$-armed bandit problem that we solve with a…

Machine Learning · Computer Science 2020-12-25 Ashok Cutkosky, Abhimanyu Das, Manish Purohit

Geometry-Aware Approaches for Balancing Performance and Theoretical Guarantees in Linear Bandits

This paper is motivated by recent research in the $d$-dimensional stochastic linear bandit literature, which has revealed an unsettling discrepancy: algorithms like Thompson sampling and Greedy demonstrate promising empirical performance,…

Machine Learning · Computer Science 2025-05-20 Yuwei Luo, Mohsen Bayati

Meta-learning with Stochastic Linear Bandits

We investigate meta-learning procedures in the setting of stochastic linear bandit tasks. The goal is to select a learning algorithm that works well on average over a class of bandit tasks sampled from a task distribution.…

Machine Learning · Statistics 2020-05-19 Leonardo Cella, Alessandro Lazaric, Massimiliano Pontil

Linear Stochastic Bandits Under Safety Constraints

Bandit algorithms have various applications in safety-critical systems, where it is important to respect the system constraints that rely on the bandit's unknown parameters at every round. In this paper, we formulate a linear stochastic…

Machine Learning · Computer Science 2019-08-19 Sanae Amani, Mahnoosh Alizadeh, Christos Thrampoulidis

Directional Optimism for Safe Linear Bandits

The safe linear bandit problem is a version of the classical stochastic linear bandit problem where the learner's actions must satisfy an uncertain constraint at all rounds. Due to its applicability to many real-world settings, this problem…

Machine Learning · Computer Science 2024-03-13 Spencer Hutchinson, Berkay Turan, Mahnoosh Alizadeh

A General Theory of the Stochastic Linear Bandit and Its Applications

Recent growing adoption of experimentation in practice has led to a surge of attention to multiarmed bandits as a technique to reduce the opportunity cost of online experiments. In this setting, a decision-maker sequentially chooses among a…

Machine Learning · Computer Science 2022-04-04 Nima Hamidi, Mohsen Bayati

Safe Linear Thompson Sampling with Side Information

The design and performance analysis of bandit algorithms in the presence of stage-wise safety or reliability constraints has recently garnered significant interest. In this work, we consider the linear stochastic bandit problem under…

Machine Learning · Computer Science 2020-03-03 Ahmadreza Moradipari, Sanae Amani, Mahnoosh Alizadeh, Christos Thrampoulidis

Bandit Linear Control

We consider the problem of controlling a known linear dynamical system under stochastic noise, adversarially chosen costs, and bandit feedback. Unlike the full feedback setting where the entire cost function is revealed after each decision,…

Machine Learning · Computer Science 2020-07-03 Asaf Cassel, Tomer Koren

Linear Bandits with Stochastic Delayed Feedback

Stochastic linear bandits are a natural and well-studied model for structured exploration/exploitation problems and are widely used in applications such as online marketing and recommendation. One of the main challenges faced by…

Machine Learning · Statistics 2020-03-03 Claire Vernade, Alexandra Carpentier, Tor Lattimore, Giovanni Zappella, Beyza Ermis, Michael Brueckner

Structured Stochastic Linear Bandits

The stochastic linear bandit problem proceeds in rounds where at each round the algorithm selects a vector from a decision set after which it receives a noisy linear loss parameterized by an unknown vector. The goal in such a problem is to…

Machine Learning · Statistics 2016-06-21 Nicholas Johnson, Vidyashankar Sivakumar, Arindam Banerjee

Revisiting Weighted Strategy for Non-stationary Parametric Bandits

Non-stationary parametric bandits have attracted much attention recently. There are three principled ways to deal with non-stationarity, including sliding-window, weighted, and restart strategies. As many non-stationary environments exhibit…

Machine Learning · Computer Science 2023-06-08 Jing Wang, Peng Zhao, Zhi-Hua Zhou