Related papers: Linear Bandit algorithms using the Bootstrap

200 papers

We propose a new bootstrap-based online algorithm for stochastic linear bandit problems. The key idea is to adopt residual bootstrap exploration, in which the agent estimates the next step reward by re-sampling the residuals of mean reward…

Machine Learning · Statistics 2022-06-20 Shuang Wu, Chi-Hua Wang, Yuantong Li, Guang Cheng

In the stochastic bandit problem, the goal is to maximize an unknown function via a sequence of noisy evaluations. Typically, the observation noise is assumed to be independent of the evaluation point and to satisfy a tail bound uniformly…

Machine Learning · Statistics 2018-04-20 Johannes Kirschner, Andreas Krause

We study the linear stochastic bandit problem, relaxing the standard i.i.d. assumption on the observation noise. As an alternative to this restrictive assumption, we allow the noise terms across rounds to be sub-Gaussian but interdependent,…

Machine Learning · Statistics 2025-05-28 Baptiste Abélès, Eugenio Clerico, Hamish Flynn, Gergely Neu

Motivated by models of human decision making proposed to explain commonly observed deviations from conventional expected value preferences, we formulate two stochastic multi-armed bandit problems with distorted probabilities on the reward…

Machine Learning · Computer Science 2023-11-01 Ravi Kumar Kolla, Prashanth L. A., Aditya Gopalan, Krishna Jagannathan, Michael Fu, Steve Marcus

Bayesian bandit algorithms with approximate Bayesian inference have been widely used in real-world applications. Despite the superior practical performance, their theoretical justification is less investigated in the literature, especially…

Machine Learning · Statistics 2025-05-23 Ziyi Huang, Henry Lam, Haofeng Zhang

We present improved algorithms with worst-case regret guarantees for the stochastic linear bandit problem. The widely used "optimism in the face of uncertainty" principle reduces a stochastic bandit problem to the construction of a…

Machine Learning · Statistics 2024-09-06 Hamish Flynn, David Reeb, Melih Kandemir, Jan Peters

We consider the problem of online learning in misspecified linear stochastic multi-armed bandit problems. Regret guarantees for state-of-the-art linear bandit algorithms such as Optimism in the Face of Uncertainty Linear bandit (OFUL) hold…

Machine Learning · Computer Science 2017-04-25 Avishek Ghosh, Sayak Ray Chowdhury, Aditya Gopalan

The Upper Confidence Bound (UCB) method is arguably the most celebrated one used in online decision making with partial information feedback. Existing techniques for constructing confidence bounds are typically built upon various concentration…

Machine Learning · Statistics 2019-11-01 Botao Hao, Yasin Abbasi-Yadkori, Zheng Wen, Guang Cheng

We study bandit model selection in stochastic environments. Our approach relies on a meta-algorithm that selects between candidate base algorithms. We develop a meta-algorithm-base algorithm abstraction that can work with general classes of…

Machine Learning · Computer Science 2022-12-06 Aldo Pacchiano, My Phan, Yasin Abbasi-Yadkori, Anup Rao, Julian Zimmert, Tor Lattimore, Csaba Szepesvari

We provide a simple method to combine stochastic bandit algorithms. Our approach is based on a "meta-UCB" procedure that treats each of $N$ individual bandit algorithms as arms in a higher-level $N$-armed bandit problem that we solve with a…

Machine Learning · Computer Science 2020-12-25 Ashok Cutkosky, Abhimanyu Das, Manish Purohit

This paper is motivated by recent research in the $d$-dimensional stochastic linear bandit literature, which has revealed an unsettling discrepancy: algorithms like Thompson sampling and Greedy demonstrate promising empirical performance,…

Machine Learning · Computer Science 2025-05-20 Yuwei Luo, Mohsen Bayati

We investigate meta-learning procedures in the setting of stochastic linear bandit tasks. The goal is to select a learning algorithm which works well on average over a class of bandit tasks that are sampled from a task-distribution.…

Machine Learning · Statistics 2020-05-19 Leonardo Cella, Alessandro Lazaric, Massimiliano Pontil

Bandit algorithms have various applications in safety-critical systems, where it is important to respect the system constraints that rely on the bandit's unknown parameters at every round. In this paper, we formulate a linear stochastic…

Machine Learning · Computer Science 2019-08-19 Sanae Amani, Mahnoosh Alizadeh, Christos Thrampoulidis

The safe linear bandit problem is a version of the classical stochastic linear bandit problem where the learner's actions must satisfy an uncertain constraint at all rounds. Due to its applicability to many real-world settings, this problem…

Machine Learning · Computer Science 2024-03-13 Spencer Hutchinson, Berkay Turan, Mahnoosh Alizadeh

The recent growing adoption of experimentation in practice has led to a surge of attention to multi-armed bandits as a technique to reduce the opportunity cost of online experiments. In this setting, a decision-maker sequentially chooses among a…

Machine Learning · Computer Science 2022-04-04 Nima Hamidi, Mohsen Bayati

The design and performance analysis of bandit algorithms in the presence of stage-wise safety or reliability constraints has recently garnered significant interest. In this work, we consider the linear stochastic bandit problem under…

Machine Learning · Computer Science 2020-03-03 Ahmadreza Moradipari, Sanae Amani, Mahnoosh Alizadeh, Christos Thrampoulidis

We consider the problem of controlling a known linear dynamical system under stochastic noise, adversarially chosen costs, and bandit feedback. Unlike the full feedback setting where the entire cost function is revealed after each decision,…

Machine Learning · Computer Science 2020-07-03 Asaf Cassel, Tomer Koren

Stochastic linear bandits are a natural and well-studied model for structured exploration/exploitation problems and are widely used in applications such as online marketing and recommendation. One of the main challenges faced by…

The stochastic linear bandit problem proceeds in rounds where at each round the algorithm selects a vector from a decision set after which it receives a noisy linear loss parameterized by an unknown vector. The goal in such a problem is to…

Machine Learning · Statistics 2016-06-21 Nicholas Johnson, Vidyashankar Sivakumar, Arindam Banerjee

Non-stationary parametric bandits have attracted much attention recently. There are three principled ways to deal with non-stationarity, including sliding-window, weighted, and restart strategies. As many non-stationary environments exhibit…

Machine Learning · Computer Science 2023-06-08 Jing Wang, Peng Zhao, Zhi-Hua Zhou