Related papers: A Novel Confidence-Based Algorithm for Structured …

Bounded Regret for Finite-Armed Structured Bandits

We study a new type of K-armed bandit problem where the expected return of one arm may depend on the returns of other arms. We present a new algorithm for this general class of problems and show that under certain circumstances it is…

Machine Learning · Computer Science 2014-11-12 Tor Lattimore, Remi Munos

Structure Adaptive Algorithms for Stochastic Bandits

We study reward maximisation in a wide class of structured stochastic multi-armed bandit problems, where the mean rewards of arms satisfy some given structural constraints, e.g. linear, unimodal, sparse, etc. Our aim is to develop methods…

Machine Learning · Statistics 2020-07-03 Rémy Degenne, Han Shao, Wouter M. Koolen

Optimally Confident UCB: Improved Regret for Finite-Armed Bandits

I present the first algorithm for stochastic finite-armed bandits that simultaneously enjoys order-optimal problem-dependent regret and worst-case regret. Besides the theoretical results, the new algorithm is simple, efficient and…

Machine Learning · Computer Science 2016-02-25 Tor Lattimore

From Finite to Countable-Armed Bandits

We consider a stochastic bandit problem with countably many arms that belong to a finite set of types, each characterized by a unique mean reward. In addition, there is a fixed distribution over types which sets the proportion of each type…

Machine Learning · Computer Science 2021-05-25 Anand Kalvit, Assaf Zeevi

Thompson Sampling for Bandits with Clustered Arms

We propose algorithms based on a multi-level Thompson sampling scheme, for the stochastic multi-armed bandit and its contextual variant with linear expected rewards, in the setting where arms are clustered. We show, both theoretically and…

Machine Learning · Computer Science 2022-06-16 Emil Carlsson, Devdatt Dubhashi, Fredrik D. Johansson

Experimental Design for Semiparametric Bandits

We study finite-armed semiparametric bandits, where each arm's reward combines a linear component with an unknown, potentially adversarial shift. This model strictly generalizes classical linear bandits and reflects complexities common in…

Machine Learning · Statistics 2025-06-18 Seok-Jin Kim, Gi-Soo Kim, Min-hwan Oh

Sparse Stochastic Bandits

In the classical multi-armed bandit problem, d arms are available to the decision maker who pulls them sequentially in order to maximize his cumulative reward. Guarantees can be obtained on a relative quantity called regret, which scales…

Machine Learning · Computer Science 2017-06-06 Joon Kwon, Vianney Perchet, Claire Vernade

Simple regret for infinitely many armed bandits

We consider a stochastic bandit problem with infinitely many arms. In this setting, the learner has no chance of trying all the arms even once and has to dedicate its limited number of samples only to a certain number of arms. All previous…

Machine Learning · Computer Science 2015-05-19 Alexandra Carpentier, Michal Valko

More Adaptive Algorithms for Adversarial Bandits

We develop a novel and generic algorithm for the adversarial multi-armed bandit problem (or more generally the combinatorial semi-bandit problem). When instantiated differently, our algorithm achieves various new data-dependent regret…

Machine Learning · Computer Science 2018-06-08 Chen-Yu Wei, Haipeng Luo

Regret Bounds for Batched Bandits

We present simple and efficient algorithms for the batched stochastic multi-armed bandit and batched stochastic linear bandit problems. We prove bounds for their expected regrets that improve over the best-known regret bounds for any number…

Data Structures and Algorithms · Computer Science 2020-02-19 Hossein Esfandiari, Amin Karbasi, Abbas Mehrabian, Vahab Mirrokni

Bandit Regret Scaling with the Effective Loss Range

We study how the regret guarantees of nonstochastic multi-armed bandits can be improved, if the effective range of the losses in each round is small (e.g. the maximal difference between two losses in a given round). Despite a recent…

Machine Learning · Computer Science 2020-01-03 Nicolò Cesa-Bianchi, Ohad Shamir

Stochastic Bandits Robust to Adversarial Attacks

This paper investigates stochastic multi-armed bandit algorithms that are robust to adversarial attacks, where an attacker can first observe the learner's action and then alter their reward observation. We study two cases of this model,…

Machine Learning · Computer Science 2024-08-19 Xuchuang Wang, Jinhang Zuo, Xutong Liu, John C. S. Lui, Mohammad Hajiesmaili

Corralling Stochastic Bandit Algorithms

We study the problem of corralling stochastic bandit algorithms, that is combining multiple bandit algorithms designed for a stochastic environment, with the goal of devising a corralling algorithm that performs almost as well as the best…

Machine Learning · Computer Science 2021-03-02 Raman Arora, Teodor V. Marinov, Mehryar Mohri

On Slowly-varying Non-stationary Bandits

We consider minimisation of dynamic regret in non-stationary bandits with a slowly varying property. Namely, we assume that arms' rewards are stochastic and independent over time, but that the absolute difference between the expected…

Machine Learning · Computer Science 2021-10-26 Ramakrishnan Krishnamurthy, Aditya Gopalan

Unimodal Bandits: Regret Lower Bounds and Optimal Algorithms

We consider stochastic multi-armed bandits where the expected reward is a unimodal function over partially ordered arms. This important class of problems has been recently investigated in (Cope 2009, Yu 2011). The set of arms is either…

Machine Learning · Computer Science 2014-05-21 Richard Combes, Alexandre Proutiere

Rising Rested Bandits: Lower Bounds and Efficient Algorithms

This paper is in the field of stochastic Multi-Armed Bandits (MABs), i.e. those sequential selection techniques able to learn online using only the feedback given by the chosen option (a.k.a. arm). We study a particular case of the rested…

Machine Learning · Statistics 2024-11-28 Marco Fiandri, Alberto Maria Metelli, Francesco Trovò

Factored Bandits

We introduce the factored bandits model, which is a framework for learning with limited (bandit) feedback, where actions can be decomposed into a Cartesian product of atomic actions. Factored bandits incorporate rank-1 bandits as a special…

Machine Learning · Computer Science 2018-10-30 Julian Zimmert, Yevgeny Seldin

Algorithms for Linear Bandits on Polyhedral Sets

We study the stochastic linear optimization problem with bandit feedback. The arms take values in an $N$-dimensional space and belong to a bounded polyhedron described by finitely many linear inequalities. We provide a lower bound for…

Machine Learning · Computer Science 2015-09-29 Manjesh K. Hanawal, Amir Leshem, Venkatesh Saligrama

Predictive Bandits

We introduce and study a new class of stochastic bandit problems, referred to as predictive bandits. In each round, the decision maker first decides whether to gather information about the rewards of particular arms (so that their rewards…

Machine Learning · Computer Science 2020-04-03 Simon Lindståhl, Alexandre Proutiere, Andreas Johnsson

Online Model Selection: a Rested Bandit Formulation

Motivated by a natural problem in online model selection with bandit information, we introduce and analyze a best arm identification problem in the rested bandit setting, wherein arm expected losses decrease with the number of times the arm…

Machine Learning · Statistics 2020-12-08 Leonardo Cella, Claudio Gentile, Massimiliano Pontil