Related papers: Algorithm Design and Stronger Guarantees for the I…

Nearly-tight Approximation Guarantees for the Improving Multi-Armed Bandits Problem

We give nearly-tight upper and lower bounds for the improving multi-armed bandits problem. An instance of this problem has $k$ arms, each of whose reward function is a concave and increasing function of the number of times that arm has been…

Machine Learning · Computer Science 2024-04-02 Avrim Blum, Kavya Ravichandran

Optimal Multi-Objective Best Arm Identification with Fixed Confidence

We consider a multi-armed bandit setting with finitely many arms, in which each arm yields an $M$-dimensional vector reward upon selection. We assume that the reward of each dimension (a.k.a. objective) is generated independently of…

Machine Learning · Computer Science 2025-01-24 Zhirui Chen, P. N. Karthik, Yeow Meng Chee, Vincent Y. F. Tan

Quantile Multi-Armed Bandits: Optimal Best-Arm Identification and a Differentially Private Scheme

We study the best-arm identification problem in multi-armed bandits with stochastic, potentially private rewards, when the goal is to identify the arm with the highest quantile at a fixed, prescribed level. First, we propose a (non-private)…

Machine Learning · Statistics 2022-12-06 Kontantinos E. Nikolakakis, Dionysios S. Kalogerias, Or Sheffet, Anand D. Sarwate

Optimal Multi-Fidelity Best-Arm Identification

In bandit best-arm identification, an algorithm is tasked with finding the arm with highest mean reward with a specified accuracy as fast as possible. We study multi-fidelity best-arm identification, in which the algorithm can choose to…

Machine Learning · Computer Science 2025-05-27 Riccardo Poiani, Rémy Degenne, Emilie Kaufmann, Alberto Maria Metelli, Marcello Restelli

Optimal Best-arm Identification in Linear Bandits

We study the problem of best-arm identification with fixed confidence in stochastic linear bandits. The objective is to identify the best arm with a given level of certainty while minimizing the sampling budget. We devise a simple algorithm…

Machine Learning · Statistics 2020-06-30 Yassir Jedra, Alexandre Proutiere

Max-Min Grouped Bandits

In this paper, we introduce a multi-armed bandit problem termed max-min grouped bandits, in which the arms are arranged in possibly-overlapping groups, and the goal is to find the group whose worst arm has the highest mean reward. This…

Machine Learning · Statistics 2022-03-16 Zhenlin Wang, Jonathan Scarlett

Practical Algorithms for Best-K Identification in Multi-Armed Bandits

In the Best-$K$ identification problem (Best-$K$-Arm), we are given $N$ stochastic bandit arms with unknown reward distributions. Our goal is to identify the $K$ arms with the largest means with high confidence, by drawing samples from the…

Machine Learning · Computer Science 2017-05-22 Haotian Jiang, Jian Li, Mingda Qiao

Best Arm Identification in Stochastic Bandits: Beyond $\beta$-optimality

This paper investigates a hitherto unaddressed aspect of best arm identification (BAI) in stochastic multi-armed bandits in the fixed-confidence setting. Two key metrics for assessing bandit algorithms are computational efficiency and…

Machine Learning · Statistics 2023-06-26 Arpan Mukherjee, Ali Tajer

The Max $K$-Armed Bandit: A PAC Lower Bound and tighter Algorithms

We consider the Max $K$-Armed Bandit problem, where a learning agent is faced with several sources (arms) of items (rewards), and interested in finding the best item overall. At each time step the agent chooses an arm, and obtains a random…

Machine Learning · Statistics 2015-08-25 Yahel David, Nahum Shimkin

On Regret with Multiple Best Arms

We study a regret minimization problem with the existence of multiple best/near-optimal arms in the multi-armed bandit setting. We consider the case when the number of arms/actions is comparable or much larger than the time horizon, and…

Machine Learning · Statistics 2020-10-23 Yinglun Zhu, Robert Nowak

Best Arm Identification in Linked Bandits

We consider the problem of best arm identification in a variant of multi-armed bandits called linked bandits. In a single interaction with linked bandits, multiple arms are played sequentially until one of them receives a positive reward.…

Machine Learning · Computer Science 2019-01-29 Anant Gupta

Pure Exploration in Bandits with Linear Constraints

We address the problem of identifying the optimal policy with a fixed confidence level in a multi-armed bandit setup, when the arms are subject to linear constraints. Unlike the standard best-arm identification problem which is well…

Machine Learning · Computer Science 2024-01-26 Emil Carlsson, Debabrota Basu, Fredrik D. Johansson, Devdatt Dubhashi

Bandits with many optimal arms

We consider a stochastic bandit problem with a possibly infinite number of arms. We write $p^*$ for the proportion of optimal arms and $\Delta$ for the minimal mean-gap between optimal and sub-optimal arms. We characterize the optimal…

Machine Learning · Computer Science 2021-11-08 Rianne de Heide, James Cheshire, Pierre Ménard, Alexandra Carpentier

Statistically Robust, Risk-Averse Best Arm Identification in Multi-Armed Bandits

Traditional multi-armed bandit (MAB) formulations usually make certain assumptions about the underlying arms' distributions, such as bounds on the support or their tail behaviour. Moreover, such parametric information is usually 'baked'…

Machine Learning · Computer Science 2022-03-29 Anmol Kagrecha, Jayakrishnan Nair, Krishna Jagannathan

Asymptotically Optimal Algorithms for Budgeted Multiple Play Bandits

We study a generalization of the multi-armed bandit problem with multiple plays where there is a cost associated with pulling each arm and the agent has a budget at each time that dictates how much she can expect to spend. We derive an…

Machine Learning · Statistics 2019-09-13 Alexander Luedtke, Emilie Kaufmann, Antoine Chambaz

The Max $K$-Armed Bandit: PAC Lower Bounds and Efficient Algorithms

We consider the Max $K$-Armed Bandit problem, where a learning agent is faced with several stochastic arms, each a source of i.i.d. rewards of unknown distribution. At each time step the agent chooses an arm, and observes the reward of the…

Machine Learning · Statistics 2015-12-25 Yahel David, Nahum Shimkin

Bandit algorithms to emulate human decision making using probabilistic distortions

Motivated by models of human decision making proposed to explain commonly observed deviations from conventional expected value preferences, we formulate two stochastic multi-armed bandit problems with distorted probabilities on the reward…

Machine Learning · Computer Science 2023-11-01 Ravi Kumar Kolla, Prashanth L. A., Aditya Gopalan, Krishna Jagannathan, Michael Fu, Steve Marcus

Conservative Bandits

We study a novel multi-armed bandit problem that models the challenge faced by a company wishing to explore new strategies to maximize revenue whilst simultaneously maintaining their revenue above a fixed baseline, uniformly over time.…

Machine Learning · Statistics 2016-02-16 Yifan Wu, Roshan Shariff, Tor Lattimore, Csaba Szepesvári

Robust and Performance Incentivizing Algorithms for Multi-Armed Bandits with Strategic Agents

Motivated by applications such as online labor markets we consider a variant of the stochastic multi-armed bandit problem where we have a collection of arms representing strategic agents with different performance characteristics. The…

Computer Science and Game Theory · Computer Science 2025-03-11 Seyed A. Esmaeili, Suho Shin, Aleksandrs Slivkins

Best-Arm Identification in Linear Bandits

We study the best-arm identification problem in linear bandit, where the rewards of the arms depend linearly on an unknown parameter $\theta^*$ and the objective is to return the arm with the largest reward. We characterize the complexity…

Machine Learning · Computer Science 2014-11-05 Marta Soare, Alessandro Lazaric, Rémi Munos