Related papers: Non-stochastic Best Arm Identification and Hyperpa…

Optimal Best-arm Identification in Linear Bandits

We study the problem of best-arm identification with fixed confidence in stochastic linear bandits. The objective is to identify the best arm with a given level of certainty while minimizing the sampling budget. We devise a simple algorithm…

Machine Learning · Statistics 2020-06-30 Yassir Jedra, Alexandre Proutiere

Bandits with many optimal arms

We consider a stochastic bandit problem with a possibly infinite number of arms. We write $p^*$ for the proportion of optimal arms and $\Delta$ for the minimal mean-gap between optimal and sub-optimal arms. We characterize the optimal…

Machine Learning · Computer Science 2021-11-08 Rianne de Heide, James Cheshire, Pierre Ménard, Alexandra Carpentier

Best Arm Identification with Safety Constraints

The best arm identification problem in the multi-armed bandit setting is an excellent model of many real-world decision-making problems, yet it fails to capture the fact that in the real world, safety constraints often must be met while…

Machine Learning · Computer Science 2021-11-25 Zhenlin Wang, Andrew Wagenmaker, Kevin Jamieson

Optimal Best-Arm Identification under Fixed Confidence with Multiple Optima

We study best-arm identification in stochastic multi-armed bandits under the fixed-confidence setting, focusing on instances with multiple optimal arms. Unlike prior work that addresses the unknown-number-of-optimal-arms case, we consider…

Machine Learning · Computer Science 2026-03-05 Lan V. Truong

Best Arm Identification in Stochastic Bandits: Beyond $\beta$-optimality

This paper investigates a hitherto unaddressed aspect of best arm identification (BAI) in stochastic multi-armed bandits in the fixed-confidence setting. Two key metrics for assessing bandit algorithms are computational efficiency and…

Machine Learning · Statistics 2023-06-26 Arpan Mukherjee, Ali Tajer

On Regret with Multiple Best Arms

We study a regret minimization problem with the existence of multiple best/near-optimal arms in the multi-armed bandit setting. We consider the case when the number of arms/actions is comparable to or much larger than the time horizon, and…

Machine Learning · Statistics 2020-10-23 Yinglun Zhu, Robert Nowak

Practical Algorithms for Best-K Identification in Multi-Armed Bandits

In the Best-$K$ identification problem (Best-$K$-Arm), we are given $N$ stochastic bandit arms with unknown reward distributions. Our goal is to identify the $K$ arms with the largest means with high confidence, by drawing samples from the…

Machine Learning · Computer Science 2017-05-22 Haotian Jiang, Jian Li, Mingda Qiao

Optimal Multi-Objective Best Arm Identification with Fixed Confidence

We consider a multi-armed bandit setting with finitely many arms, in which each arm yields an $M$-dimensional vector reward upon selection. We assume that the reward of each dimension (a.k.a. objective) is generated independently of…

Machine Learning · Computer Science 2025-01-24 Zhirui Chen, P. N. Karthik, Yeow Meng Chee, Vincent Y. F. Tan

Mean-based Best Arm Identification in Stochastic Bandits under Reward Contamination

This paper investigates the problem of best arm identification in contaminated stochastic multi-arm bandits. In this setting, the rewards obtained from any arm are replaced by samples from an adversarial model with probability…

Machine Learning · Computer Science 2021-11-16 Arpan Mukherjee, Ali Tajer, Pin-Yu Chen, Payel Das

Simple regret for infinitely many armed bandits

We consider a stochastic bandit problem with infinitely many arms. In this setting, the learner has no chance of trying all the arms even once and has to dedicate its limited number of samples only to a certain number of arms. All previous…

Machine Learning · Computer Science 2015-05-19 Alexandra Carpentier, Michal Valko

Quantile Multi-Armed Bandits: Optimal Best-Arm Identification and a Differentially Private Scheme

We study the best-arm identification problem in multi-armed bandits with stochastic, potentially private rewards, when the goal is to identify the arm with the highest quantile at a fixed, prescribed level. First, we propose a (non-private)…

Machine Learning · Statistics 2022-12-06 Konstantinos E. Nikolakakis, Dionysios S. Kalogerias, Or Sheffet, Anand D. Sarwate

Best Arm Identification with Minimal Regret

Motivated by real-world applications that necessitate responsible experimentation, we introduce the problem of best arm identification (BAI) with minimal regret. This innovative variant of the multi-armed bandit problem elegantly…

Machine Learning · Computer Science 2024-09-30 Junwen Yang, Vincent Y. F. Tan, Tianyuan Jin

Constrained regret minimization for multi-criterion multi-armed bandits

We consider a stochastic multi-armed bandit setting and study the problem of constrained regret minimization over a given time horizon. Each arm is associated with an unknown, possibly multi-dimensional distribution, and the merit of an arm…

Machine Learning · Computer Science 2023-01-05 Anmol Kagrecha, Jayakrishnan Nair, Krishna Jagannathan

Optimal Multi-Fidelity Best-Arm Identification

In bandit best-arm identification, an algorithm is tasked with finding the arm with highest mean reward with a specified accuracy as fast as possible. We study multi-fidelity best-arm identification, in which the algorithm can choose to…

Machine Learning · Computer Science 2025-05-27 Riccardo Poiani, Rémy Degenne, Emilie Kaufmann, Alberto Maria Metelli, Marcello Restelli

A New Look at Dynamic Regret for Non-Stationary Stochastic Bandits

We study the non-stationary stochastic multi-armed bandit problem, where the reward statistics of each arm may change several times during the course of learning. The performance of a learning algorithm is evaluated in terms of its…

Machine Learning · Computer Science 2022-03-09 Yasin Abbasi-Yadkori, Andras Gyorgy, Nevena Lazic

Adaptive Multiple-Arm Identification

We study the problem of selecting $K$ arms with the highest expected rewards in a stochastic $n$-armed bandit game. This problem has a wide range of applications, e.g., A/B testing, crowdsourcing, simulation optimization. Our goal is to…

Machine Learning · Computer Science 2017-06-06 Jiecao Chen, Xi Chen, Qin Zhang, Yuan Zhou

On the Complexity of Best Arm Identification in Multi-Armed Bandit Models

The stochastic multi-armed bandit model is a simple abstraction that has proven useful in many different contexts in statistics and machine learning. Whereas the achievable limit in terms of regret minimization is now well known, our aim is…

Machine Learning · Statistics 2016-11-15 Emilie Kaufmann, Olivier Cappé, Aurélien Garivier

Beyond the Lower Bound: Bridging Regret Minimization and Best Arm Identification in Lexicographic Bandits

In multi-objective decision-making with hierarchical preferences, lexicographic bandits provide a natural framework for optimizing multiple objectives in a prioritized order. In this setting, a learner repeatedly selects arms and observes…

Machine Learning · Computer Science 2025-11-11 Bo Xue, Yuanyu Wan, Zhichao Lu, Qingfu Zhang

Quick Best Action Identification in Linear Bandit Problems

In this paper, we consider a best action identification problem in the stochastic linear bandit setup with a fixed-confidence constraint. In the considered best action identification problem, instead of minimizing the accumulative regret as…

Machine Learning · Computer Science 2018-12-04 Jun Geng, Lifeng Lai

Best of both worlds: Stochastic & adversarial best-arm identification

We study bandit best-arm identification with arbitrary and potentially adversarial rewards. A simple random uniform learner obtains the optimal rate of error in the adversarial scenario. However, this type of strategy is suboptimal when the…

Machine Learning · Statistics 2026-04-17 Yasin Abbasi-Yadkori, Peter L. Bartlett, Victor Gabillon, Alan Malek, Michal Valko