Related papers: An optimal algorithm for the Thresholding Bandit P…

Pure Exploration in Bandits with Linear Constraints

We address the problem of identifying the optimal policy with a fixed confidence level in a multi-armed bandit setup, when \emph{the arms are subject to linear constraints}. Unlike the standard best-arm identification problem which is well…

Machine Learning · Computer Science 2024-01-26 Emil Carlsson, Debabrota Basu, Fredrik D. Johansson, Devdatt Dubhashi

Optimal Thresholding Linear Bandit

We study a novel pure exploration problem: the $\epsilon$-Thresholding Bandit Problem (TBP) with fixed confidence in stochastic linear bandits. We prove a lower bound for the sample complexity and extend an algorithm designed for Best Arm…

Machine Learning · Statistics 2024-02-16 Eduardo Ochoa Rivera, Ambuj Tewari

An Index-based Deterministic Asymptotically Optimal Algorithm for Constrained Multi-armed Bandit Problems

For the model of constrained multi-armed bandit, we show that by construction there exists an index-based deterministic asymptotically optimal algorithm. The optimality is achieved by the convergence of the probability of choosing an…

Optimization and Control · Mathematics 2020-07-30 Hyeong Soo Chang

Near Instance Optimal Model Selection for Pure Exploration Linear Bandits

We introduce the model selection problem in pure exploration linear bandits, where the learner needs to adapt to the instance-dependent complexity measure of the smallest hypothesis class containing the true model. We design algorithms in…

Machine Learning · Statistics 2022-03-18 Yinglun Zhu, Julian Katz-Samuels, Robert Nowak

Hyper-parameter Tuning for the Contextual Bandit

We study here the problem of learning the exploration-exploitation trade-off in the contextual bandit problem with linear reward function setting. In the traditional algorithms that solve the contextual bandit problem, the exploration is a…

Machine Learning · Computer Science 2020-05-06 Djallel Bouneffouf, Emmanuelle Claeys

The Batch Complexity of Bandit Pure Exploration

In a fixed-confidence pure exploration problem in stochastic multi-armed bandits, an algorithm iteratively samples arms and should stop as early as possible and return the correct answer to a query about the arms distributions. We are…

Machine Learning · Computer Science 2025-02-04 Adrienne Tuynman, Rémy Degenne

An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits

This paper proposes near-optimal algorithms for the pure-exploration linear bandit problem in the fixed confidence and fixed budget settings. Leveraging ideas from the theory of suprema of empirical processes, we provide an algorithm whose…

Machine Learning · Computer Science 2020-06-23 Julian Katz-Samuels, Lalit Jain, Zohar Karnin, Kevin Jamieson

Gamification of Pure Exploration for Linear Bandits

We investigate an active pure-exploration setting, that includes best-arm identification, in the context of linear stochastic bandits. While asymptotically optimal algorithms exist for standard multi-arm bandits, the existence of such…

Machine Learning · Statistics 2020-07-03 Rémy Degenne, Pierre Ménard, Xuedong Shang, Michal Valko

Threshold-Based Optimal Arm Selection in Monotonic Bandits: Regret Lower Bounds and Algorithms

In multi-armed bandit problems, the typical goal is to identify the arm with the highest reward. This paper explores a threshold-based bandit problem, aiming to select an arm based on its relation to a prescribed threshold $\tau$. We…

Machine Learning · Computer Science 2025-09-03 Chanakya Varude, Jay Chaudhary, Siddharth Kaushik, Prasanna Chaporkar

Fully adaptive algorithm for pure exploration in linear bandits

We propose the first fully-adaptive algorithm for pure exploration in linear bandits---the task to find the arm with the largest expected reward, which depends on an unknown parameter linearly. While existing methods partially or entirely…

Machine Learning · Statistics 2017-10-17 Liyuan Xu, Junya Honda, Masashi Sugiyama

Tight (Lower) Bounds for the Fixed Budget Best Arm Identification Bandit Problem

We consider the problem of \textit{best arm identification} with a \textit{fixed budget $T$}, in the $K$-armed stochastic bandit setting, with arms distribution defined on $[0,1]$. We prove that any bandit strategy, for at least one bandit…

Machine Learning · Statistics 2016-05-31 Alexandra Carpentier, Andrea Locatelli

Optimal Best-arm Identification in Linear Bandits

We study the problem of best-arm identification with fixed confidence in stochastic linear bandits. The objective is to identify the best arm with a given level of certainty while minimizing the sampling budget. We devise a simple algorithm…

Machine Learning · Statistics 2020-06-30 Yassir Jedra, Alexandre Proutiere

Maximizing and Satisficing in Multi-armed Bandits with Graph Information

Pure exploration in multi-armed bandits has emerged as an important framework for modeling decision-making and search under uncertainty. In modern applications, however, one is often faced with a tremendously large number of options. Even…

Machine Learning · Computer Science 2022-11-22 Parth K. Thaker, Mohit Malu, Nikhil Rao, Gautam Dasarathy

Constrained Pure Exploration Multi-Armed Bandits with a Fixed Budget

We consider a constrained, pure exploration, stochastic multi-armed bandit formulation under a fixed budget. Each arm is associated with an unknown, possibly multi-dimensional distribution and is described by multiple attributes that are a…

Machine Learning · Computer Science 2022-11-29 Fathima Zarin Faizal, Jayakrishnan Nair

Risk-Averse Best Arm Set Identification with Fixed Budget and Fixed Confidence

Decision making under uncertain environments in the maximization of expected reward while minimizing its risk is one of the ubiquitous problems in many subjects. Here, we introduce a novel problem setting in stochastic bandit optimization…

Machine Learning · Computer Science 2025-10-27 Shunta Nonaga, Koji Tabata, Yuta Mizuno, Tamiki Komatsuzaki

Gradient Ascent for Active Exploration in Bandit Problems

We present a new algorithm based on a gradient ascent for a general Active Exploration bandit problem in the fixed confidence setting. This problem encompasses several well studied problems such as the Best Arm Identification or…

Machine Learning · Statistics 2019-05-21 Pierre Ménard

Syndicated Bandits: A Framework for Auto Tuning Hyper-parameters in Contextual Bandit Algorithms

The stochastic contextual bandit problem, which models the trade-off between exploration and exploitation, has many real applications, including recommender systems, online advertising and clinical trials. As many other machine learning…

Machine Learning · Statistics 2022-06-14 Qin Ding, Yue Kang, Yi-Wei Liu, Thomas C. M. Lee, Cho-Jui Hsieh, James Sharpnack

Thresholding Bandit with Optimal Aggregate Regret

We consider the thresholding bandit problem, whose goal is to find arms of mean rewards above a given threshold $\theta$, with a fixed budget of $T$ trials. We introduce LSA, a new, simple and anytime algorithm that aims to minimize the…

Machine Learning · Computer Science 2019-05-28 Chao Tao, Saùl Blanco, Jian Peng, Yuan Zhou

Pure Exploration of Multi-armed Bandit Under Matroid Constraints

We study the pure exploration problem subject to a matroid constraint (Best-Basis) in a stochastic multi-armed bandit game. In a Best-Basis instance, we are given $n$ stochastic arms with unknown reward distributions, as well as a matroid…

Machine Learning · Computer Science 2016-05-26 Lijie Chen, Anupam Gupta, Jian Li

Finding Optimal Arms in Non-stochastic Combinatorial Bandits with Semi-bandit Feedback and Finite Budget

We consider the combinatorial bandits problem with semi-bandit feedback under finite sampling budget constraints, in which the learner can carry out its action only for a limited number of times specified by an overall budget. The action is…

Machine Learning · Computer Science 2022-10-17 Jasmin Brandt, Viktor Bengs, Björn Haddenhorst, Eyke Hüllermeier