Related papers: An optimal algorithm for the Thresholding Bandit P…

200 papers

We address the problem of identifying the optimal policy with a fixed confidence level in a multi-armed bandit setup, when the arms are subject to linear constraints. Unlike the standard best-arm identification problem which is well…

Machine Learning · Computer Science 2024-01-26 Emil Carlsson, Debabrota Basu, Fredrik D. Johansson, Devdatt Dubhashi

We study a novel pure exploration problem: the $\epsilon$-Thresholding Bandit Problem (TBP) with fixed confidence in stochastic linear bandits. We prove a lower bound for the sample complexity and extend an algorithm designed for Best Arm…

Machine Learning · Statistics 2024-02-16 Eduardo Ochoa Rivera, Ambuj Tewari

For the model of constrained multi-armed bandit, we show that by construction there exists an index-based deterministic asymptotically optimal algorithm. The optimality is achieved by the convergence of the probability of choosing an…

Optimization and Control · Mathematics 2020-07-30 Hyeong Soo Chang

We introduce the model selection problem in pure exploration linear bandits, where the learner needs to adapt to the instance-dependent complexity measure of the smallest hypothesis class containing the true model. We design algorithms in…

Machine Learning · Statistics 2022-03-18 Yinglun Zhu, Julian Katz-Samuels, Robert Nowak

We study the problem of learning the exploration-exploitation trade-off in the contextual bandit problem with a linear reward function. In the traditional algorithms that solve the contextual bandit problem, the exploration is a…

Machine Learning · Computer Science 2020-05-06 Djallel Bouneffouf, Emmanuelle Claeys

In a fixed-confidence pure exploration problem in stochastic multi-armed bandits, an algorithm iteratively samples arms and should stop as early as possible and return the correct answer to a query about the arms distributions. We are…

Machine Learning · Computer Science 2025-02-04 Adrienne Tuynman, Rémy Degenne

This paper proposes near-optimal algorithms for the pure-exploration linear bandit problem in the fixed confidence and fixed budget settings. Leveraging ideas from the theory of suprema of empirical processes, we provide an algorithm whose…

Machine Learning · Computer Science 2020-06-23 Julian Katz-Samuels, Lalit Jain, Zohar Karnin, Kevin Jamieson

We investigate an active pure-exploration setting, that includes best-arm identification, in the context of linear stochastic bandits. While asymptotically optimal algorithms exist for standard multi-arm bandits, the existence of such…

Machine Learning · Statistics 2020-07-03 Rémy Degenne, Pierre Ménard, Xuedong Shang, Michal Valko

In multi-armed bandit problems, the typical goal is to identify the arm with the highest reward. This paper explores a threshold-based bandit problem, aiming to select an arm based on its relation to a prescribed threshold $\tau$. We…

Machine Learning · Computer Science 2025-09-03 Chanakya Varude, Jay Chaudhary, Siddharth Kaushik, Prasanna Chaporkar

We propose the first fully-adaptive algorithm for pure exploration in linear bandits---the task to find the arm with the largest expected reward, which depends on an unknown parameter linearly. While existing methods partially or entirely…

Machine Learning · Statistics 2017-10-17 Liyuan Xu, Junya Honda, Masashi Sugiyama

We consider the problem of best arm identification with a fixed budget $T$, in the $K$-armed stochastic bandit setting, with arms distribution defined on $[0,1]$. We prove that any bandit strategy, for at least one bandit…

Machine Learning · Statistics 2016-05-31 Alexandra Carpentier, Andrea Locatelli

We study the problem of best-arm identification with fixed confidence in stochastic linear bandits. The objective is to identify the best arm with a given level of certainty while minimizing the sampling budget. We devise a simple algorithm…

Machine Learning · Statistics 2020-06-30 Yassir Jedra, Alexandre Proutiere

Pure exploration in multi-armed bandits has emerged as an important framework for modeling decision-making and search under uncertainty. In modern applications, however, one is often faced with a tremendously large number of options. Even…

Machine Learning · Computer Science 2022-11-22 Parth K. Thaker, Mohit Malu, Nikhil Rao, Gautam Dasarathy

We consider a constrained, pure exploration, stochastic multi-armed bandit formulation under a fixed budget. Each arm is associated with an unknown, possibly multi-dimensional distribution and is described by multiple attributes that are a…

Machine Learning · Computer Science 2022-11-29 Fathima Zarin Faizal, Jayakrishnan Nair

Making decisions in uncertain environments so as to maximize expected reward while minimizing risk is a ubiquitous problem across many fields. Here, we introduce a novel problem setting in stochastic bandit optimization…

Machine Learning · Computer Science 2025-10-27 Shunta Nonaga, Koji Tabata, Yuta Mizuno, Tamiki Komatsuzaki

We present a new algorithm based on gradient ascent for a general Active Exploration bandit problem in the fixed confidence setting. This problem encompasses several well-studied problems such as Best Arm Identification or…

Machine Learning · Statistics 2019-05-21 Pierre Ménard

The stochastic contextual bandit problem, which models the trade-off between exploration and exploitation, has many real applications, including recommender systems, online advertising and clinical trials. As many other machine learning…

Machine Learning · Statistics 2022-06-14 Qin Ding, Yue Kang, Yi-Wei Liu, Thomas C. M. Lee, Cho-Jui Hsieh, James Sharpnack

We consider the thresholding bandit problem, whose goal is to find arms of mean rewards above a given threshold $\theta$, with a fixed budget of $T$ trials. We introduce LSA, a new, simple and anytime algorithm that aims to minimize the…

Machine Learning · Computer Science 2019-05-28 Chao Tao, Saùl Blanco, Jian Peng, Yuan Zhou

We study the pure exploration problem subject to a matroid constraint (Best-Basis) in a stochastic multi-armed bandit game. In a Best-Basis instance, we are given $n$ stochastic arms with unknown reward distributions, as well as a matroid…

Machine Learning · Computer Science 2016-05-26 Lijie Chen, Anupam Gupta, Jian Li

We consider the combinatorial bandits problem with semi-bandit feedback under finite sampling budget constraints, in which the learner can carry out its action only for a limited number of times specified by an overall budget. The action is…

Machine Learning · Computer Science 2022-10-17 Jasmin Brandt, Viktor Bengs, Björn Haddenhorst, Eyke Hüllermeier