Related papers: Information-Directed Selection for Top-Two Algorit…

200 papers

While experimental design often focuses on selecting the single best alternative from a finite set (e.g., in ranking and selection or best-arm identification), many pure-exploration problems pursue richer goals. Given a specific goal,…

Machine Learning · Statistics 2025-05-28 Chao Qin, Wei You

Top Two algorithms arose as an adaptation of Thompson sampling to best arm identification in multi-armed bandit models (Russo, 2016), for parametric families of arms. They select the next arm to sample from by randomizing among two…

Machine Learning · Statistics 2022-10-05 Marc Jourdan, Rémy Degenne, Dorian Baudry, Rianne de Heide, Emilie Kaufmann

In the Best-$K$ identification problem (Best-$K$-Arm), we are given $N$ stochastic bandit arms with unknown reward distributions. Our goal is to identify the $K$ arms with the largest means with high confidence, by drawing samples from the…

Machine Learning · Computer Science 2017-05-22 Haotian Jiang, Jian Li, Mingda Qiao

This paper considers the optimal adaptive allocation of measurement effort for identifying the best among a finite set of options or designs. An experimenter sequentially chooses designs to measure and observes noisy signals of their…

Machine Learning · Computer Science 2018-06-11 Daniel Russo

We introduce a simple and efficient algorithm for stochastic linear bandits with finitely many actions that is asymptotically optimal and (nearly) worst-case optimal in finite time. The approach is based on the frequentist…

Machine Learning · Statistics 2021-07-05 Johannes Kirschner, Tor Lattimore, Claire Vernade, Csaba Szepesvári

We consider the problem of the best arm identification in the presence of stochastic constraints, where there is a finite number of arms associated with multiple performance measures. The goal is to identify the arm that optimizes the…

Machine Learning · Computer Science 2025-01-08 Le Yang, Siyang Gao, Cheng Li, Yi Wang

We study the problem of best-arm identification with fixed confidence in stochastic linear bandits. The objective is to identify the best arm with a given level of certainty while minimizing the sampling budget. We devise a simple algorithm…

Machine Learning · Statistics 2020-06-30 Yassir Jedra, Alexandre Proutiere

Top-$2$ methods have become popular in solving the best arm identification (BAI) problem. The best arm, or the arm with the largest mean amongst finitely many, is identified through an algorithm that at any sequential step independently…

Machine Learning · Computer Science 2024-12-17 Agniv Bandyopadhyay, Sandeep Juneja, Shubhada Agrawal

In the Best-$k$-Arm problem, we are given $n$ stochastic bandit arms, each associated with an unknown reward distribution. We are required to identify the $k$ arms with the largest means by taking as few samples as possible. In this paper,…

Machine Learning · Computer Science 2017-02-15 Lijie Chen, Jian Li, Mingda Qiao

We study best arm identification in a variant of the multi-armed bandit problem where the learner has limited precision in arm selection. The learner can only sample arms via certain exploration bundles, which we refer to as boxes. In…

Machine Learning · Computer Science 2023-05-11 Kota Srinivas Reddy, P. N. Karthik, Nikhil Karamchandani, Jayakrishnan Nair

The improving multi-armed bandits problem is a formal model for allocating effort under uncertainty, motivated by scenarios such as investing research effort into new technologies, performing clinical trials, and hyperparameter selection…

Machine Learning · Computer Science 2026-05-22 Avrim Blum, Marten Garicano, Kavya Ravichandran, Dravyansh Sharma

This paper studies the fixed-confidence best arm identification (BAI) problem in the bandit framework in the canonical single-parameter exponential models. For this problem, many policies have been proposed, but most of them require solving…

Machine Learning · Statistics 2025-08-12 Jongyeong Lee, Junya Honda, Masashi Sugiyama

We study the fixed-confidence best-arm identification problem in unimodal bandits, in which the means of the arms increase with the index of the arm up to their maximum, then decrease. We derive two lower bounds on the stopping time of any…

Machine Learning · Computer Science 2025-05-27 Riccardo Poiani, Marc Jourdan, Emilie Kaufmann, Rémy Degenne

In bandit best-arm identification, an algorithm is tasked with finding the arm with highest mean reward with a specified accuracy as fast as possible. We study multi-fidelity best-arm identification, in which the algorithm can choose to…

Machine Learning · Computer Science 2025-05-27 Riccardo Poiani, Rémy Degenne, Emilie Kaufmann, Alberto Maria Metelli, Marcello Restelli

Information-directed sampling (IDS) is a powerful framework for solving bandit problems which has shown strong results in both Bayesian and frequentist settings. However, frequentist IDS, like many other bandit algorithms, requires that one…

Machine Learning · Statistics 2025-03-10 Piotr M. Suder, Eric Laber

We study best-arm identification in stochastic multi-armed bandits under the fixed-confidence setting, focusing on instances with multiple optimal arms. Unlike prior work that addresses the unknown-number-of-optimal-arms case, we consider…

Machine Learning · Computer Science 2026-03-05 Lan V. Truong

The pure-exploration problem in stochastic multi-armed bandits aims to find one or more arms with the largest (or near largest) means. Examples include finding an $\epsilon$-good arm, best-arm identification, top-$k$ arm identification, and…

Machine Learning · Statistics 2020-09-14 Blake Mason, Lalit Jain, Ardhendu Tripathy, Robert Nowak

In pure-exploration problems, information is gathered sequentially to answer a question on the stochastic environment. While best-arm identification for linear bandits has been extensively studied in recent years, few works have been…

Machine Learning · Statistics 2022-06-10 Marc Jourdan, Rémy Degenne

We consider a Bayesian budgeted multi-armed bandit problem, in which each arm consumes a different amount of resources when selected and there is a budget constraint on the total amount of resources that can be used. Budgeted Thompson…

Machine Learning · Computer Science 2024-08-29 Woojin Jeong, Seungki Min

During online decision making in Multi-Armed Bandits (MAB), one needs to conduct inference on the true mean reward of each arm based on data collected so far at each step. However, since the arms are adaptively selected--thereby yielding…

Machine Learning · Computer Science 2021-06-29 Maria Dimakopoulou, Zhimei Ren, Zhengyuan Zhou