Related papers: A Bad Arm Existence Checking Problem

200 papers

We study the problem of identifying the top $m$ arms in a multi-armed bandit game. Our proposed solution relies on a new algorithm based on successive rejects of the seemingly bad arms, and successive accepts of the good ones. This…

Machine Learning · Computer Science 2012-05-16 Sébastien Bubeck, Tengyao Wang, Nitin Viswanathan

We address the M-best-arm identification problem in multi-armed bandits. A player has a limited budget to explore K arms (M<K), and once pulled, each arm yields a reward drawn (independently) from a fixed, unknown distribution. The goal is…

Machine Learning · Statistics 2017-07-11 Shahin Shahrampour, Vahid Tarokh

In the Best-$K$ identification problem (Best-$K$-Arm), we are given $N$ stochastic bandit arms with unknown reward distributions. Our goal is to identify the $K$ arms with the largest means with high confidence, by drawing samples from the…

Machine Learning · Computer Science 2017-05-22 Haotian Jiang, Jian Li, Mingda Qiao

We consider a novel stochastic multi-armed bandit problem called good arm identification (GAI), where a good arm is defined as an arm with expected reward greater than or equal to a given threshold. GAI is a pure-exploration problem…

Best arm identification (BAI) aims to identify the highest-performance arm among a set of $K$ arms by collecting stochastic samples from each arm. In real-world problems, the best arm needs to satisfy additional feasibility constraints.…

Machine Learning · Computer Science 2026-01-26 Ting Cai, Kirthevasan Kandasamy

We consider the best-arm identification problem in multi-armed bandits, which focuses purely on exploration. A player is given a fixed budget to explore a finite set of arms, and the rewards of each arm are drawn independently from a fixed,…

Machine Learning · Statistics 2017-08-02 Shahin Shahrampour, Mohammad Noshad, Vahid Tarokh

The problem of detecting an odd arm from a set of K arms of a multi-armed bandit, with fixed confidence, is studied in a sequential decision-making scenario. Each arm's signal follows a distribution from a vector exponential family. All…

Information Theory · Computer Science 2022-06-14 Gayathri R Prabhu, Srikrishna Bhashyam, Aditya Gopalan, Rajesh Sundaresan

This paper presents an efficient algorithm to solve the sleeping bandit with multiple plays problem in the context of an online recommendation system. The problem involves bounded, adversarial loss and unknown i.i.d. distributions for arm…

Machine Learning · Computer Science 2023-07-28 Jianjun Yuan, Wei Lee Woon, Ludovik Coba

We consider the best arm identification (BAI) problem in the $K$-armed bandit framework with a modification: the agent is allowed to play a subset of arms at each time slot instead of one arm. Consequently, the agent observes the sample…

Machine Learning · Computer Science 2026-01-30 Siddhartha Parupudi, Gourab Ghatak

The pure-exploration problem in stochastic multi-armed bandits aims to find one or more arms with the largest (or near largest) means. Examples include finding an $\epsilon$-good arm, best-arm identification, top-k arm identification, and…

Machine Learning · Statistics 2020-09-14 Blake Mason, Lalit Jain, Ardhendu Tripathy, Robert Nowak

The best arm identification problem in the multi-armed bandit setting is an excellent model of many real-world decision-making problems, yet it fails to capture the fact that in the real world, safety constraints often must be met while…

Machine Learning · Computer Science 2021-11-25 Zhenlin Wang, Andrew Wagenmaker, Kevin Jamieson

The 1-identification problem is a fundamental pure-exploration problem in multi-armed bandits. An agent aims to determine whether there exists an arm whose mean reward exceeds a known threshold $\mu_0$, or to output None otherwise.…

Machine Learning · Computer Science 2026-05-15 Zitian Li, Wang Chi Cheung

We study best arm identification in a variant of the multi-armed bandit problem where the learner has limited precision in arm selection. The learner can only sample arms via certain exploration bundles, which we refer to as boxes. In…

Machine Learning · Computer Science 2023-05-11 Kota Srinivas Reddy, P. N. Karthik, Nikhil Karamchandani, Jayakrishnan Nair

We study the problem of best-arm identification with fixed confidence in stochastic linear bandits. The objective is to identify the best arm with a given level of certainty while minimizing the sampling budget. We devise a simple algorithm…

Machine Learning · Statistics 2020-06-30 Yassir Jedra, Alexandre Proutiere

Consider the problem of best arm identification with a security constraint. Specifically, assume a setup of stochastic linear bandits with $K$ arms of dimension $d$. In each arm pull, the player receives a reward that is the sum of the dot…

Machine Learning · Computer Science 2025-07-29 Asaf Cohen, Onur Günlü

We study the problem of best arm identification with a fairness constraint in a given causal model. The goal is to find a soft intervention on a given node to maximize the outcome while meeting a fairness constraint by counterfactual…

Computers and Society · Computer Science 2021-11-09 Ruijiang Gao, Han Feng

Recently, the multi-armed bandit problem has arisen in many real-life scenarios where arms must be sampled in batches because the agent can wait only a limited time for feedback. Such applications include biological experimentation and online…

Machine Learning · Statistics 2023-12-22 Shengyu Cao, Simai He, Ruoqing Jiang, Jin Xu, Hongsong Yuan

An extension of the traditional two-armed bandit problem is considered, in which the decision maker has access to some side information before deciding which arm to pull. At each time t, before making a selection, the decision maker is able…

Information Theory · Computer Science 2007-07-16 Chih-Chun Wang, Sanjeev R. Kulkarni, H. Vincent Poor

In this work I study the problem of adversarial perturbations to rewards in a multi-armed bandit (MAB) setting. Specifically, I focus on an adversarial attack on a UCB-type best-arm identification policy applied to a stochastic MAB. The…

Machine Learning · Computer Science 2022-09-14 Varsha Pendyala

This paper studies the problem of finding an anomalous arm in a multi-armed bandit when (a) each arm is a finite-state Markov process, and (b) the arms are restless. Here, anomaly means that the transition probability matrix (TPM) of one of…

Information Theory · Computer Science 2021-06-02 P. N. Karthik, Rajesh Sundaresan