Related papers: Approximation Algorithms for Bayesian Multi-Armed …

200 papers

The restless bandit problem is one of the most well-studied generalizations of the celebrated stochastic multi-armed bandit problem in decision theory. In its ultimate generality, the restless bandit problem is known to be PSPACE-Hard to…

Data Structures and Algorithms · Computer Science 2009-02-03 Sudipto Guha, Kamesh Munagala, Peng Shi

Restless bandits are an important class of problems with applications in recommender systems, active learning, revenue management and other areas. We consider infinite-horizon discounted restless bandits with many arms where a fixed…

Machine Learning · Computer Science 2022-03-31 Xiangyu Zhang, Peter I. Frazier

We address the problem of identifying the optimal policy with a fixed confidence level in a multi-armed bandit setup, when the arms are subject to linear constraints. Unlike the standard best-arm identification problem which is well…

Machine Learning · Computer Science 2024-01-26 Emil Carlsson, Debabrota Basu, Fredrik D. Johansson, Devdatt Dubhashi

We investigate the adversarial bandit problem with multiple plays under semi-bandit feedback. We introduce a highly efficient algorithm that asymptotically achieves the performance of the best switching $m$-arm strategy with minimax optimal…

Machine Learning · Computer Science 2019-12-02 N. Mert Vural, Hakan Gokcesu, Kaan Gokcesu, Suleyman S. Kozat

We consider the Max $K$-Armed Bandit problem, where a learning agent is faced with several sources (arms) of items (rewards), and interested in finding the best item overall. At each time step the agent chooses an arm, and obtains a random…

Machine Learning · Statistics 2015-08-25 Yahel David, Nahum Shimkin

In this report, we survey Bayesian Optimization methods focussed on the Multi-Armed Bandit Problem. We take the help of the paper "Portfolio Allocation for Bayesian Optimization". We report a small literature survey on the acquisition…

Machine Learning · Computer Science 2020-12-16 Abhilash Nandy, Chandan Kumar, Deepak Mewada, Soumya Sharma

We study the multi-armed bandit problem with arms which are Markov chains with rewards. In the finite-horizon setting, the celebrated Gittins indices do not apply, and the exact solution is intractable. We provide approximation algorithms…

Data Structures and Algorithms · Computer Science 2016-09-14 Will Ma

In the classic Bayesian restless multi-armed bandit (RMAB) problem, there are $N$ arms, with rewards on all arms evolving at each time as Markov chains with known parameters. A player seeks to activate $K \geq 1$ arms at each time in order…

Optimization and Control · Mathematics 2011-12-25 Wenhan Dai, Yi Gai, Bhaskar Krishnamachari, Qing Zhao

We give nearly-tight upper and lower bounds for the improving multi-armed bandits problem. An instance of this problem has $k$ arms, each of whose reward function is a concave and increasing function of the number of times that arm has been…

Machine Learning · Computer Science 2024-04-02 Avrim Blum, Kavya Ravichandran

Multi-player multi-armed bandit is an increasingly relevant decision-making problem, motivated by applications to cognitive radio systems. Most research on this problem focuses exclusively on settings in which players have full…

Machine Learning · Computer Science 2022-12-14 Guojun Xiong, Jian Li

In this paper, we introduce a multi-armed bandit problem termed max-min grouped bandits, in which the arms are arranged in possibly-overlapping groups, and the goal is to find the group whose worst arm has the highest mean reward. This…

Machine Learning · Statistics 2022-03-16 Zhenlin Wang, Jonathan Scarlett

In the classic Bayesian restless multi-armed bandit (RMAB) problem, there are $N$ arms, with rewards on all arms evolving at each time as Markov chains with known parameters. A player seeks to activate $K \geq 1$ arms at each time in order…

Optimization and Control · Mathematics 2010-11-23 Wenhan Dai, Yi Gai, Bhaskar Krishnamachari, Qing Zhao

We consider the Max $K$-Armed Bandit problem, where a learning agent is faced with several stochastic arms, each a source of i.i.d. rewards of unknown distribution. At each time step the agent chooses an arm, and observes the reward of the…

Machine Learning · Statistics 2015-12-25 Yahel David, Nahum Shimkin

This paper presents an efficient algorithm to solve the sleeping bandit with multiple plays problem in the context of an online recommendation system. The problem involves bounded, adversarial loss and unknown i.i.d. distributions for arm…

Machine Learning · Computer Science 2023-07-28 Jianjun Yuan, Wei Lee Woon, Ludovik Coba

This paper introduces a general multi-agent bandit model in which each agent is facing a finite set of arms and may communicate with other agents through a central controller in order to identify, in pure exploration, or play, in regret…

Machine Learning · Computer Science 2022-10-31 Clémence Réda, Sattar Vakili, Emilie Kaufmann

We study a grouped bandit setting where each arm comprises multiple independent sub-arms referred to as attributes. Each attribute of each arm has an independent stochastic reward. We impose the constraint that for an arm to be deemed…

Machine Learning · Computer Science 2024-12-12 Sahil Dharod, Malyala Preethi Sravani, Sakshi Heda, Sharayu Moharir

We consider a novel stochastic multi-armed bandit setting, where playing an arm makes it unavailable for a fixed number of time slots thereafter. This models situations where reusing an arm too often is undesirable (e.g. making the same…

Machine Learning · Computer Science 2024-07-31 Soumya Basu, Rajat Sen, Sujay Sanghavi, Sanjay Shakkottai

This paper revisits the bandit problem in the Bayesian setting. The Bayesian approach formulates the bandit problem as an optimization problem, and the goal is to find the optimal policy which minimizes the Bayesian regret. One of the main…

Optimization and Control · Mathematics 2023-10-03 Yuhua Zhu, Zachary Izzo, Lexing Ying

The Greedy algorithm is the simplest heuristic for sequential decision problems: it carelessly takes the locally optimal choice at each round, disregarding any advantages of exploring and/or information gathering. Theoretically, it is known…

Machine Learning · Computer Science 2021-01-05 Matthieu Jedor, Jonathan Louëdec, Vianney Perchet

We consider the problem of near-optimal arm identification in the fixed confidence setting of the infinitely armed bandit problem when nothing is known about the arm reservoir distribution. We (1) introduce a PAC-like framework within which…

Machine Learning · Statistics 2018-05-22 Maryam Aziz, Jesse Anderton, Emilie Kaufmann, Javed Aslam