Related papers: Max-Utility Based Arm Selection Strategy For Seque…

200 papers

We formulate a multi-armed bandit (MAB) approach to choosing expert policies online in Markov decision processes (MDPs). Given a set of expert policies trained on a state and action space, the goal is to maximize the cumulative reward of…

Systems and Control · Computer Science 2017-07-19 Eric Mazumdar, Roy Dong, Vicenç Rúbies Royo, Claire Tomlin, S. Shankar Sastry

The multi-armed bandit (MAB) model is one of the most classical models to study decision-making in an uncertain environment. In this model, a player chooses one of $K$ possible arms of a bandit machine to play at each time step, where the…

Machine Learning · Computer Science 2023-06-13 Bo Li, Chi Ho Yeung

Although the classical version of the Multi-Armed Bandits (MAB) framework has been applied successfully to several practical problems, in many real-world applications, the possible actions are not presented to the learner simultaneously,…

Machine Learning · Computer Science 2021-10-01 Marco Gabrielli, Francesco Trovò, Manuela Antonelli

Multi-armed bandits (MAB) model sequential decision making problems, in which a learner sequentially chooses arms with unknown reward distributions in order to maximize its cumulative reward. Most of the prior work on MAB assumes that the…

Machine Learning · Computer Science 2018-03-22 Onur Atan, Cem Tekin, Mihaela van der Schaar

We consider a resource-aware variant of the classical multi-armed bandit problem: In each round, the learner selects an arm and determines a resource limit. It then observes a corresponding (random) reward, provided the (random) amount of…

Machine Learning · Computer Science 2022-10-18 Viktor Bengs, Eyke Hüllermeier

The multi-armed bandit (MAB) problem is a simple yet powerful framework that has been extensively studied in the context of decision-making under uncertainty. In many real-world applications, such as robotic applications, selecting an arm…

Machine Learning · Computer Science 2023-03-21 Tianpeng Zhang, Kasper Johansson, Na Li

Personalized recommender systems suffuse modern life, shaping what media we read and what products we consume. Algorithms powering such systems tend to consist of supervised learning-based heuristics, such as latent factor models with a…

Information Retrieval · Computer Science 2023-04-19 Liu Leqi, Giulio Zhou, Fatma Kılınç-Karzan, Zachary C. Lipton, Alan L. Montgomery

This paper is in the field of stochastic Multi-Armed Bandits (MABs), i.e. those sequential selection techniques able to learn online using only the feedback given by the chosen option (a.k.a. $arm$). We study a particular case of the rested…

Machine Learning · Statistics 2024-11-28 Marco Fiandri, Alberto Maria Metelli, Francesco Trovò

We propose an algorithm for next query recommendation in interactive data exploration settings, like knowledge discovery for information gathering. The state-of-the-art query recommendation algorithms are based on sequence-to-sequence…

Information Retrieval · Computer Science 2024-07-08 Shameem A Puthiya Parambath, Christos Anagnostopoulos, Roderick Murray-Smith

Strategic behavior against sequential learning methods, such as "click framing" in real recommendation systems, has been widely observed. Motivated by such behavior, we study the problem of combinatorial multi-armed bandits (CMAB) under…

Machine Learning · Computer Science 2021-11-22 Jing Dong, Ke Li, Shuai Li, Baoxiang Wang

In this paper, we consider a novel variant of the multi-armed bandit (MAB) problem, MAB with cost subsidy, which models many real-life applications where the learning agent has to pay to select an arm and is concerned about optimizing…

Machine Learning · Computer Science 2021-03-16 Deeksha Sinha, Karthik Abinav Sankararaman, Abbas Kazerouni, Vashist Avadhanula

Motivated by distributed selection problems, we formulate a new variant of multi-player multi-armed bandit (MAB) model, which captures stochastic arrival of requests to each arm, as well as the policy of allocating requests to players. The…

Artificial Intelligence · Computer Science 2024-08-21 Hong Xie, Jinyu Mo, Defu Lian, Jie Wang, Enhong Chen

We define a general framework for a large class of combinatorial multi-armed bandit (CMAB) problems, where subsets of base arms with unknown distributions form super arms. In each round, a super arm is played and the base arms contained in…

Machine Learning · Computer Science 2016-03-30 Wei Chen, Yajun Wang, Yang Yuan, Qinshi Wang

Contextual multi-armed bandit (MAB) algorithms have been shown to be promising for maximizing cumulative rewards in sequential decision tasks such as news article recommendation systems, web page ad placement algorithms, and mobile health.…

Machine Learning · Statistics 2019-02-01 Gi-Soo Kim, Myunghee Cho Paik

In this paper, we investigate the impact of diverse user preference on learning under the stochastic multi-armed bandit (MAB) framework. We aim to show that when the user preferences are sufficiently diverse and each arm can be optimal for…

Machine Learning · Computer Science 2022-11-11 Chao Gan, Jing Yang, Ruida Zhou, Cong Shen

We consider the classic online learning and stochastic multi-armed bandit (MAB) problems, when at each step, the online policy can probe and find out which of a small number ($k$) of choices has better reward (or loss) before making its…

Data Structures and Algorithms · Computer Science 2022-11-08 Aditya Bhaskara, Sreenivas Gollapudi, Sungjin Im, Kostas Kollias, Kamesh Munagala

Multi-armed bandit (MAB) algorithms have achieved significant success in sequential decision-making applications, under the premise that humans perfectly implement the recommended policy. However, existing methods often overlook the crucial…

Machine Learning · Statistics 2024-10-07 Changxiao Cai, Jiacheng Zhang

Combinatorial bandits extend the classical bandit framework to settings where the learner selects multiple arms in each round, motivated by applications such as online recommendation and assortment optimization. While extensions of upper…

Machine Learning · Computer Science 2025-10-29 Yuxiao Wen, Yanjun Han, Zhengyuan Zhou

Standard Multi-Armed Bandit (MAB) problems assume that the arms are independent. However, in many application scenarios, the information obtained by playing an arm provides information about the remainder of the arms. Hence, in such…

Machine Learning · Computer Science 2014-10-30 Onur Atan, Cem Tekin, Mihaela van der Schaar

E-commerce sites strive to provide users with the most timely and relevant information in order to reduce shopping friction and increase customer satisfaction. Multi-armed bandit (MAB) models, as a type of adaptive optimization algorithm, provide…

Information Retrieval · Computer Science 2021-08-23 Ding Xiang, Becky West, Jiaqi Wang, Xiquan Cui, Jinzhou Huang