Related papers: Combinatorial Multi-Objective Multi-Armed Bandit P…

200 papers

Multi-armed bandit (MAB) problems are widely applied to online optimization tasks that require balancing exploration and exploitation. In practical scenarios, these tasks often involve multiple conflicting objectives, giving rise to…

Machine Learning · Computer Science 2025-06-17 Mansoor Davoodi, Setareh Maghsudi

In this paper, we propose a new multi-objective contextual multi-armed bandit (MAB) problem with two objectives, where one of the objectives dominates the other objective. Unlike single-objective MAB problems in which the learner obtains a…

Machine Learning · Computer Science 2018-06-04 Cem Tekin, Eralp Turgay

In this paper, we study the multi-objective bandits (MOB) problem, where a learner repeatedly selects one arm to play and then receives a reward vector consisting of multiple objectives. MOB has found many real-world applications as varied…

Machine Learning · Computer Science 2019-05-31 Shiyin Lu, Guanghui Wang, Yao Hu, Lijun Zhang

In this paper, we provide the first investigation into adaptive combinatorial experimental design, focusing on the trade-off between regret minimization and statistical power in combinatorial multi-armed bandits (CMAB). While minimizing…

Machine Learning · Computer Science 2026-03-02 Hongrui Xie, Junyu Cao, Kan Xu

In this paper we propose the multi-objective contextual bandit problem with similarity information. This problem extends the classical contextual bandit problem with similarity information by introducing multiple and possibly conflicting…

Machine Learning · Statistics 2018-03-13 Eralp Turğay, Doruk Öner, Cem Tekin

A matching platform is a system that matches different types of participants, such as companies and job-seekers. In such a platform, merely maximizing the number of matches can result in matches being concentrated on highly popular…

Machine Learning · Computer Science 2026-03-10 Yuki Shibukawa, Koichi Tanaka, Yuta Saito, Shinji Ito

Strategic behavior against sequential learning methods, such as "click framing" in real recommendation systems, has been widely observed. Motivated by such behavior, we study the problem of combinatorial multi-armed bandits (CMAB) under…

Machine Learning · Computer Science 2021-11-22 Jing Dong, Ke Li, Shuai Li, Baoxiang Wang

Multi-objective multi-armed bandit (MO-MAB) problems traditionally aim to achieve Pareto optimality. However, real-world scenarios often involve users with varying preferences across objectives, resulting in a Pareto-optimal arm that may…

Machine Learning · Computer Science 2025-11-18 Linfeng Cao, Ming Shi, Ness B. Shroff

We define a general framework for a large class of combinatorial multi-armed bandit (CMAB) problems, where subsets of base arms with unknown distributions form super arms. In each round, a super arm is played and the base arms contained in…

Machine Learning · Computer Science 2016-03-30 Wei Chen, Yajun Wang, Yang Yuan, Qinshi Wang

The multi-armed bandit (MAB) problem is a simple yet powerful framework that has been extensively studied in the context of decision-making under uncertainty. In many real-world applications, such as robotic applications, selecting an arm…

Machine Learning · Computer Science 2023-03-21 Tianpeng Zhang, Kasper Johansson, Na Li

The multi-armed bandit (MAB) model has been widely adopted for studying many practical optimization problems (network resource allocation, ad placement, crowdsourcing, etc.) with unknown parameters. The goal of the player here is to…

Machine Learning · Computer Science 2019-11-21 Fengjiao Li, Jia Liu, Bo Ji

In the classic multi-armed bandits problem, the goal is to have a policy for dynamically operating arms that each yield stochastic rewards with unknown means. The key metric of interest is regret, defined as the gap between the expected…

Optimization and Control · Mathematics 2010-11-23 Yi Gai, Bhaskar Krishnamachari, Rahul Jain

We consider a multiobjective multiarmed bandit problem with lexicographically ordered objectives. In this problem, the goal of the learner is to select arms that are lexicographically optimal as much as possible without knowing the arm reward…

Machine Learning · Computer Science 2019-07-30 Alihan Hüyük, Cem Tekin

We study multi-objective multi-agent multi-armed bandits (MO-MA-MAB) under stochastic rewards, where agents observe heterogeneous reward vectors and communicate over time-varying graphs. We formulate this emerging problem setting to address…

Machine Learning · Computer Science 2026-05-11 John Wang, Mengfan Xu

The multi-armed bandit (MAB) is a sequential decision-making model in which the learner controls the trade-off between exploration and exploitation to maximize its cumulative reward. Federated multi-armed bandits (FMAB) is an emerging…

Machine Learning · Computer Science 2025-02-18 Artun Saday, İlker Demirel, Yiğit Yıldırım, Cem Tekin

Recent works on Multi-Armed Bandits (MAB) and Combinatorial Multi-Armed Bandits (COM-MAB) show good results on a global accuracy metric. This can be achieved, in the case of recommender systems, with personalization. However, with a…

Machine Learning · Computer Science 2020-09-17 Alexandre Letard, Tassadit Amghar, Olivier Camp, Nicolas Gutowski

We consider a resource-aware variant of the classical multi-armed bandit problem: In each round, the learner selects an arm and determines a resource limit. It then observes a corresponding (random) reward, provided the (random) amount of…

Machine Learning · Computer Science 2022-10-18 Viktor Bengs, Eyke Hüllermeier

We study a multi-objective multi-armed bandit problem in a dynamic environment. The problem portrays a decision-maker that sequentially selects an arm from a given set. If selected, each action produces a reward vector, where every element…

Machine Learning · Computer Science 2023-02-14 Amir Rezaei Balef, Setareh Maghsudi

We study Pareto optimality in multi-objective multi-armed bandits by providing a formulation of the adversarial multi-objective multi-armed bandit and defining its Pareto regrets, which can be applied to both stochastic and adversarial settings.…

Machine Learning · Computer Science 2023-06-01 Mengfan Xu, Diego Klabjan

The multi-armed bandit (MAB) problem is an active learning framework that aims to select the best among a set of actions by sequentially observing rewards. Recently, it has become popular for a number of applications over wireless networks,…

Machine Learning · Computer Science 2021-11-12 Osama A. Hanna, Lin F. Yang, Christina Fragouli