Related papers: Combinatorial Multi-Objective Multi-Armed Bandit P…

Stochastic Multi-Objective Multi-Armed Bandits: Regret Definition and Algorithm

Multi-armed bandit (MAB) problems are widely applied to online optimization tasks that require balancing exploration and exploitation. In practical scenarios, these tasks often involve multiple conflicting objectives, giving rise to…

Machine Learning · Computer Science 2025-06-17 Mansoor Davoodi, Setareh Maghsudi

Multi-objective Contextual Multi-armed Bandit with a Dominant Objective

In this paper, we propose a new multi-objective contextual multi-armed bandit (MAB) problem with two objectives, where one of the objectives dominates the other objective. Unlike single-objective MAB problems in which the learner obtains a…

Machine Learning · Computer Science 2018-06-04 Cem Tekin, Eralp Turgay

Multi-Objective Generalized Linear Bandits

In this paper, we study the multi-objective bandits (MOB) problem, where a learner repeatedly selects one arm to play and then receives a reward vector consisting of multiple objectives. MOB has found many real-world applications as varied…

Machine Learning · Computer Science 2019-05-31 Shiyin Lu, Guanghui Wang, Yao Hu, Lijun Zhang

Adaptive Combinatorial Experimental Design: Pareto Optimality for Decision-Making and Inference

In this paper, we provide the first investigation into adaptive combinatorial experimental design, focusing on the trade-off between regret minimization and statistical power in combinatorial multi-armed bandits (CMAB). While minimizing…

Machine Learning · Computer Science 2026-03-02 Hongrui Xie, Junyu Cao, Kan Xu

Multi-objective Contextual Bandit Problem with Similarity Information

In this paper we propose the multi-objective contextual bandit problem with similarity information. This problem extends the classical contextual bandit problem with similarity information by introducing multiple and possibly conflicting…

Machine Learning · Statistics 2018-03-13 Eralp Turğay, Doruk Öner, Cem Tekin

Combinatorial Allocation Bandits with Nonlinear Arm Utility

A matching platform is a system that matches different types of participants, such as companies and job-seekers. In such a platform, merely maximizing the number of matches can result in matches being concentrated on highly popular…

Machine Learning · Computer Science 2026-03-10 Yuki Shibukawa, Koichi Tanaka, Yuta Saito, Shinji Ito

Combinatorial Bandits under Strategic Manipulations

Strategic behavior against sequential learning methods, such as "click framing" in real recommendation systems, has been widely observed. Motivated by such behavior we study the problem of combinatorial multi-armed bandits (CMAB) under…

Machine Learning · Computer Science 2021-11-22 Jing Dong, Ke Li, Shuai Li, Baoxiang Wang

Provably Efficient Multi-Objective Bandit Algorithms under Preference-Centric Customization

Multi-objective multi-armed bandit (MO-MAB) problems traditionally aim to achieve Pareto optimality. However, real-world scenarios often involve users with varying preferences across objectives, resulting in a Pareto-optimal arm that may…

Machine Learning · Computer Science 2025-11-18 Linfeng Cao, Ming Shi, Ness B. Shroff

Combinatorial Multi-Armed Bandit and Its Extension to Probabilistically Triggered Arms

We define a general framework for a large class of combinatorial multi-armed bandit (CMAB) problems, where subsets of base arms with unknown distributions form super arms. In each round, a super arm is played and the base arms contained in…

Machine Learning · Computer Science 2016-03-30 Wei Chen, Yajun Wang, Yang Yuan, Qinshi Wang

Multi-armed Bandit Learning on a Graph

The multi-armed bandit (MAB) problem is a simple yet powerful framework that has been extensively studied in the context of decision-making under uncertainty. In many real-world applications, such as robotic applications, selecting an arm…

Machine Learning · Computer Science 2023-03-21 Tianpeng Zhang, Kasper Johansson, Na Li

Combinatorial Sleeping Bandits with Fairness Constraints

The multi-armed bandit (MAB) model has been widely adopted for studying many practical optimization problems (network resource allocation, ad placement, crowdsourcing, etc.) with unknown parameters. The goal of the player here is to…

Machine Learning · Computer Science 2019-11-21 Fengjiao Li, Jia Liu, Bo Ji

Combinatorial Network Optimization with Unknown Variables: Multi-Armed Bandits with Linear Rewards

In the classic multi-armed bandits problem, the goal is to have a policy for dynamically operating arms that each yield stochastic rewards with unknown means. The key metric of interest is regret, defined as the gap between the expected…

Optimization and Control · Mathematics 2010-11-23 Yi Gai, Bhaskar Krishnamachari, Rahul Jain

Lexicographic Multiarmed Bandit

We consider a multiobjective multiarmed bandit problem with lexicographically ordered objectives. In this problem, the goal of the learner is to select arms that are lexicographically optimal as often as possible without knowing the arm reward…

Machine Learning · Computer Science 2019-07-30 Alihan Hüyük, Cem Tekin

Multi-Objective Multi-Agent Bandits: From Learning Efficiency to Fairness Optimization

We study multi-objective multi-agent multi-armed bandits (MO-MA-MAB) under stochastic rewards, where agents observe heterogeneous reward vectors and communicate over time-varying graphs. We formulate this emerging problem setting to address…

Machine Learning · Computer Science 2026-05-11 John Wang, Mengfan Xu

Federated Multi-Armed Bandits Under Byzantine Attacks

Multi-armed bandits (MAB) is a sequential decision-making model in which the learner controls the trade-off between exploration and exploitation to maximize its cumulative reward. Federated multi-armed bandits (FMAB) is an emerging…

Machine Learning · Computer Science 2025-02-18 Artun Saday, İlker Demirel, Yiğit Yıldırım, Cem Tekin

Partial Bandit and Semi-Bandit: Making the Most Out of Scarce Users' Feedback

Recent works on Multi-Armed Bandits (MAB) and Combinatorial Multi-Armed Bandits (COM-MAB) show good results on a global accuracy metric. This can be achieved, in the case of recommender systems, with personalization. However, with a…

Machine Learning · Computer Science 2020-09-17 Alexandre Letard, Tassadit Amghar, Olivier Camp, Nicolas Gutowski

Multi-Armed Bandits with Censored Consumption of Resources

We consider a resource-aware variant of the classical multi-armed bandit problem: In each round, the learner selects an arm and determines a resource limit. It then observes a corresponding (random) reward, provided the (random) amount of…

Machine Learning · Computer Science 2022-10-18 Viktor Bengs, Eyke Hüllermeier

Piecewise-Stationary Multi-Objective Multi-Armed Bandit with Application to Joint Communications and Sensing

We study a multi-objective multi-armed bandit problem in a dynamic environment. The problem portrays a decision-maker that sequentially selects an arm from a given set. If selected, each action produces a reward vector, where every element…

Machine Learning · Computer Science 2023-02-14 Amir Rezaei Balef, Setareh Maghsudi

Pareto Regret Analyses in Multi-objective Multi-armed Bandit

We study Pareto optimality in multi-objective multi-armed bandit by providing a formulation of adversarial multi-objective multi-armed bandit and defining its Pareto regrets that can be applied to both stochastic and adversarial settings.…

Machine Learning · Computer Science 2023-06-01 Mengfan Xu, Diego Klabjan

Solving Multi-Arm Bandit Using a Few Bits of Communication

The multi-armed bandit (MAB) problem is an active learning framework that aims to select the best among a set of actions by sequentially observing rewards. Recently, it has become popular for a number of applications over wireless networks,…

Machine Learning · Computer Science 2021-11-12 Osama A. Hanna, Lin F. Yang, Christina Fragouli