Related papers: Efficient Multivariate Bandit Algorithm with Path …

Multi-armed Bandit Learning on a Graph

The multi-armed bandit (MAB) problem is a simple yet powerful framework that has been extensively studied in the context of decision-making under uncertainty. In many real-world applications, such as robotic applications, selecting an arm…

Machine Learning · Computer Science 2023-03-21 Tianpeng Zhang, Kasper Johansson, Na Li

Thompson Sampling Algorithms for Mean-Variance Bandits

The multi-armed bandit (MAB) problem is a classical learning task that exemplifies the exploration-exploitation tradeoff. However, standard formulations do not take into account risk. In online decision making systems, risk is a…

Machine Learning · Computer Science 2020-08-04 Qiuyu Zhu, Vincent Y. F. Tan

Multi-Armed Bandits-Based Optimization of Decision Trees

Decision trees, without appropriate constraints, can easily become overly complex and prone to overfitting, capturing noise rather than generalizable patterns. To resolve this problem, the pruning operation is a crucial part in optimizing decision…

Machine Learning · Computer Science 2025-08-11 Hasibul Karim Shanto, Umme Ayman Koana, Shadikur Rahman

Optimal Regret Analysis of Thompson Sampling in Stochastic Multi-armed Bandit Problem with Multiple Plays

We discuss a multiple-play multi-armed bandit (MAB) problem in which several arms are selected at each round. Recently, Thompson sampling (TS), a randomized algorithm with a Bayesian spirit, has attracted much attention for its empirically…

Machine Learning · Statistics 2019-03-22 Junpei Komiyama, Junya Honda, Hiroshi Nakagawa

Multi-agent Multi-armed Bandits with Stochastic Sharable Arm Capacities

Motivated by distributed selection problems, we formulate a new variant of multi-player multi-armed bandit (MAB) model, which captures stochastic arrival of requests to each arm, as well as the policy of allocating requests to players. The…

Artificial Intelligence · Computer Science 2024-08-21 Hong Xie, Jinyu Mo, Defu Lian, Jie Wang, Enhong Chen

Forced Exploration in Bandit Problems

The multi-armed bandit (MAB) is a classical sequential decision problem. Most work requires assumptions about the reward distribution (e.g., bounded), while practitioners may have difficulty obtaining information about these distributions to…

Machine Learning · Computer Science 2023-12-14 Han Qi, Fei Guo, Li Zhu

Optimal Algorithms for Range Searching over Multi-Armed Bandits

This paper studies a multi-armed bandit (MAB) version of the range-searching problem. In its basic form, range searching considers as input a set of points (on the real line) and a collection of (real) intervals. Here, with each specified…

Machine Learning · Computer Science 2021-05-05 Siddharth Barman, Ramakrishnan Krishnamurthy, Saladi Rahul

Global Bandits

Multi-armed bandits (MAB) model sequential decision making problems, in which a learner sequentially chooses arms with unknown reward distributions in order to maximize its cumulative reward. Most of the prior work on MAB assumes that the…

Machine Learning · Computer Science 2018-03-22 Onur Atan, Cem Tekin, Mihaela van der Schaar

Achieving PAC Guarantees in Mechanism Design through Multi-Armed Bandits

We analytically derive a class of optimal solutions to a linear program (LP) for automated mechanism design that satisfies efficiency, incentive compatibility, strong budget balance (SBB), and individual rationality (IR), where SBB and IR…

Computer Science and Game Theory · Computer Science 2025-05-20 Takayuki Osogami, Hirota Kinoshita, Segev Wasserkrug

Thompson Sampling on Asymmetric $\alpha$-Stable Bandits

In algorithm optimization for reinforcement learning, how to deal with the exploration-exploitation dilemma is particularly important. The multi-armed bandit problem can optimize the proposed solutions by changing the reward distribution to…

Machine Learning · Statistics 2022-03-28 Zhendong Shi, Ercan E. Kuruoglu, Xiaoli Wei

Extreme Value Monte Carlo Tree Search for Classical Planning

Despite being successful in board games and reinforcement learning (RL), Monte Carlo Tree Search (MCTS) combined with Multi Armed Bandits (MABs) has seen limited success in domain-independent classical planning until recently. Previous work…

Artificial Intelligence · Computer Science 2026-03-30 Masataro Asai, Stephen Wissow

Dynamic Multi-Arm Bandit Game Based Multi-Agents Spectrum Sharing Strategy Design

For a wireless avionics communication system, a Multi-arm bandit game is mathematically formulated, which includes channel states, strategies, and rewards. The simple case includes only two agents sharing the spectrum which is fully studied…

Signal Processing · Electrical Eng. & Systems 2017-11-15 Jingyang Lu, Lun Li, Dan Shen, Genshe Chen, Bin Jia, Erik Blasch, Khanh Pham

Understanding the stochastic dynamics of sequential decision-making processes: A path-integral analysis of multi-armed bandits

The multi-armed bandit (MAB) model is one of the most classical models to study decision-making in an uncertain environment. In this model, a player chooses one of $K$ possible arms of a bandit machine to play at each time step, where the…

Machine Learning · Computer Science 2023-06-13 Bo Li, Chi Ho Yeung

An Asymptotically Optimal Strategy for Constrained Multi-armed Bandit Problems

For the stochastic multi-armed bandit (MAB) problem from a constrained model that generalizes the classical one, we show that an asymptotic optimality is achievable by a simple strategy extended from the $\epsilon_t$-greedy strategy. We…

Optimization and Control · Mathematics 2018-05-04 Hyeong Soo Chang

Lenient Regret for Multi-Armed Bandits

We consider the Multi-Armed Bandit (MAB) problem, where an agent sequentially chooses actions and observes rewards for the actions it took. While the majority of algorithms try to minimize the regret, i.e., the cumulative difference between…

Machine Learning · Computer Science 2021-09-14 Nadav Merlis, Shie Mannor

Combinatorial Multi-armed Bandits for Real-Time Strategy Games

Games with large branching factors pose a significant challenge for game tree search algorithms. In this paper, we address this problem with a sampling strategy for Monte Carlo Tree Search (MCTS) algorithms called naïve sampling,…

Artificial Intelligence · Computer Science 2017-10-16 Santiago Ontañón

Solving Inverse Problem for Multi-armed Bandits via Convex Optimization

We consider the inverse problem of multi-armed bandits (IMAB) that are widely used in neuroscience and psychology research for behavior modelling. We first show that the IMAB problem is not convex in general, but can be relaxed to a convex…

Computational Engineering, Finance, and Science · Computer Science 2025-06-27 Hao Zhu, Joschka Boedecker

Thompson Sampling with Virtual Helping Agents

We address the problem of online sequential decision making, i.e., balancing the trade-off between exploiting the current knowledge to maximize immediate performance and exploring the new information to gain long-term benefits using the…

Machine Learning · Computer Science 2022-09-20 Kartik Anand Pant, Amod Hegde, K. V. Srinivas

Multi-armed Bandits with Cost Subsidy

In this paper, we consider a novel variant of the multi-armed bandit (MAB) problem, MAB with cost subsidy, which models many real-life applications where the learning agent has to pay to select an arm and is concerned about optimizing…

Machine Learning · Computer Science 2021-03-16 Deeksha Sinha, Karthik Abinav Sankararaman, Abbas Kazerouni, Vashist Avadhanula

Survival Multiarmed Bandits with Bootstrapping Methods

The Multiarmed Bandits (MAB) problem has been extensively studied and has seen many practical applications in a variety of fields. The Survival Multiarmed Bandits (S-MAB) open problem is an extension which constrains an agent to a budget…

Machine Learning · Computer Science 2024-11-06 Peter Veroutis, Frédéric Godin