Related papers: Distributed Thompson Sampling

Bayesian Algorithms for Decentralized Stochastic Bandits

We study a decentralized cooperative multi-agent multi-armed bandit problem with $K$ arms and $N$ agents connected over a network. In our model, each arm's reward distribution is the same for all agents, and rewards are drawn independently…

Machine Learning · Statistics 2020-10-29 Anusha Lalitha, Andrea Goldsmith

Thompson Sampling for Budgeted Multi-armed Bandits

Thompson sampling is one of the earliest randomized algorithms for multi-armed bandits (MAB). In this paper, we extend Thompson sampling to Budgeted MAB, where there is a random cost for pulling an arm and the total cost is constrained by…

Machine Learning · Computer Science 2015-05-04 Yingce Xia, Haifang Li, Tao Qin, Nenghai Yu, Tie-Yan Liu

Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits

The study of collaborative multi-agent bandits has attracted significant attention recently. In light of this, we initiate the study of a new collaborative setting, consisting of $N$ agents such that each agent is learning one of $M$…

Machine Learning · Computer Science 2024-07-04 Ronshee Chawla, Daniel Vial, Sanjay Shakkottai, R. Srikant

Thompson Sampling with Virtual Helping Agents

We address the problem of online sequential decision making, i.e., balancing the trade-off between exploiting current knowledge to maximize immediate performance and exploring new information to gain long-term benefits using the…

Machine Learning · Computer Science 2022-09-20 Kartik Anand Pant, Amod Hegde, K. V. Srinivas

Asymptotically Optimal Bandits under Weighted Information

We study the problem of regret minimization in a multi-armed bandit setup where the agent is allowed to play multiple arms at each round by spreading the resources usually allocated to only one arm. At each iteration the agent selects a…

Machine Learning · Computer Science 2021-06-01 Matias I. Müller, Cristian R. Rojas

Social Learning in Multi Agent Multi Armed Bandits

In this paper, we introduce a distributed version of the classical stochastic Multi-Arm Bandit (MAB) problem. Our setting consists of a large number of agents $n$ that collaboratively and simultaneously solve the same instance of $K$-armed…

Machine Learning · Computer Science 2019-11-06 Abishek Sankararaman, Ayalvadi Ganesh, Sanjay Shakkottai

Neural Thompson Sampling

Thompson Sampling (TS) is one of the most effective algorithms for solving contextual multi-armed bandit problems. In this paper, we propose a new algorithm, called Neural Thompson Sampling, which adapts deep neural networks for both…

Machine Learning · Computer Science 2022-01-03 Weitong Zhang, Dongruo Zhou, Lihong Li, Quanquan Gu

Thompson Sampling for Unimodal Bandits

In this paper, we propose a Thompson Sampling algorithm for unimodal bandits, where the expected reward is unimodal over the partially ordered arms. To exploit the unimodal structure better, at each step, instead of exploration from…

Machine Learning · Computer Science 2021-06-17 Long Yang, Zhao Li, Zehong Hu, Shasha Ruan, Shijian Li, Gang Pan, Hongyang Chen

Decentralized Cooperative Stochastic Bandits

We study a decentralized cooperative stochastic multi-armed bandit problem with $K$ arms on a network of $N$ agents. In our model, the reward distribution of each arm is the same for each agent and rewards are drawn independently across…

Machine Learning · Computer Science 2019-10-25 David Martínez-Rubio, Varun Kanade, Patrick Rebeschini

Thompson Sampling for Bandits with Clustered Arms

We propose algorithms based on a multi-level Thompson sampling scheme, for the stochastic multi-armed bandit and its contextual variant with linear expected rewards, in the setting where arms are clustered. We show, both theoretically and…

Machine Learning · Computer Science 2022-06-16 Emil Carlsson, Devdatt Dubhashi, Fredrik D. Johansson

Thompson Sampling for Combinatorial Semi-Bandits

In this paper, we study the application of the Thompson sampling (TS) methodology to the stochastic combinatorial multi-armed bandit (CMAB) framework. We first analyze the standard TS algorithm for the general CMAB model when the outcome…

Machine Learning · Computer Science 2022-06-22 Siwei Wang, Wei Chen

Batched Thompson Sampling for Multi-Armed Bandits

We study Thompson Sampling algorithms for stochastic multi-armed bandits in the batched setting, in which we want to minimize the regret over a sequence of arm pulls using a small number of policy changes (or, batches). We propose two…

Machine Learning · Computer Science 2021-08-17 Nikolai Karpov, Qin Zhang

Analysis of Thompson Sampling for the multi-armed bandit problem

The multi-armed bandit problem is a popular model for studying the exploration/exploitation trade-off in sequential decision problems. Many algorithms are now available for this well-studied problem. One of the earliest algorithms, given by W.…

Machine Learning · Computer Science 2012-04-10 Shipra Agrawal, Navin Goyal

Further Optimal Regret Bounds for Thompson Sampling

Thompson Sampling is one of the oldest heuristics for multi-armed bandit problems. It is a randomized algorithm based on Bayesian ideas, and has recently generated significant interest after several studies demonstrated it to have better…

Machine Learning · Computer Science 2012-09-18 Shipra Agrawal, Navin Goyal

Optimal Regret Bounds for Collaborative Learning in Bandits

We consider regret minimization in a general collaborative multi-agent multi-armed bandit model, in which each agent faces a finite set of arms and may communicate with other agents through a central controller. The optimal arm for each…

Machine Learning · Computer Science 2023-12-18 Amitis Shidani, Sattar Vakili

Collaborative Multi-agent Stochastic Linear Bandits

We study a collaborative multi-agent stochastic linear bandit setting, where $N$ agents that form a network communicate locally to minimize their overall regret. In this setting, each agent has its own linear bandit problem (its own reward…

Machine Learning · Computer Science 2022-05-16 Ahmadreza Moradipari, Mohammad Ghavamzadeh, Mahnoosh Alizadeh

Cooperative Multi-agent Bandits: Distributed Algorithms with Optimal Individual Regret and Constant Communication Costs

Recently, there has been extensive study of cooperative multi-agent multi-armed bandits where a set of distributed agents cooperatively play the same multi-armed bandit game. The goal is to develop bandit algorithms with the optimal group…

Machine Learning · Computer Science 2023-08-09 Lin Yang, Xuchuang Wang, Mohammad Hajiesmaili, Lijun Zhang, John C. S. Lui, Don Towsley

Thompson Sampling Algorithms for Mean-Variance Bandits

The multi-armed bandit (MAB) problem is a classical learning task that exemplifies the exploration-exploitation tradeoff. However, standard formulations do not take into account risk. In online decision making systems, risk is a…

Machine Learning · Computer Science 2020-08-04 Qiuyu Zhu, Vincent Y. F. Tan

Generalized Thompson Sampling for Contextual Bandits

Thompson Sampling, one of the oldest heuristics for solving multi-armed bandits, has recently been shown to demonstrate state-of-the-art performance. The empirical success has led to great interest in theoretical understanding of this…

Machine Learning · Computer Science 2013-10-29 Lihong Li

Robust Multi-Agent Multi-Armed Bandits

Recent works have shown that agents facing independent instances of a stochastic $K$-armed bandit can collaborate to decrease regret. However, these works assume that each agent always recommends their individual best-arm estimates to other…

Machine Learning · Computer Science 2022-03-02 Daniel Vial, Sanjay Shakkottai, R. Srikant