Related papers: Variable Selection via Thompson Sampling

A study of Thompson Sampling with Parameter h

The Thompson Sampling algorithm is a well-known Bayesian algorithm for solving the stochastic multi-armed bandit problem. At each time step, the algorithm chooses each arm with probability proportional to the probability that it is the current best arm. We modify the…

Machine Learning · Computer Science 2017-10-09 Qiang Ha

Thompson Sampling on Asymmetric $\alpha$-Stable Bandits

In algorithm optimization for reinforcement learning, dealing with the exploration-exploitation dilemma is particularly important. The multi-armed bandit problem can optimize the proposed solutions by changing the reward distribution to…

Machine Learning · Statistics 2022-03-28 Zhendong Shi, Ercan E. Kuruoglu, Xiaoli Wei

Thompson Sampling for Unsupervised Sequential Selection

Thompson Sampling has generated significant interest due to its better empirical performance than upper confidence bound based algorithms. In this paper, we study a Thompson Sampling based algorithm for Unsupervised Sequential Selection (USS)…

Machine Learning · Computer Science 2020-09-17 Arun Verma, Manjesh K. Hanawal, Nandyala Hemachandra

Thompson Sampling on Symmetric $\alpha$-Stable Bandits

Thompson Sampling provides an efficient technique to introduce prior knowledge in the multi-armed bandit problem, along with providing remarkable empirical performance. In this paper, we revisit the Thompson Sampling algorithm under rewards…

Machine Learning · Computer Science 2019-12-09 Abhimanyu Dubey, Alex Pentland

Thompson Sampling in Switching Environments with Bayesian Online Change Point Detection

Thompson Sampling has recently been shown to be optimal in the Bernoulli Multi-Armed Bandit setting [Kaufmann et al., 2012]. This bandit problem assumes stationary distributions for the rewards. It is often unrealistic to model the real…

Machine Learning · Computer Science 2013-02-18 Joseph Mellor, Jonathan Shapiro

Thompson Sampling via Local Uncertainty

Thompson sampling is an efficient algorithm for sequential decision making, which exploits the posterior uncertainty to address the exploration-exploitation dilemma. There has been significant recent interest in integrating Bayesian neural…

Machine Learning · Statistics 2020-08-07 Zhendong Wang, Mingyuan Zhou

Fast, Precise Thompson Sampling for Bayesian Optimization

Thompson sampling (TS) has optimal regret and excellent empirical performance in multi-armed bandit problems. Yet, in Bayesian optimization, TS underperforms popular acquisition functions (e.g., EI, UCB). TS samples arms according to the…

Machine Learning · Statistics 2024-12-02 David Sweet

Thompson Sampling with Virtual Helping Agents

We address the problem of online sequential decision making, i.e., balancing the trade-off between exploiting the current knowledge to maximize immediate performance and exploring the new information to gain long-term benefits using the…

Machine Learning · Computer Science 2022-09-20 Kartik Anand Pant, Amod Hegde, K. V. Srinivas

Thompson Sampling for Bandits with Clustered Arms

We propose algorithms based on a multi-level Thompson sampling scheme, for the stochastic multi-armed bandit and its contextual variant with linear expected rewards, in the setting where arms are clustered. We show, both theoretically and…

Machine Learning · Computer Science 2022-06-16 Emil Carlsson, Devdatt Dubhashi, Fredrik D. Johansson

Taming Non-stationary Bandits: A Bayesian Approach

We consider the multi-armed bandit problem in non-stationary environments. Based on the Bayesian method, we propose a variant of Thompson Sampling which can be used in both rested and restless bandit scenarios. Applying discounting to the…

Machine Learning · Statistics 2017-08-01 Vishnu Raj, Sheetal Kalyani

Thompson Sampling for Contextual Bandits with Linear Payoffs

Thompson Sampling is one of the oldest heuristics for multi-armed bandit problems. It is a randomized algorithm based on Bayesian ideas, and has recently generated significant interest after several studies demonstrated it to have better…

Machine Learning · Computer Science 2014-02-04 Shipra Agrawal, Navin Goyal

Improving Thompson Sampling via Information Relaxation for Budgeted Multi-armed Bandits

We consider a Bayesian budgeted multi-armed bandit problem, in which each arm consumes a different amount of resources when selected and there is a budget constraint on the total amount of resources that can be used. Budgeted Thompson…

Machine Learning · Computer Science 2024-08-29 Woojin Jeong, Seungki Min

Variational Bayesian Optimistic Sampling

We consider online sequential decision problems where an agent must balance exploration and exploitation. We derive a set of Bayesian 'optimistic' policies which, in the stochastic multi-armed bandit case, includes the Thompson sampling…

Machine Learning · Statistics 2021-11-01 Brendan O'Donoghue, Tor Lattimore

Neural Thompson Sampling

Thompson Sampling (TS) is one of the most effective algorithms for solving contextual multi-armed bandit problems. In this paper, we propose a new algorithm, called Neural Thompson Sampling, which adapts deep neural networks for both…

Machine Learning · Computer Science 2022-01-03 Weitong Zhang, Dongruo Zhou, Lihong Li, Quanquan Gu

Analysis of Thompson Sampling for Partially Observable Contextual Multi-Armed Bandits

Contextual multi-armed bandits are classical models in reinforcement learning for sequential decision-making associated with individual information. A widely-used policy for bandits is Thompson Sampling, where samples from a data-driven…

Machine Learning · Statistics 2021-11-30 Hongju Park, Mohamad Kazem Shirani Faradonbeh

Thompson sampling with the online bootstrap

Thompson sampling provides a solution to bandit problems in which new observations are allocated to arms with the posterior probability that an arm is optimal. While sometimes easy to implement and asymptotically optimal, Thompson sampling…

Machine Learning · Computer Science 2014-10-16 Dean Eckles, Maurits Kaptein

Thompson Sampling Algorithms for Mean-Variance Bandits

The multi-armed bandit (MAB) problem is a classical learning task that exemplifies the exploration-exploitation tradeoff. However, standard formulations do not take into account risk. In online decision making systems, risk is a…

Machine Learning · Computer Science 2020-08-04 Qiuyu Zhu, Vincent Y. F. Tan

Optimal Regret Analysis of Thompson Sampling in Stochastic Multi-armed Bandit Problem with Multiple Plays

We discuss a multiple-play multi-armed bandit (MAB) problem in which several arms are selected at each round. Recently, Thompson sampling (TS), a randomized algorithm with a Bayesian spirit, has attracted much attention for its empirically…

Machine Learning · Statistics 2019-03-22 Junpei Komiyama, Junya Honda, Hiroshi Nakagawa

Variational inference for the multi-armed contextual bandit

In many biomedical, science, and engineering problems, one must sequentially decide which action to take next so as to maximize rewards. One general class of algorithms for optimizing interactions with the world, while simultaneously…

Machine Learning · Statistics 2021-05-05 Iñigo Urteaga, Chris H. Wiggins

TSEC: a framework for online experimentation under experimental constraints

Thompson sampling is a popular algorithm for solving multi-armed bandit problems, and has been applied in a wide range of applications, from website design to portfolio optimization. In such applications, however, the number of choices (or…

Methodology · Statistics 2021-01-19 Simon Mak, Yuanshuo Zhou, Lavonne Hoang, C. F. Jeff Wu