Related papers: Exponential two-armed bandit problem

200 papers

We consider a continuous-time two-armed bandit problem in which incomes are described by Poissonian processes. We develop a Bayesian approach with an arbitrary prior distribution. We present two versions of a recursive equation for determination…

Statistics Theory · Mathematics 2019-07-16 Alexander Kolnogorov

We consider the two-armed bandit problem as applied to data processing if there are two alternative processing methods available with different a priori unknown efficiencies. One should determine the most effective method and provide its…

Statistics Theory · Mathematics 2017-04-13 Alexander V. Kolnogorov

We consider the minimax setup for the Gaussian one-armed bandit problem, i.e. the two-armed bandit problem with Gaussian distributions of incomes and a known distribution corresponding to the first arm. This setup naturally arises when the…

Statistics Theory · Mathematics 2019-01-28 Alexander Kolnogorov

We study a bandit problem where observations from each arm have an exponential family distribution and different arms are assigned independent conjugate priors. At each of n stages, one arm is to be selected based on past observations. The…

Statistics Theory · Mathematics 2011-03-29 Yaming Yu

We obtain the upper bound of the loss function for a strategy in the multi-armed bandit problem with Gaussian distributions of incomes. The considered strategy is an asymptotic generalization of the strategy proposed by J. Bather for the…

Statistics Theory · Mathematics 2019-02-04 Alexander Kolnogorov, Sergey Garbar

We consider bandit problems involving a large (possibly infinite) collection of arms, in which the expected reward of each arm is a linear function of an $r$-dimensional random vector $\mathbf{Z} \in \mathbb{R}^r$, where $r \geq 2$. The…

Machine Learning · Computer Science 2010-02-24 Paat Rusmevichientong, John N. Tsitsiklis

In this report, we survey Bayesian Optimization methods focussed on the Multi-Armed Bandit Problem. We take the help of the paper "Portfolio Allocation for Bayesian Optimization". We report a small literature survey on the acquisition…

Machine Learning · Computer Science 2020-12-16 Abhilash Nandy, Chandan Kumar, Deepak Mewada, Soumya Sharma

In this paper we consider the two-armed bandit problem, which often naturally appears per se or as a subproblem in some multi-armed generalizations, and serves as a starting point for introducing additional problem features. The…

Optimization and Control · Mathematics 2019-06-26 Peter Jacko

Reinforcement learning studies how to balance exploration and exploitation in real-world systems, optimizing interactions with the world while simultaneously learning how the world operates. One general class of algorithms for such learning…

Machine Learning · Statistics 2018-08-10 Iñigo Urteaga, Chris H. Wiggins

We present a two-armed bandit model of decision making under uncertainty where the expected return to investing in the "risky arm" increases when choosing that arm and decreases when choosing the "safe" arm. These dynamics are natural in…

Optimization and Control · Mathematics 2017-03-22 Roland Fryer, Philipp Harms

We study the recovering bandits problem, a variant of the stochastic multi-armed bandit problem where the expected reward of each arm varies according to some unknown function of the time since the arm was last played. While being a natural…

Machine Learning · Statistics 2019-11-01 Ciara Pike-Burke, Steffen Grünewälder

We consider the multi-armed bandit problem in non-stationary environments. Based on the Bayesian method, we propose a variant of Thompson Sampling which can be used in both rested and restless bandit scenarios. Applying discounting to the…

Machine Learning · Statistics 2017-08-01 Vishnu Raj, Sheetal Kalyani

Fixed-budget best-arm identification (BAI) is a bandit problem where the agent maximizes the probability of identifying the optimal arm within a fixed budget of observations. In this work, we study this problem in the Bayesian setting. We…

Machine Learning · Computer Science 2023-06-16 Alexia Atsidakou, Sumeet Katariya, Sujay Sanghavi, Branislav Kveton

The stochastic multi-armed bandit problem is well understood when the reward distributions are sub-Gaussian. In this paper we examine the bandit problem under the weaker assumption that the distributions have moments of order $1+\epsilon$,…

Machine Learning · Statistics 2012-09-11 Sébastien Bubeck, Nicolò Cesa-Bianchi, Gábor Lugosi

This paper revisits the bandit problem in the Bayesian setting. The Bayesian approach formulates the bandit problem as an optimization problem, and the goal is to find the optimal policy which minimizes the Bayesian regret. One of the main…

Optimization and Control · Mathematics 2023-10-03 Yuhua Zhu, Zachary Izzo, Lexing Ying

In this paper, we consider several finite-horizon Bayesian multi-armed bandit problems with side constraints which are computationally intractable (NP-Hard) and for which no optimal (or near optimal) algorithms are known to exist with…

Data Structures and Algorithms · Computer Science 2013-07-18 Sudipto Guha, Kamesh Munagala

This paper investigates the best arm identification (BAI) problem in stochastic multi-armed bandits in the fixed confidence setting. The general class of the exponential family of bandits is considered. The existing algorithms for the…

Machine Learning · Statistics 2023-06-26 Arpan Mukherjee, Ali Tajer

We address the problem of finding the maximizer of a nonlinear smooth function that can only be evaluated point-wise, subject to constraints on the number of permitted function evaluations. This problem is also known as fixed-budget best…

Machine Learning · Statistics 2013-11-12 Matthew W. Hoffman, Bobak Shahriari, Nando de Freitas

Assuming distributions are Gaussian often facilitates computations that are otherwise intractable. We study the performance of an agent that attains a bounded information ratio with respect to a bandit environment with a Gaussian prior…

Machine Learning · Computer Science 2022-02-23 Yueyang Liu, Adithya M. Devraj, Benjamin Van Roy, Kuang Xu

We consider a bandit problem which involves sequential sampling from two populations (arms). Each arm produces a noisy reward realization which depends on an observable random covariate. The goal is to maximize cumulative expected reward.…

Statistics Theory · Mathematics 2010-03-09 Philippe Rigollet, Assaf Zeevi