Related papers: Bayesian Online Model Selection

200 papers

We study bandit model selection in stochastic environments. Our approach relies on a meta-algorithm that selects between candidate base algorithms. We develop a meta-algorithm/base-algorithm abstraction that can work with general classes of…

Machine Learning · Computer Science 2022-12-06 Aldo Pacchiano, My Phan, Yasin Abbasi-Yadkori, Anup Rao, Julian Zimmert, Tor Lattimore, Csaba Szepesvari

We introduce the problem of model selection for contextual bandits, where a learner must adapt to the complexity of the optimal policy while balancing exploration and exploitation. Our main result is a new model selection guarantee for…

Machine Learning · Computer Science 2019-11-15 Dylan J. Foster, Akshay Krishnamurthy, Haipeng Luo

We consider model selection for sequential decision making in stochastic environments with bandit feedback, where a meta-learner has at its disposal a pool of base learners, and decides on the fly which action to take based on the policies…

Machine Learning · Computer Science 2024-01-24 Aldo Pacchiano, Christoph Dann, Claudio Gentile

We consider model selection in stochastic bandit and reinforcement learning problems. Given a set of base learning algorithms, an effective model selection strategy adapts to the best learning algorithm in an online fashion. We show that by…

Machine Learning · Computer Science 2020-06-11 Yasin Abbasi-Yadkori, Aldo Pacchiano, My Phan

We study the problem of $K$-armed dueling bandits in both stochastic and adversarial environments, where the goal of the learner is to aggregate information through relative preferences of pairs of decision points queried in an online…

Machine Learning · Computer Science 2022-02-15 Aadirupa Saha, Pierre Gaillard

We study the problem of model selection in bandit scenarios in the presence of nested policy classes, with the goal of obtaining simultaneous adversarial and stochastic ("best of both worlds") high-probability regret guarantees. Our…

Machine Learning · Computer Science 2022-07-01 Aldo Pacchiano, Christoph Dann, Claudio Gentile

Model selection in the context of bandit optimization is a challenging problem, as it requires balancing exploration and exploitation not only for action selection, but also for model selection. One natural approach is to rely on online…

Machine Learning · Statistics 2023-11-14 Parnian Kassraie, Nicolas Emmenegger, Andreas Krause, Aldo Pacchiano

Combinatorial multi-armed bandits provide a fundamental online decision-making environment where a decision-maker interacts with an environment across $T$ time steps, each time selecting an action and learning the cost of that action. The…

Machine Learning · Computer Science 2026-04-13 Gerdus Benadè, Rathish Das, Thomas Lavastida

We study the problem of online learning in contextual bandit problems where the loss function is assumed to belong to a known parametric function class. We propose a new analytic framework for this setting that bridges the Bayesian theory…

Machine Learning · Computer Science 2024-06-28 Gergely Neu, Matteo Papini, Ludovic Schwartz

We develop a general theory to optimize the frequentist regret for sequential learning problems, where efficient bandit and reinforcement learning algorithms can be derived from unified Bayesian principles. We propose a novel optimization…

Machine Learning · Computer Science 2024-02-12 Yunbei Xu, Assaf Zeevi

We consider a special case of bandit problems, namely batched bandits. Motivated by natural restrictions of recommender systems and e-commerce platforms, we assume that a learning agent observes responses batched in groups over a certain…

Machine Learning · Computer Science 2021-11-04 Danil Provodin, Pratik Gajane, Mykola Pechenizkiy, Maurits Kaptein

We develop a meta-learning framework for simple regret minimization in bandits. In this framework, a learning agent interacts with a sequence of bandit tasks, which are sampled i.i.d. from an unknown prior distribution, and learns its…

Machine Learning · Computer Science 2023-07-06 Mohammadjavad Azizi, Branislav Kveton, Mohammad Ghavamzadeh, Sumeet Katariya

We study how to make decisions that minimize Bayesian regret in offline linear bandits. Prior work suggests that one must take actions with maximum lower confidence bound (LCB) on their reward. We argue that the reliance on LCB is…

Machine Learning · Computer Science 2024-07-04 Marek Petrik, Guy Tennenholtz, Mohammad Ghavamzadeh

In this paper, we study a special bandit setting of online stochastic linear optimization, where only one bit of information is revealed to the learner at each round. This problem has found many applications including online advertisement…

Machine Learning · Computer Science 2015-09-28 Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou

We study model selection in linear bandits, where the learner must adapt to the dimension (denoted by $d_\star$) of the smallest hypothesis class containing the true linear model while balancing exploration and exploitation. Previous papers…

Machine Learning · Statistics 2022-03-17 Yinglun Zhu, Robert Nowak

We consider the classic online learning and stochastic multi-armed bandit (MAB) problems, when at each step, the online policy can probe and find out which of a small number ($k$) of choices has better reward (or loss) before making its…

Data Structures and Algorithms · Computer Science 2022-11-08 Aditya Bhaskara, Sreenivas Gollapudi, Sungjin Im, Kostas Kollias, Kamesh Munagala

We introduce a novel online learning framework that unifies and generalizes pre-established models, such as delayed and corrupted feedback, to encompass adversarial environments where action feedback evolves over time. In this setting, the…

Machine Learning · Computer Science 2024-05-28 Yogev Bar-On, Yishay Mansour

Most bandit algorithm designs are purely theoretical. Therefore, they have strong regret guarantees, but are also often too conservative in practice. In this work, we pioneer the idea of algorithm design by minimizing the empirical Bayes…

Machine Learning · Computer Science 2020-06-12 Chih-Wei Hsu, Branislav Kveton, Ofer Meshi, Martin Mladenov, Csaba Szepesvari

In this paper, we consider the problem of sleeping bandits with stochastic action sets and adversarial rewards. In this setting, in contrast to most work in bandits, the actions may not be available at all times. For instance, some products…

Machine Learning · Computer Science 2020-08-11 Aadirupa Saha, Pierre Gaillard, Michal Valko

Best-of-both-worlds algorithms for online learning which achieve near-optimal regret in both the adversarial and the stochastic regimes have received growing attention recently. Existing techniques often require careful adaptation to every…

Machine Learning · Computer Science 2023-02-21 Christoph Dann, Chen-Yu Wei, Julian Zimmert