Related papers: Sparse Optimistic Information Directed Sampling

Information Directed Sampling for Sparse Linear Bandits

Stochastic sparse linear bandits offer a practical model for high-dimensional online decision-making problems and have a rich information-regret structure. In this work we explore the use of information-directed sampling (IDS), which…

Machine Learning · Statistics 2021-06-01 Botao Hao, Tor Lattimore, Wei Deng

Optimistic Information Directed Sampling

We study the problem of online learning in contextual bandit problems where the loss function is assumed to belong to a known parametric function class. We propose a new analytic framework for this setting that bridges the Bayesian theory…

Machine Learning · Computer Science 2024-06-28 Gergely Neu, Matteo Papini, Ludovic Schwartz

Asymptotically Optimal Information-Directed Sampling

We introduce a simple and efficient algorithm for stochastic linear bandits with finitely many actions that is asymptotically optimal and (nearly) worst-case optimal in finite time. The approach is based on the frequentist…

Machine Learning · Statistics 2021-07-05 Johannes Kirschner, Tor Lattimore, Claire Vernade, Csaba Szepesvári

Learning to Optimize via Information-Directed Sampling

We propose information-directed sampling, a new approach to online optimization problems in which a decision-maker must balance between exploration and exploitation while learning from partial feedback. Each action is sampled in a manner…

Machine Learning · Computer Science 2017-07-10 Daniel Russo, Benjamin Van Roy

Information Directed Sampling for Linear Partial Monitoring

Partial monitoring is a rich framework for sequential decision making under uncertainty that generalizes many well known bandit models, including linear, combinatorial and dueling bandits. We introduce information directed sampling (IDS)…

Machine Learning · Statistics 2020-02-27 Johannes Kirschner, Tor Lattimore, Andreas Krause

Regret Bounds for Information-Directed Reinforcement Learning

Information-directed sampling (IDS) has revealed its potential as a data-efficient algorithm for reinforcement learning (RL). However, theoretical understanding of IDS for Markov Decision Processes (MDPs) is still limited. We develop novel…

Machine Learning · Computer Science 2022-11-28 Botao Hao, Tor Lattimore

Empirical Bound Information-Directed Sampling for Norm-Agnostic Bandits

Information-directed sampling (IDS) is a powerful framework for solving bandit problems which has shown strong results in both Bayesian and frequentist settings. However, frequentist IDS, like many other bandit algorithms, requires that one…

Machine Learning · Statistics 2025-03-10 Piotr M. Suder, Eric Laber

Information-directed sampling for bandits: a primer

The Multi-Armed Bandit problem provides a fundamental framework for analyzing the tension between exploration and exploitation in sequential learning. This paper explores Information Directed Sampling (IDS) policies, a class of heuristics…

Machine Learning · Computer Science 2025-12-24 Annika Hirling, Giorgio Nicoletti, Antonio Celani

Information Directed Sampling and Bandits with Heteroscedastic Noise

In the stochastic bandit problem, the goal is to maximize an unknown function via a sequence of noisy evaluations. Typically, the observation noise is assumed to be independent of the evaluation point and to satisfy a tail bound uniformly…

Machine Learning · Statistics 2018-04-20 Johannes Kirschner, Andreas Krause

Information Directed Sampling for Stochastic Bandits with Graph Feedback

We consider stochastic multi-armed bandit problems with graph feedback, where the decision maker is allowed to observe the neighboring actions of the chosen action. We allow the graph structure to vary with time and consider both…

Machine Learning · Computer Science 2017-11-10 Fang Liu, Swapna Buccapatnam, Ness Shroff

Bayesian Reinforcement Learning via Deep, Sparse Sampling

We address the problem of Bayesian reinforcement learning using efficient model-based online planning. We propose an optimism-free Bayes-adaptive algorithm to induce deeper and sparser exploration with a theoretical bound on its performance…

Machine Learning · Computer Science 2020-06-30 Divya Grover, Debabrota Basu, Christos Dimitrakakis

Contextual Information-Directed Sampling

Information-directed sampling (IDS) has recently demonstrated its potential as a data-efficient reinforcement learning algorithm. However, it is still unclear what is the right form of information ratio to optimize when contextual…

Machine Learning · Computer Science 2022-06-10 Botao Hao, Tor Lattimore, Chao Qin

Adaptation to Easy Data in Prediction with Limited Advice

We derive an online learning algorithm with improved regret guarantees for 'easy' loss sequences. We consider two types of 'easiness': (a) stochastic loss sequences and (b) adversarial loss sequences with small effective range of the…

Machine Learning · Computer Science 2019-08-28 Tobias Sommer Thune, Yevgeny Seldin

On the optimal regret of collaborative personalized linear bandits

Stochastic linear bandits are a fundamental model for sequential decision making, where an agent selects a vector-valued action and receives a noisy reward with expected value given by an unknown linear function. Although well studied in…

Machine Learning · Computer Science 2025-06-23 Bruce Huang, Ruida Zhou, Lin F. Yang, Suhas Diggavi

Learning to Sparsify Stochastic Linear Bandits

This paper addresses the problem of learning to sparsify stochastic linear bandits, where a decision-maker sequentially selects actions from a high-dimensional space subject to a sparsity constraint on the number of nonzero elements in the…

Machine Learning · Computer Science 2026-05-12 Zhengmiao Wang, Ming Chi, Zhi-Wei Liu, Lintao Ye, Carla Fabiana Chiasserini

Information-Based Optimal Subdata Selection for Big Data Linear Regression

Extraordinary amounts of data are being produced in many branches of science. Proven statistical methods are no longer applicable with extraordinarily large data sets due to computational limitations. A critical step in big data analysis is…

Methodology · Statistics 2019-06-27 HaiYing Wang, Min Yang, John Stufken

Best of many worlds: Robust model selection for online supervised learning

We introduce algorithms for online, full-information prediction that are competitive with contextual tree experts of unknown complexity, in both probabilistic and adversarial settings. We show that by incorporating a probabilistic framework…

Machine Learning · Computer Science 2018-05-23 Vidya Muthukumar, Mitas Ray, Anant Sahai, Peter L. Bartlett

Order Optimal Regret Bounds for Sharpe Ratio Optimization under Thompson Sampling

In this paper, we study sequential decision-making for maximizing the Sharpe ratio (SR) in a stochastic multi-armed bandit (MAB) setting. Unlike standard bandit formulations that maximize cumulative reward, SR optimization requires…

Machine Learning · Computer Science 2026-04-02 Mohammad Taha Shah, Sabrina Khurshid, Gourab Ghatak

Regret Minimization and Statistical Inference in Online Decision Making with High-dimensional Covariates

This paper investigates regret minimization, statistical inference, and their interplay in high-dimensional online decision-making based on the sparse linear context bandit model. We integrate the $\varepsilon$-greedy bandit algorithm for…

Machine Learning · Computer Science 2025-05-20 Congyuan Duan, Wanteng Ma, Jiashuo Jiang, Dong Xia

On Adaptivity in Information-constrained Online Learning

We study how to adapt to smoothly-varying ('easy') environments in well-known online learning problems where acquiring information is expensive. For the problem of label efficient prediction, which is a budgeted version of prediction with…

Machine Learning · Computer Science 2019-12-09 Siddharth Mitra, Aditya Gopalan