Related papers: A note on continuous-time online learning

A continuous-time approach to online optimization

We consider a family of learning strategies for online optimization problems that evolve in continuous time and we show that they lead to no regret. From a more traditional, discrete-time viewpoint, this continuous-time approach allows us…

Optimization and Control · Mathematics 2014-02-28 Joon Kwon, Panayotis Mertikopoulos

Efficient Optimal Learning for Contextual Bandits

We address the problem of learning in an online setting where the learner repeatedly observes features, selects among a set of actions, and receives reward for the action taken. We provide the first efficient algorithm with an optimal…

Machine Learning · Computer Science 2011-06-17 Miroslav Dudik, Daniel Hsu, Satyen Kale, Nikos Karampatziakis, John Langford, Lev Reyzin, Tong Zhang

Online Linear Quadratic Tracking with Regret Guarantees

Online learning algorithms for dynamical systems provide finite time guarantees for control in the presence of sequentially revealed cost functions. We pose the classical linear quadratic tracking problem in the framework of online…

Systems and Control · Electrical Eng. & Systems 2024-10-18 Aren Karapetyan, Diego Bolliger, Anastasios Tsiamis, Efe C. Balta, John Lygeros

Continuous Prediction with Experts' Advice

Prediction with experts' advice is one of the most fundamental problems in online learning and captures many of its technical challenges. A recent line of work has looked at online learning through the lens of differential equations and…

Machine Learning · Computer Science 2022-10-04 Victor Sanches Portella, Christopher Liaw, Nicholas J. A. Harvey

Online Learning with Off-Policy Feedback

We study the problem of online learning in adversarial bandit problems under a partial observability model called off-policy feedback. In this sequential decision making problem, the learner cannot directly observe its rewards, but instead…

Machine Learning · Computer Science 2022-07-20 Germano Gabbianelli, Matteo Papini, Gergely Neu

An Online Learning Approach to Optimizing Time-Varying Costs of AoI

We consider systems that require timely monitoring of sources over a communication network, where the cost of delayed information is unknown, time-varying and possibly adversarial. For the single source monitoring problem, we design…

Networking and Internet Architecture · Computer Science 2021-05-31 Vishrant Tripathi, Eytan Modiano

Online Bandit Learning against an Adaptive Adversary: from Regret to Policy Regret

Online learning algorithms are designed to learn even when their input is generated by an adversary. The widely-accepted formal definition of an online algorithm's ability to learn is the game-theoretic notion of regret. We argue that the…

Machine Learning · Computer Science 2012-07-03 Raman Arora, Ofer Dekel, Ambuj Tewari

Online Improper Learning with an Approximation Oracle

We revisit the question of reducing online learning to approximate optimization of the offline problem. In this setting, we give two algorithms with near-optimal performance in the full information setting: they guarantee optimal regret and…

Machine Learning · Computer Science 2018-04-24 Elad Hazan, Wei Hu, Yuanzhi Li, Zhiyuan Li

Non-stochastic Bandits With Evolving Observations

We introduce a novel online learning framework that unifies and generalizes pre-established models, such as delayed and corrupted feedback, to encompass adversarial environments where action feedback evolves over time. In this setting, the…

Machine Learning · Computer Science 2024-05-28 Yogev Bar-On, Yishay Mansour

Online Learning with Predictable Sequences

We present methods for online linear optimization that take advantage of benign (as opposed to worst-case) sequences. Specifically if the sequence encountered by the learner is described well by a known "predictable process", the algorithms…

Machine Learning · Statistics 2014-05-27 Alexander Rakhlin, Karthik Sridharan

Online Optimization: Competing with Dynamic Comparators

Recent literature on online learning has focused on developing adaptive algorithms that take advantage of a regularity of the sequence of observations, yet retain worst-case performance guarantees. A complementary direction is to develop…

Machine Learning · Computer Science 2015-01-27 Ali Jadbabaie, Alexander Rakhlin, Shahin Shahrampour, Karthik Sridharan

Bandits with Partially Observable Confounded Data

We study linear contextual bandits with access to a large, confounded, offline dataset that was sampled from some fixed policy. We show that this problem is closely related to a variant of the bandit problem with side information. We…

Machine Learning · Computer Science 2021-08-11 Guy Tennenholtz, Uri Shalit, Shie Mannor, Yonathan Efroni

Online Learning in the Random Order Model

In the random-order model for online learning, the sequence of losses is chosen upfront by an adversary and presented to the learner after a random permutation. Any random-order input is asymptotically equivalent to a stochastic…

Machine Learning · Computer Science 2025-10-06 Martino Bernasconi, Andrea Celli, Riccardo Colini-Baldeschi, Federico Fusco, Stefano Leonardi, Matteo Russo

Adversarial Bandit Optimization with Globally Bounded Perturbations to Linear Losses

We study a class of adversarial bandit optimization problems in which the loss functions may be non-convex and non-smooth. In each round, the learner observes a loss that consists of an underlying linear component together with an…

Machine Learning · Computer Science 2026-03-30 Zhuoyu Cheng, Kohei Hatano, Eiji Takimoto

Online Mixed Discrete and Continuous Optimization: Algorithms, Regret Analysis and Applications

We study an online mixed discrete and continuous optimization problem where a decision maker interacts with an unknown environment for a number of $T$ rounds. At each round, the decision maker needs to first jointly choose a discrete and a…

Optimization and Control · Mathematics 2024-08-27 Lintao Ye, Ming Chi, Zhi-Wei Liu, Xiaoling Wang, Vijay Gupta

Online Learning with Composite Loss Functions

We study a new class of online learning problems where each of the online algorithm's actions is assigned an adversarial value, and the loss of the algorithm at each step is a known and deterministic function of the values assigned to its…

Machine Learning · Computer Science 2014-05-20 Ofer Dekel, Jian Ding, Tomer Koren, Yuval Peres

A closer look at temporal variability in dynamic online learning

This work focuses on the setting of dynamic regret in the context of online learning with full information. In particular, we analyze regret bounds with respect to the temporal variability of the loss functions. By assuming that the…

Machine Learning · Computer Science 2021-02-16 Nicolò Campolongo, Francesco Orabona

Online Stochastic Linear Optimization under One-bit Feedback

In this paper, we study a special bandit setting of online stochastic linear optimization, where only one-bit of information is revealed to the learner at each round. This problem has found many applications including online advertisement…

Machine Learning · Computer Science 2015-09-28 Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou

Online Learning with Imperfect Hints

We consider a variant of the classical online linear optimization problem in which at every step, the online player receives a "hint" vector before choosing the action for that round. Rather surprisingly, it was shown that if the hint…

Machine Learning · Computer Science 2020-10-05 Aditya Bhaskara, Ashok Cutkosky, Ravi Kumar, Manish Purohit

Second-Order Non-Stationary Online Learning for Regression

The goal of a learner, in standard online learning, is to have the cumulative loss not much larger compared with the best-performing function from some fixed class. Numerous algorithms were shown to have this gap arbitrarily close to zero,…

Machine Learning · Computer Science 2013-03-04 Nina Vaits, Edward Moroshko, Koby Crammer