Related papers: Efficient improper learning for online logistic re…

Logarithmic Regret for Adversarial Online Control

We introduce a new algorithm for online linear-quadratic control in a known system subject to adversarial disturbances. Existing regret bounds for this setting scale as $\sqrt{T}$ unless strong stochastic assumptions are imposed on the…

Machine Learning · Computer Science 2020-06-24 Dylan J. Foster, Max Simchowitz

Online Inverse Linear Optimization: Efficient Logarithmic-Regret Algorithm, Robustness to Suboptimality, and Lower Bound

In online inverse linear optimization, a learner observes time-varying sets of feasible actions and an agent's optimal actions, selected by solving linear optimization over the feasible actions. The learner sequentially makes predictions of…

Machine Learning · Computer Science 2025-05-23 Shinsaku Sakaue, Taira Tsuchiya, Han Bao, Taihei Oki

Learning The Best Expert Efficiently

We consider online learning problems where the aim is to achieve regret which is efficient in the sense that it is the same order as the lowest regret amongst K experts. This is a substantially stronger requirement than achieving…

Machine Learning · Computer Science 2019-11-12 Daron Anderson, Douglas J. Leith

Logarithmic Regret for Online Control

We study optimal regret bounds for control in linear dynamical systems under adversarially changing strongly convex cost functions, given the knowledge of transition dynamics. This includes several well studied and fundamental frameworks…

Machine Learning · Computer Science 2019-09-12 Naman Agarwal, Elad Hazan, Karan Singh

Online Isotonic Regression

We consider the online version of the isotonic regression problem. Given a set of linearly ordered points (e.g., on the real line), the learner must predict labels sequentially at adversarially chosen positions and is evaluated by her total…

Machine Learning · Computer Science 2016-10-10 Wojciech Kotłowski, Wouter M. Koolen, Alan Malek

Logistic Regression: The Importance of Being Improper

Learning linear predictors with the logistic loss---both in stochastic and online settings---is a fundamental task in machine learning and statistics, with direct connections to classification and boosting. Existing "fast rates" for this…

Machine Learning · Computer Science 2018-12-17 Dylan J. Foster, Satyen Kale, Haipeng Luo, Mehryar Mohri, Karthik Sridharan

Logarithmic Regret and Polynomial Scaling in Online Multi-step-ahead Prediction

This letter studies the problem of online multi-step-ahead prediction for unknown linear stochastic systems. Using conditional distribution theory, we derive an optimal parameterization of the prediction policy as a linear function of…

Machine Learning · Computer Science 2025-11-18 Jiachen Qian, Yang Zheng

Achieving Better Local Regret Bound for Online Non-Convex Bilevel Optimization

Online bilevel optimization (OBO) has emerged as a powerful framework for many machine learning problems. Prior works have developed several algorithms that minimize the standard bilevel local regret or the window-averaged bilevel local…

Machine Learning · Computer Science 2026-05-12 Tingkai Jia, Haiguang Wang, Cheng Chen

Constrained Online Two-stage Stochastic Optimization: Near Optimal Algorithms via Adversarial Learning

We consider an online two-stage stochastic optimization with long-term constraints over a finite horizon of $T$ periods. At each period, we take the first-stage action, observe a model parameter realization and then take the second-stage…

Machine Learning · Computer Science 2024-05-21 Jiashuo Jiang

The Interplay Between Stability and Regret in Online Learning

This paper considers the stability of online learning algorithms and its implications for learnability (bounded regret). We introduce a novel quantity called forward regret that intuitively measures how good an online learning…

Machine Learning · Computer Science 2012-11-28 Ankan Saha, Prateek Jain, Ambuj Tewari

Towards minimax policies for online linear optimization with bandit feedback

We address the online linear optimization problem with bandit feedback. Our contribution is twofold. First, we provide an algorithm (based on exponential weights) with a regret of order $\sqrt{d n \log N}$ for any finite action set with $N$…

Machine Learning · Computer Science 2012-02-15 Sébastien Bubeck, Nicolò Cesa-Bianchi, Sham M. Kakade

Bandit algorithms: Letting go of logarithmic regret for statistical robustness

We study regret minimization in a stochastic multi-armed bandit setting and establish a fundamental trade-off between the regret suffered under an algorithm, and its statistical robustness. Considering broad classes of underlying arms'…

Machine Learning · Computer Science 2020-06-23 Kumar Ashutosh, Jayakrishnan Nair, Anmol Kagrecha, Krishna Jagannathan

Recursive Exponential Weighting for Online Non-convex Optimization

In this paper, we investigate the online non-convex optimization problem which generalizes the classic online convex optimization problem by relaxing the convexity assumption on the cost function. For this type of problem, the classic…

Machine Learning · Computer Science 2017-09-14 Lin Yang, Cheng Tan, Wing Shing Wong

Efficient Logistic Regression with Mixture of Sigmoids

This paper studies the Exponential Weights (EW) algorithm with an isotropic Gaussian prior for online logistic regression. We show that the near-optimal worst-case regret bound $O(d\log(Bn))$ for EW, established by Kakade and Ng (2005)…

Machine Learning · Computer Science 2026-04-06 Federico Di Gennaro, Saptarshi Chakraborty, Nikita Zhivotovskiy

Logistic Regression Regret: What's the Catch?

We address the problem of the achievable regret rates with online logistic regression. We derive lower bounds with logarithmic regret under $L_1$, $L_2$, and $L_\infty$ constraints on the parameter values. The bounds are dominated by $d/2…

Machine Learning · Computer Science 2020-02-20 Gil I. Shamir

Unconstrained Online Linear Learning in Hilbert Spaces: Minimax Algorithms and Normal Approximations

We study algorithms for online linear optimization in Hilbert spaces, focusing on the case where the player is unconstrained. We develop a novel characterization of a large class of minimax algorithms, recovering, and even improving,…

Machine Learning · Computer Science 2014-05-22 H. Brendan McMahan, Francesco Orabona

Regret Balancing for Bandit and RL Model Selection

We consider model selection in stochastic bandit and reinforcement learning problems. Given a set of base learning algorithms, an effective model selection strategy adapts to the best learning algorithm in an online fashion. We show that by…

Machine Learning · Computer Science 2020-06-11 Yasin Abbasi-Yadkori, Aldo Pacchiano, My Phan

High-Probability Risk Bounds via Sequential Predictors

Online learning methods yield sequential regret bounds under minimal assumptions and provide in-expectation risk bounds for statistical learning. However, despite the apparent advantage of online guarantees over their statistical…

Machine Learning · Computer Science 2023-08-16 Dirk van der Hoeven, Nikita Zhivotovskiy, Nicolò Cesa-Bianchi

Efficient Methods for Non-stationary Online Learning

Non-stationary online learning has drawn much attention in recent years. In particular, dynamic regret and adaptive regret are proposed as two principled performance measures for online convex optimization in non-stationary environments. To…

Machine Learning · Computer Science 2025-09-10 Peng Zhao, Yan-Feng Xie, Lijun Zhang, Zhi-Hua Zhou

Precise Regret Bounds for Log-loss via a Truncated Bayesian Algorithm

We study the sequential general online regression, known also as the sequential probability assignments, under logarithmic loss when compared against a broad class of experts. We focus on obtaining tight, often matching, lower and upper…

Machine Learning · Computer Science 2023-02-02 Changlong Wu, Mohsen Heidari, Ananth Grama, Wojciech Szpankowski