Related papers: Adaptive Gradient Methods with Local Guarantees

Less Regret via Online Conditioning

We analyze and evaluate an online gradient descent algorithm with adaptive per-coordinate adjustment of learning rates. Our algorithm can be thought of as an online version of batch gradient descent with a diagonal preconditioner. This…

Machine Learning · Computer Science 2010-02-26 Matthew Streeter, H. Brendan McMahan

Proximal Online Gradient is Optimum for Dynamic Regret

In online learning, the dynamic regret metric chooses the reference (optimal) solution that may change over time, while the typical (static) regret metric assumes the reference solution to be constant over the whole time horizon. The…

Machine Learning · Computer Science 2019-09-04 Yawei Zhao, Shuang Qiu, Ji Liu

Gradient descent revisited via an adaptive online learning rate

Any gradient descent optimization requires choosing a learning rate. With deeper and deeper models, tuning that learning rate can easily become tedious and does not necessarily lead to ideal convergence. We propose a variation of the…

Machine Learning · Statistics 2018-04-10 Mathieu Ravaut, Satya Gorti

Dynamic Regret of Adaptive Gradient Methods for Strongly Convex Problems

Adaptive gradient algorithms such as ADAGRAD and its variants have gained popularity in the training of deep neural networks. While many works on adaptive methods have focused on the static regret as a performance metric to achieve a…

Machine Learning · Computer Science 2022-09-07 Parvin Nazari, Esmaile Khorram

Online Optimization : Competing with Dynamic Comparators

Recent literature on online learning has focused on developing adaptive algorithms that take advantage of a regularity of the sequence of observations, yet retain worst-case performance guarantees. A complementary direction is to develop…

Machine Learning · Computer Science 2015-01-27 Ali Jadbabaie, Alexander Rakhlin, Shahin Shahrampour, Karthik Sridharan

AutoGD: Automatic Learning Rate Selection for Gradient Descent

The performance of gradient-based optimization methods, such as standard gradient descent (GD), greatly depends on the choice of learning rate. However, it can require a non-trivial amount of user tuning effort to select an appropriate…

Machine Learning · Computer Science 2025-10-14 Nikola Surjanovic, Alexandre Bouchard-Côté, Trevor Campbell

A Local Regret in Nonconvex Online Learning

We consider an online learning process to forecast a sequence of outcomes for nonconvex models. A typical measure to evaluate online learning algorithms is regret, but the standard definition of regret is intractable for nonconvex models…

Machine Learning · Computer Science 2018-11-30 Sergul Aydore, Lee Dicker, Dean Foster

Differentiable Self-Adaptive Learning Rate

Learning rate adaptation is a popular topic in machine learning. Gradient descent trains neural networks with a fixed learning rate. Learning rate adaptation is proposed to accelerate the training process by adjusting the step size in…

Machine Learning · Computer Science 2022-10-20 Bozhou Chen, Hongzhi Wang, Chenmin Ba

A closer look at temporal variability in dynamic online learning

This work focuses on the setting of dynamic regret in the context of online learning with full information. In particular, we analyze regret bounds with respect to the temporal variability of the loss functions. By assuming that the…

Machine Learning · Computer Science 2021-02-16 Nicolò Campolongo, Francesco Orabona

Discounted Adaptive Online Learning: Towards Better Regularization

We study online learning in adversarial nonstationary environments. Since the future can be very different from the past, a critical challenge is to gracefully forget the history while new data comes in. To formalize this intuition, we…

Machine Learning · Computer Science 2024-06-21 Zhiyu Zhang, David Bombara, Heng Yang

Adaptive Online Learning in Dynamic Environments

In this paper, we study online convex optimization in dynamic environments, and aim to bound the dynamic regret with respect to any sequence of comparators. Existing work has shown that online gradient descent enjoys an…

Machine Learning · Computer Science 2018-10-26 Lijun Zhang, Shiyin Lu, Zhi-Hua Zhou

An Adaptive Gradient Method with Energy and Momentum

We introduce a novel algorithm for gradient-based optimization of stochastic objective functions. The method may be seen as a variant of SGD with momentum equipped with an adaptive learning rate automatically adjusted by an 'energy'…

Optimization and Control · Mathematics 2022-03-24 Hailiang Liu, Xuping Tian

Interpreting Adaptive Gradient Methods by Parameter Scaling for Learning-Rate-Free Optimization

We address the challenge of estimating the learning rate for adaptive gradient methods used in training deep neural networks. While several learning-rate-free approaches have been proposed, they are typically tailored for steepest descent.…

Machine Learning · Computer Science 2024-01-09 Min-Kook Suh, Seung-Woo Seo

Matrix-Free Preconditioning in Online Learning

We provide an online convex optimization algorithm with regret that interpolates between the regret of an algorithm using an optimal preconditioning matrix and one using a diagonal preconditioning matrix. Our regret bound is never worse…

Machine Learning · Computer Science 2019-05-31 Ashok Cutkosky, Tamas Sarlos

Disentangling Adaptive Gradient Methods from Learning Rates

We investigate several confounding factors in the evaluation of optimization algorithms for deep learning. Primarily, we take a deeper look at how adaptive gradient methods interact with the learning rate schedule, a notoriously…

Machine Learning · Computer Science 2020-02-28 Naman Agarwal, Rohan Anil, Elad Hazan, Tomer Koren, Cyril Zhang

ADASECANT: Robust Adaptive Secant Method for Stochastic Gradient

Stochastic gradient algorithms have been the main focus of large-scale learning problems and they led to important successes in machine learning. The convergence of SGD depends on the careful choice of learning rate and the amount of the…

Machine Learning · Computer Science 2015-11-03 Caglar Gulcehre, Marcin Moczulski, Yoshua Bengio

Improved Online Conformal Prediction via Strongly Adaptive Online Learning

We study the problem of uncertainty quantification via prediction sets, in an online setting where the data distribution may vary arbitrarily over time. Recent work develops online conformal prediction techniques that leverage regret…

Machine Learning · Computer Science 2023-02-16 Aadyot Bhatnagar, Huan Wang, Caiming Xiong, Yu Bai

Online Learning Rate Adaptation with Hypergradient Descent

We introduce a general method for improving the convergence rate of gradient-based optimizers that is easy to implement and works well in practice. We demonstrate the effectiveness of the method in a range of optimization problems by…

Machine Learning · Computer Science 2018-08-23 Atilim Gunes Baydin, Robert Cornish, David Martinez Rubio, Mark Schmidt, Frank Wood

A Robust Adaptive Stochastic Gradient Method for Deep Learning

Stochastic gradient algorithms are the main focus of large-scale optimization problems and have led to important successes in the recent advancement of deep learning algorithms. The convergence of SGD depends on the careful choice of…

Machine Learning · Computer Science 2017-03-03 Caglar Gulcehre, Jose Sotelo, Marcin Moczulski, Yoshua Bengio

Optimizing Optimizers: Regret-optimal gradient descent algorithms

The need for fast and robust optimization algorithms is of critical importance in all areas of machine learning. This paper treats the task of designing optimization algorithms as an optimal control problem. Using regret as a metric for an…

Machine Learning · Computer Science 2021-01-21 Philippe Casgrain, Anastasis Kratsios