Related papers: Continuous-time Models for Stochastic Optimization…

A systematic approach to Lyapunov analyses of continuous-time models in convex optimization

First-order methods are often analyzed via their continuous-time models, where their worst-case convergence properties are usually approached via Lyapunov functions. In this work, we provide a systematic and principled approach to find and…

Numerical Analysis · Mathematics 2024-03-12 Céline Moucer, Adrien Taylor, Francis Bach

Convergence Analysis of Continuous-Time Distributed Stochastic Gradient Algorithms

In this paper, we propose a new framework to study distributed optimization problems with stochastic gradients by employing a multi-agent system with continuous-time dynamics. Here the goal of the agents is to cooperatively minimize the sum…

Systems and Control · Electrical Eng. & Systems 2026-02-10 Jianhua Sun, Kaihong Lu, Xin Yu

Convergence Rates of Two-Time-Scale Gradient Descent-Ascent Dynamics for Solving Nonconvex Min-Max Problems

There has been much recent interest in solving nonconvex min-max optimization problems due to their broad applications in many areas, including machine learning, networked resource allocation, and distributed optimization. Perhaps the most…

Optimization and Control · Mathematics 2021-12-20 Thinh T. Doan

Continuous-time Lower Bounds for Gradient-based Algorithms

This article derives lower bounds on the convergence rate of continuous-time gradient-based optimization algorithms. The algorithms are subjected to a time-normalization constraint that avoids a reparametrization of time in order to make…

Optimization and Control · Mathematics 2020-08-04 Michael Muehlebach, Michael I. Jordan

Learning from time-dependent streaming data with online stochastic algorithms

This paper addresses stochastic optimization in a streaming setting with time-dependent and biased gradient estimates. We analyze several first-order methods, including Stochastic Gradient Descent (SGD), mini-batch SGD, and time-varying…

Machine Learning · Computer Science 2023-07-20 Antoine Godichon-Baggioni, Nicklas Werge, Olivier Wintenberger

Losing momentum in continuous-time stochastic optimisation

The training of modern machine learning models often consists in solving high-dimensional non-convex optimisation problems that are subject to large-scale data. In this context, momentum-based stochastic optimisation algorithms have become…

Optimization and Control · Mathematics 2024-11-06 Kexin Jin, Jonas Latz, Chenguang Liu, Alessandro Scagliotti

First and Second Order Approximations to Stochastic Gradient Descent Methods with Momentum Terms

Stochastic Gradient Descent (SGD) methods see many uses in optimization problems. Modifications to the algorithm, such as momentum-based SGD methods, have been known to produce better results in certain cases. Much of this, however, is due…

Machine Learning · Computer Science 2025-04-22 Eric Lu

A general system of differential equations to model first order adaptive algorithms

First order optimization algorithms play a major role in large scale machine learning. A new class of methods, called adaptive algorithms, was recently introduced to iteratively adjust the learning rate for each coordinate. Despite great…

Machine Learning · Computer Science 2019-10-01 André Belotto da Silva, Maxime Gazeau

Stochastic Inertial Dynamics Via Time Scaling and Averaging

Our work is part of the close link between continuous-time dissipative dynamical systems and optimization algorithms, and more precisely here, in the stochastic setting. We aim to study stochastic convex minimization problems through the…

Optimization and Control · Mathematics 2025-02-21 Rodrigo Maulen-Soto, Jalal Fadili, Hedy Attouch, Peter Ochs

A Unified Convergence Analysis of First Order Convex Optimization Methods via Strong Lyapunov Functions

We present a unified convergence analysis for first order convex optimization methods using the concept of strong Lyapunov conditions. Combining this with suitable time scaling factors, we are able to handle both convex and strongly convex…

Optimization and Control · Mathematics 2021-08-03 Long Chen, Hao Luo

Continuized Nesterov Acceleration for Non-Convex Optimization

In convex optimization, continuous-time counterparts have been a fruitful tool for analyzing momentum algorithms. Fewer such examples are available when the function to minimize is non-convex. In several cases, discrepancies arise between…

Optimization and Control · Mathematics 2026-01-07 Julien Hermant, Jean-François Aujol, Charles Dossal, Lorick Huang, Aude Rondepierre

Towards Continuous-Time Approximations for Stochastic Gradient Descent without Replacement

Gradient optimization algorithms using epochs, that is, those based on stochastic gradient descent without replacement (SGDo), are predominantly used to train machine learning models in practice. However, the mathematical theory of SGDo and…

Machine Learning · Computer Science 2025-12-05 Stefan Perko

Stochastic Nonconvex Optimization with Large Minibatches

We study stochastic optimization of nonconvex loss functions, which are typical objectives for training neural networks. We propose stochastic approximation algorithms which optimize a series of regularized, nonlinearized losses on large…

Machine Learning · Computer Science 2019-03-12 Weiran Wang, Nathan Srebro

Time-Average Stochastic Optimization with Non-convex Decision Set and its Convergence

This paper considers time-average stochastic optimization, where a time average decision vector, an average of decision vectors chosen in every time step from a time-varying (possibly non-convex) set, minimizes a convex objective function…

Optimization and Control · Mathematics 2015-01-29 Sucha Supittayapornpong, Michael J. Neely

Nesterov's method with decreasing learning rate leads to accelerated stochastic gradient descent

We present a coupled system of ODEs which, when discretized with a constant time step/learning rate, recovers Nesterov's accelerated gradient descent algorithm. The same ODEs, when discretized with a decreasing learning rate, lead to novel…

Optimization and Control · Mathematics 2020-09-02 Maxime Laborde, Adam M. Oberman

Randomised Splitting Methods and Stochastic Gradient Descent

We explore an explicit link between stochastic gradient descent using common batching strategies and splitting methods for ordinary differential equations. From this perspective, we introduce a new minibatching strategy (called Symmetric…

Optimization and Control · Mathematics 2025-04-08 Luke Shaw, Peter A. Whalley

A Continuous-time Stochastic Gradient Descent Method for Continuous Data

Optimization problems with continuous data appear in, e.g., robust machine learning, functional data analysis, and variational inference. Here, the target function is given as an integral over a family of (continuously) indexed target…

Machine Learning · Computer Science 2023-11-01 Kexin Jin, Jonas Latz, Chenguang Liu, Carola-Bibiane Schönlieb

Continuous and discrete-time accelerated methods for an inequality constrained convex optimization problem

This paper is devoted to the study of acceleration methods for an inequality constrained convex optimization problem by using Lyapunov functions. We first approximate such a problem as an unconstrained optimization problem by employing the…

Optimization and Control · Mathematics 2024-11-25 Juan Liu, Nan-Jing Huang, Xian-Jun Long, Xue-song Li

Bias-Optimal Bounds for SGD: A Computer-Aided Lyapunov Analysis

The non-asymptotic analysis of Stochastic Gradient Descent (SGD) typically yields bounds that decompose into a bias term and a variance term. In this work, we focus on the bias component and study the extent to which SGD can match the…

Optimization and Control · Mathematics 2026-02-02 Daniel Cortild, Lucas Ketels, Juan Peypouquet, Guillaume Garrigos

Faster Stochastic Optimization with Arbitrary Delays via Asynchronous Mini-Batching

We consider the problem of asynchronous stochastic optimization, where an optimization algorithm makes updates based on stale stochastic gradients of the objective that are subject to an arbitrary (possibly adversarial) sequence of delays.…

Optimization and Control · Mathematics 2025-06-23 Amit Attia, Ofir Gaash, Tomer Koren