Related papers: Continuous-time Models for Stochastic Optimization…
First-order methods are often analyzed via their continuous-time models, where their worst-case convergence properties are usually approached via Lyapunov functions. In this work, we provide a systematic and principled approach to find and…
In this paper, we propose a new framework to study distributed optimization problems with stochastic gradients by employing a multi-agent system with continuous-time dynamics. Here the goal of the agents is to cooperatively minimize the sum…
There has been much recent interest in solving nonconvex min-max optimization problems due to their broad applications in many areas, including machine learning, networked resource allocation, and distributed optimization. Perhaps, the most…
This article derives lower bounds on the convergence rate of continuous-time gradient-based optimization algorithms. The algorithms are subjected to a time-normalization constraint that avoids a reparametrization of time in order to make…
This paper addresses stochastic optimization in a streaming setting with time-dependent and biased gradient estimates. We analyze several first-order methods, including Stochastic Gradient Descent (SGD), mini-batch SGD, and time-varying…
The training of modern machine learning models often consists in solving high-dimensional non-convex optimisation problems that are subject to large-scale data. In this context, momentum-based stochastic optimisation algorithms have become…
Stochastic Gradient Descent (SGD) methods see many uses in optimization problems. Modifications to the algorithm, such as momentum-based SGD methods, have been known to produce better results in certain cases. Much of this, however, is due…
First-order optimization algorithms play a major role in large-scale machine learning. A new class of methods, called adaptive algorithms, was recently introduced to iteratively adjust the learning rate for each coordinate. Despite great…
Our work is part of the close link between continuous-time dissipative dynamical systems and optimization algorithms, and more precisely here, in the stochastic setting. We aim to study stochastic convex minimization problems through the…
We present a unified convergence analysis for first-order convex optimization methods using the concept of strong Lyapunov conditions. Combining this with suitable time scaling factors, we are able to handle both convex and strongly convex…
In convex optimization, continuous-time counterparts have been a fruitful tool for analyzing momentum algorithms. Fewer such examples are available when the function to minimize is non-convex. In several cases, discrepancies arise between…
Gradient optimization algorithms using epochs, that is, those based on stochastic gradient descent without replacement (SGDo), are predominantly used to train machine learning models in practice. However, the mathematical theory of SGDo and…
We study stochastic optimization of nonconvex loss functions, which are typical objectives for training neural networks. We propose stochastic approximation algorithms which optimize a series of regularized, nonlinearized losses on large…
This paper considers time-average stochastic optimization, where a time average decision vector, an average of decision vectors chosen in every time step from a time-varying (possibly non-convex) set, minimizes a convex objective function…
We present a coupled system of ODEs which, when discretized with a constant time step/learning rate, recovers Nesterov's accelerated gradient descent algorithm. The same ODEs, when discretized with a decreasing learning rate, lead to novel…
We explore an explicit link between stochastic gradient descent using common batching strategies and splitting methods for ordinary differential equations. From this perspective, we introduce a new minibatching strategy (called Symmetric…
Optimization problems with continuous data appear in, e.g., robust machine learning, functional data analysis, and variational inference. Here, the target function is given as an integral over a family of (continuously) indexed target…
This paper is devoted to the study of acceleration methods for an inequality constrained convex optimization problem by using Lyapunov functions. We first approximate such a problem as an unconstrained optimization problem by employing the…
The non-asymptotic analysis of Stochastic Gradient Descent (SGD) typically yields bounds that decompose into a bias term and a variance term. In this work, we focus on the bias component and study the extent to which SGD can match the…
We consider the problem of asynchronous stochastic optimization, where an optimization algorithm makes updates based on stale stochastic gradients of the objective that are subject to an arbitrary (possibly adversarial) sequence of delays.…