Related papers: Average-case Acceleration Through Spectral Density…

Analytical Study of Momentum-Based Acceleration Methods in Paradigmatic High-Dimensional Non-Convex Problems

The optimization step in many machine learning problems rarely relies on vanilla gradient descent; instead, it is common practice to use momentum-based accelerated methods. Despite these algorithms being widely applied to arbitrary loss…

Disordered Systems and Neural Networks · Physics 2021-10-29 Stefano Sarao Mannelli, Pierfrancesco Urbani

Only Tails Matter: Average-Case Universality and Robustness in the Convex Regime

The recently developed average-case analysis of optimization methods allows a more fine-grained and representative convergence analysis than usual worst-case results. In exchange, this analysis requires a more precise hypothesis over the…

Optimization and Control · Mathematics 2022-06-23 Leonardo Cunha, Gauthier Gidel, Fabian Pedregosa, Damien Scieur, Courtney Paquette

Average-case optimization analysis for distributed consensus algorithms on regular graphs

The consensus problem in distributed computing involves a network of agents aiming to compute the average of their initial vectors through local communication, represented by an undirected graph. This paper focuses on the study of this…

Optimization and Control · Mathematics 2024-11-26 Nhat Trung Nguyen, Alexander Rogozin, Alexander Gasnikov

Acceleration Methods

This monograph covers some recent advances in a range of acceleration techniques frequently used in convex optimization. We first use quadratic optimization problems to introduce two key families of methods, namely momentum and nested…

Optimization and Control · Mathematics 2024-09-26 Alexandre d'Aspremont, Damien Scieur, Adrien Taylor

Superposition of Random Plane Waves in High Spatial Dimensions: Random Matrix Approach to Landscape Complexity

Motivated by current interest in understanding statistical properties of random landscapes in high-dimensional spaces, we consider a model of the landscape in $\mathbb{R}^N$ obtained by superimposing $M>N$ plane waves of random wavevectors…

Statistical Mechanics · Physics 2022-09-14 Bertrand Lacroix-A-Chez-Toine, Sirio Belga Fedeli, Yan V. Fyodorov

Regularized Nonlinear Acceleration

We describe a convergence acceleration technique for unconstrained optimization problems. Our scheme computes estimates of the optimum from a nonlinear average of the iterates produced by any optimization method. The weights in this average…

Optimization and Control · Mathematics 2019-04-16 Damien Scieur, Alexandre d'Aspremont, Francis Bach

Nonlinear Acceleration of Momentum and Primal-Dual Algorithms

We describe convergence acceleration schemes for multistep optimization algorithms. The extrapolated solution is written as a nonlinear average of the iterates produced by the original optimization method. Our analysis does not need the…

Optimization and Control · Mathematics 2019-10-18 Raghu Bollapragada, Damien Scieur, Alexandre d'Aspremont

Super-Acceleration with Cyclical Step-sizes

We develop a convergence-rate analysis of momentum with cyclical step-sizes. We show that under some assumption on the spectral gap of Hessians in machine learning, cyclical step-sizes are provably faster than constant step-sizes. More…

Optimization and Control · Mathematics 2022-05-10 Baptiste Goujaud, Damien Scieur, Aymeric Dieuleveut, Adrien Taylor, Fabian Pedregosa

Accelerated gradient descent method for functionals of probability measures by new convexity and smoothness based on transport maps

We consider problems of minimizing functionals $\mathcal{F}$ of probability measures on the Euclidean space. To propose an accelerated gradient descent algorithm for such problems, we consider gradient flow of transport maps that give…

Optimization and Control · Mathematics 2023-09-06 Ken'ichiro Tanaka

A Fast Anderson-Chebyshev Acceleration for Nonlinear Optimization

Anderson acceleration (or Anderson mixing) is an efficient acceleration method for fixed point iterations $x_{t+1}=G(x_t)$, e.g., gradient descent can be viewed as iteratively applying the operation $G(x) \triangleq x-\alpha\nabla f(x)$. It…

Optimization and Control · Mathematics 2020-03-03 Zhize Li, Jian Li

Accelerating Stochastic Gradient Descent For Least Squares Regression

There is widespread sentiment that it is not possible to effectively utilize fast gradient methods (e.g. Nesterov's acceleration, conjugate gradient, heavy ball) for the purposes of stochastic optimization due to their instability and error…

Machine Learning · Statistics 2018-08-02 Prateek Jain, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli, Aaron Sidford

Accelerated Flow for Probability Distributions

This paper presents a methodology and numerical algorithms for constructing accelerated gradient flows on the space of probability distributions. In particular, we extend the recent variational formulation of accelerated gradient methods in…

Machine Learning · Computer Science 2019-01-14 Amirhossein Taghvaei, Prashant G. Mehta

On Adapting Nesterov's Scheme to Accelerate Iterative Methods for Linear Problems

Nesterov's well-known scheme for accelerating gradient descent in convex optimization problems is adapted to accelerating stationary iterative solvers for linear systems. Compared with classical Krylov subspace acceleration methods, the…

Optimization and Control · Mathematics 2021-08-10 Tao Hong, Irad Yavneh

Distributed Accelerated Projection-Based Consensus Decomposition

With the development of machine learning and Big Data, the concepts of linear and non-linear optimization techniques are becoming increasingly valuable for many quantitative disciplines. Problems of that nature are typically solved using…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-06-21 Wiktor Maj

Dynamics of Stochastic Momentum Methods on Large-scale, Quadratic Models

We analyze a class of stochastic gradient algorithms with momentum on a high-dimensional random least squares problem. Our framework, inspired by random matrix theory, provides an exact (deterministic) characterization for the sequence of…

Optimization and Control · Mathematics 2021-10-27 Courtney Paquette, Elliot Paquette

Distributed Stochastic Consensus Optimization with Momentum for Nonconvex Nonsmooth Problems

While many distributed optimization algorithms have been proposed for solving smooth or convex problems over the networks, few of them can handle non-convex and non-smooth problems. Based on a proximal primal-dual approach, this paper…

Optimization and Control · Mathematics 2021-09-01 Zhiguo Wang, Jiawei Zhang, Tsung-Hui Chang, Jian Li, Zhi-Quan Luo

Generalized Framework for Nonlinear Acceleration

Nonlinear acceleration algorithms improve the performance of iterative methods, such as gradient descent, using the information contained in past iterates. However, their efficiency is still not entirely understood even in the quadratic…

Optimization and Control · Mathematics 2019-03-22 Damien Scieur

Estimation of the population spectral distribution from a large dimensional sample covariance matrix

This paper introduces a new method to estimate the spectral distribution of a population covariance matrix from high-dimensional data. The method is founded on a meaningful generalization of the seminal Marcenko-Pastur equation, originally…

Methodology · Statistics 2013-02-05 Weiming Li, Jiaqi Chen, Yingli Qin, Jianfeng Yao, Zhidong Bai

Provable Acceleration for Diffusion Models under Minimal Assumptions

Score-based diffusion models, while achieving minimax optimality for sampling, are often hampered by slow sampling speeds due to the high computational burden of score function evaluations. Despite the recent remarkable empirical advances…

Machine Learning · Computer Science 2025-02-27 Gen Li, Changxiao Cai

A Fast Spectral Algorithm for Mean Estimation with Sub-Gaussian Rates

We study the algorithmic problem of estimating the mean of a heavy-tailed random vector in $\mathbb{R}^d$, given $n$ i.i.d. samples. The goal is to design an efficient estimator that attains the optimal sub-gaussian error bound, only assuming…

Statistics Theory · Mathematics 2020-02-19 Zhixian Lei, Kyle Luh, Prayaag Venkat, Fred Zhang