Related papers: Accelerating Variance-Reduced Stochastic Gradient …

Direct Acceleration of SAGA using Sampled Negative Momentum

Variance reduction is a simple and effective technique that accelerates convex (or non-convex) stochastic optimization. Among existing variance reduction methods, SVRG and SAGA adopt unbiased gradient estimators and are the most popular…

Machine Learning · Computer Science 2019-04-23 Kaiwen Zhou

On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants

We study optimization algorithms based on variance reduction for stochastic gradient descent (SGD). Remarkable recent progress has been made in this direction through development of algorithms like SAG, SVRG, SAGA. These algorithms have…

Machine Learning · Computer Science 2016-01-26 Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabás Póczos, Alex Smola

Fast Stochastic Variance Reduced Gradient Method with Momentum Acceleration for Machine Learning

Recently, research on accelerated stochastic gradient descent methods (e.g., SVRG) has made exciting progress (e.g., linear convergence for strongly convex problems). However, the best-known methods (e.g., Katyusha) require at least two…

Machine Learning · Computer Science 2017-04-18 Fanhua Shang, Yuanyuan Liu, James Cheng, Jiacheng Zhuo

Gradient correlation is a key ingredient to accelerate SGD with momentum

Empirically, it has been observed that adding momentum to Stochastic Gradient Descent (SGD) accelerates the convergence of the algorithm. However, the literature has been rather pessimistic, even in the case of convex functions, about the…

Optimization and Control · Mathematics 2025-01-27 Julien Hermant, Marien Renaud, Jean-François Aujol, Charles Dossal, Aude Rondepierre

Stochastic Variance Reduction for Nonconvex Optimization

We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient (SVRG) methods for them. SVRG and related methods have recently surged into prominence for convex optimization given their edge over stochastic gradient…

Optimization and Control · Mathematics 2016-04-06 Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabas Poczos, Alex Smola

VFOG: Variance-Reduced Fast Optimistic Gradient Methods for a Class of Nonmonotone Generalized Equations

We develop a novel optimistic gradient-type algorithmic framework, combining both Nesterov's acceleration and variance-reduction techniques, to solve a class of generalized equations involving possibly nonmonotone operators in data-driven…

Optimization and Control · Mathematics 2025-08-26 Quoc Tran-Dinh, Nghia Nguyen-Trung

ASVRG: Accelerated Proximal SVRG

This paper proposes an accelerated proximal stochastic variance reduced gradient (ASVRG) method, in which we design a simple and effective momentum acceleration trick. Unlike most existing accelerated stochastic variance reduction methods…

Machine Learning · Computer Science 2018-11-20 Fanhua Shang, Licheng Jiao, Kaiwen Zhou, James Cheng, Yan Ren, Yufei Jin

Nesterov Accelerated Shuffling Gradient Method for Convex Optimization

In this paper, we propose Nesterov Accelerated Shuffling Gradient (NASG), a new algorithm for the convex finite-sum minimization problems. Our method integrates the traditional Nesterov's acceleration momentum with different shuffling…

Optimization and Control · Mathematics 2022-06-14 Trang H. Tran, Katya Scheinberg, Lam M. Nguyen

Enhancing Stochastic Gradient Descent: A Unified Framework and Novel Acceleration Methods for Faster Convergence

Based on SGD, previous works have proposed many algorithms that have improved convergence speed and generalization in stochastic optimization, such as SGDm, AdaGrad, Adam, etc. However, their convergence analysis under non-convex conditions…

Machine Learning · Computer Science 2024-02-05 Yichuan Deng, Zhao Song, Chiwun Yang

An Analysis of Stochastic Variance Reduced Gradient for Linear Inverse Problems

Stochastic variance reduced gradient (SVRG) is a popular variance reduction technique for accelerating stochastic gradient descent (SGD). We provide a first analysis of the method for solving a class of linear inverse problems in the lens…

Numerical Analysis · Mathematics 2022-01-19 Bangti Jin, Zehui Zhou, Jun Zou

Estimate Sequences for Variance-Reduced Stochastic Composite Optimization

In this paper, we propose a unified view of gradient-based algorithms for stochastic convex composite optimization by extending the concept of estimate sequence introduced by Nesterov. This point of view covers the stochastic gradient…

Machine Learning · Statistics 2019-05-08 Andrei Kulunchakov, Julien Mairal

Stop Wasting My Gradients: Practical SVRG

We present and analyze several strategies for improving the performance of stochastic variance-reduced gradient (SVRG) methods. We first show that the convergence rate of these methods can be preserved under a decreasing sequence of errors…

Machine Learning · Computer Science 2016-08-06 Reza Babanezhad, Mohamed Osama Ahmed, Alim Virani, Mark Schmidt, Jakub Konečný, Scott Sallinen

Estimate Sequences for Stochastic Composite Optimization: Variance Reduction, Acceleration, and Robustness to Noise

In this paper, we propose a unified view of gradient-based algorithms for stochastic convex composite optimization by extending the concept of estimate sequence introduced by Nesterov. More precisely, we interpret a large class of…

Machine Learning · Statistics 2020-09-07 Andrei Kulunchakov, Julien Mairal

Variance Reduction Methods Do Not Need to Compute Full Gradients: Improved Efficiency through Shuffling

Stochastic optimization algorithms are widely used for machine learning with large-scale data. However, their convergence often suffers from non-vanishing variance. Variance Reduction (VR) methods, such as SVRG and SARAH, address this issue…

Machine Learning · Computer Science 2026-01-12 Daniil Medyakov, Gleb Molodtsov, Savelii Chezhegov, Alexey Rebrikov, Aleksandr Beznosikov

Does Momentum Help? A Sample Complexity Analysis

Stochastic Heavy Ball (SHB) and Nesterov's Accelerated Stochastic Gradient (ASG) are popular momentum methods in stochastic optimization. While benefits of such acceleration ideas in deterministic settings are well understood, their…

Machine Learning · Computer Science 2022-07-12 Swetha Ganesh, Rohan Deb, Gugan Thoppe, Amarjit Budhiraja

A Unified Analysis of Stochastic Momentum Methods for Deep Learning

Stochastic momentum methods have been widely adopted in training deep neural networks. However, their theoretical analysis of convergence of the training objective and the generalization error for prediction is still under-explored. This…

Machine Learning · Computer Science 2018-08-31 Yan Yan, Tianbao Yang, Zhe Li, Qihang Lin, Yi Yang

Learning Theory of the SVRG: Generalization and Convergence Analysis

Variance reduction (VR) methods employ stochastic gradients with decreasing variance, and they have been widely applied to solve large-scale optimization problems in machine learning because of their efficiency. Existing theoretical studies…

Machine Learning · Computer Science 2026-05-28 Yunwen Lei, Zimeng Wang, Xiaoming Yuan

Catalyst Acceleration for Gradient-Based Non-Convex Optimization

We introduce a generic scheme to solve nonconvex optimization problems using gradient-based algorithms originally designed for minimizing convex functions. Even though these methods may originally require convexity to operate, the proposed…

Machine Learning · Statistics 2019-01-03 Courtney Paquette, Hongzhou Lin, Dmitriy Drusvyatskiy, Julien Mairal, Zaid Harchaoui

Stochastic Nested Variance Reduction for Nonconvex Optimization

We study finite-sum nonconvex optimization problems, where the objective function is an average of $n$ nonconvex functions. We propose a new stochastic gradient descent algorithm based on nested variance reduction. Compared with…

Machine Learning · Computer Science 2020-10-20 Dongruo Zhou, Pan Xu, Quanquan Gu

Convergence Analysis of Accelerated Stochastic Gradient Descent under the Growth Condition

We study the convergence of accelerated stochastic gradient descent for strongly convex objectives under the growth condition, which states that the variance of stochastic gradient is bounded by a multiplicative part that grows with the…

Optimization and Control · Mathematics 2023-11-01 You-Lin Chen, Sen Na, Mladen Kolar