Related papers: Reducing Runtime by Recycling Samples

200 papers

We present and analyze several strategies for improving the performance of stochastic variance-reduced gradient (SVRG) methods. We first show that the convergence rate of these methods can be preserved under a decreasing sequence of errors…

Machine Learning · Computer Science 2016-08-06 Reza Babanezhad, Mohamed Osama Ahmed, Alim Virani, Mark Schmidt, Jakub Konečný, Scott Sallinen

Variance-reduced stochastic gradient methods have gained popularity in recent times. Several variants exist with different strategies for the storing and sampling of gradients and this work concerns the interactions between these two…

Optimization and Control · Mathematics 2022-10-19 Martin Morin, Pontus Giselsson

Despite the strong theoretical guarantees that variance-reduced finite-sum optimization algorithms enjoy, their applicability remains limited to cases where the memory overhead they introduce (SAG/SAGA), or the periodic full gradient…

Optimization and Control · Mathematics 2021-03-24 Ayoub El Hanchi, David A. Stephens

We study optimization algorithms based on variance reduction for stochastic gradient descent (SGD). Remarkable recent progress has been made in this direction through development of algorithms like SAG, SVRG, SAGA. These algorithms have…

Machine Learning · Computer Science 2016-01-26 Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabás Póczos, Alex Smola

Several useful variance-reduced stochastic gradient algorithms, such as SVRG, SAGA, Finito, and SAG, have been proposed to minimize empirical risks with linear convergence properties to the exact minimizer. The existing convergence results…

Machine Learning · Computer Science 2018-02-19 Bicheng Ying, Kun Yuan, Ali H. Sayed

Stochastic Gradient Descent (SGD) is a workhorse in machine learning, yet its slow convergence can be a computational bottleneck. Variance reduction techniques such as SAG, SVRG and SAGA have been proposed to overcome this weakness,…

Machine Learning · Computer Science 2016-02-29 Thomas Hofmann, Aurelien Lucchi, Simon Lacoste-Julien, Brian McWilliams

In this paper we study a family of variance reduction methods with randomized batch size: at each step, the algorithm first randomly chooses the batch size and then selects a batch of samples to conduct a variance-reduced stochastic…

Machine Learning · Computer Science 2018-08-08 Xuanqing Liu, Cho-Jui Hsieh

Uniform sampling of training data has been commonly used in traditional stochastic optimization algorithms such as Proximal Stochastic Gradient Descent (prox-SGD) and Proximal Stochastic Dual Coordinate Ascent (prox-SDCA). Although uniform…

Machine Learning · Statistics 2015-01-05 Peilin Zhao, Tong Zhang

Stochastic optimization algorithms are widely used for machine learning with large-scale data. However, their convergence often suffers from non-vanishing variance. Variance Reduction (VR) methods, such as SVRG and SARAH, address this issue…

Machine Learning · Computer Science 2026-01-12 Daniil Medyakov, Gleb Molodtsov, Savelii Chezhegov, Alexey Rebrikov, Aleksandr Beznosikov

SGD (Stochastic Gradient Descent) is a popular algorithm for large-scale optimization problems due to its low iterative cost. However, SGD cannot achieve a linear convergence rate as FGD (Full Gradient Descent) does because of the inherent…

Machine Learning · Computer Science 2017-12-05 Aixiang Chen, Bingchuan Chen, Xiaolong Chai, Rui Bian, Hengguang Li

Stochastic variance reduced gradient (SVRG) is a popular variance reduction technique for accelerating stochastic gradient descent (SGD). We provide a first analysis of the method for solving a class of linear inverse problems in the lens…

Numerical Analysis · Mathematics 2022-01-19 Bangti Jin, Zehui Zhou, Jun Zou

Many problems require optimizing empirical risk functions over large data sets. Gradient descent methods that calculate the full gradient in every descent step do not scale to such datasets. Various flavours of Stochastic Gradient Descent…

Machine Learning · Computer Science 2020-11-26 Francesco Cosentino, Harald Oberhauser, Alessandro Abate

Stochastic gradient methods for machine learning and optimization problems are usually analyzed assuming data points are sampled with replacement. In practice, however, sampling without replacement is very common, easier to…

Machine Learning · Computer Science 2016-10-18 Ohad Shamir

Stochastic gradient descent is the method of choice for large-scale machine learning problems, by virtue of its light complexity per iteration. However, it lags behind its non-stochastic counterparts with respect to the convergence rate,…

Machine Learning · Statistics 2016-03-23 Vatsal Shah, Megasthenis Asteris, Anastasios Kyrillidis, Sujay Sanghavi

For many machine learning problems, data is abundant and it may be prohibitive to make multiple passes through the full training set. In this context, we investigate strategies for dynamically increasing the effective sample size, when…

Machine Learning · Computer Science 2016-10-10 Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann

Recent stochastic gradient methods that have appeared in the literature base their efficiency and global convergence properties on a suitable control of the variance of the gradient batch estimate. This control is typically achieved by…

Optimization and Control · Mathematics 2025-06-11 Marco Boresta, Alberto De Santis, Stefano Lucidi

The stochastic gradient descent (SGD) method is a widely used approach for solving stochastic optimization problems, but its convergence is typically slow. Existing variance reduction techniques, such as SAGA, improve convergence by…

Optimization and Control · Mathematics 2025-11-21 Fabio Nobile, Matteo Raviola, Nathan Schaeffer

Variance reduced stochastic gradient (SGD) methods converge significantly faster than the vanilla SGD counterpart. However, these methods are not very practical on large scale problems, as they either i) require frequent passes over the…

Optimization and Control · Mathematics 2018-10-17 Anant Raj, Sebastian U. Stich

We apply stochastic average gradient (SAG) algorithms for training conditional random fields (CRFs). We describe a practical implementation that uses structure in the CRF gradient to reduce the memory requirement of this linearly-convergent…

Machine Learning · Statistics 2015-04-20 Mark Schmidt, Reza Babanezhad, Mohamed Osama Ahmed, Aaron Defazio, Ann Clifton, Anoop Sarkar

With the purpose of examining biased updates in variance-reduced stochastic gradient methods, we introduce SVAG, a SAG/SAGA-like method with adjustable bias. SVAG is analyzed in a cocoercive root-finding setting, a setting which yields the…

Optimization and Control · Mathematics 2022-10-19 Martin Morin, Pontus Giselsson