Related papers: Non-Uniform Stochastic Average Gradient Method for…

Compositional Stochastic Average Gradient for Machine Learning and Related Applications

Many machine learning, statistical inference, and portfolio optimization problems require minimization of a composition of expected value functions (CEVF). Of particular interest are the finite-sum versions of such compositional optimization…

Machine Learning · Computer Science 2018-09-10 Tsung-Yu Hsieh, Yasser EL-Manzalawy, Yiwei Sun, Vasant Honavar

Stochastic Average Gradient: A Simple Empirical Investigation

Despite the recent growth of theoretical studies and empirical successes of neural networks, gradient backpropagation is still the most widely used algorithm for training such networks. On the one hand, we have deterministic or full…

Machine Learning · Computer Science 2023-10-20 Pascal Junior Tikeng Notsawo

A Novel Stochastic Stratified Average Gradient Method: Convergence Rate and Its Complexity

SGD (Stochastic Gradient Descent) is a popular algorithm for large-scale optimization problems due to its low iterative cost. However, SGD cannot achieve a linear convergence rate as FGD (Full Gradient Descent) does because of the inherent…

Machine Learning · Computer Science 2017-12-05 Aixiang Chen, Bingchuan Chen, Xiaolong Chai, Rui Bian, Hengguang Li

Training Conditional Random Fields with Natural Gradient Descent

We propose a novel parameter estimation procedure that works efficiently for conditional random fields (CRF). This algorithm is an extension to the maximum likelihood estimation (MLE), using loss functions defined by Bregman divergences…

Machine Learning · Computer Science 2015-08-11 Yuan Cao

Stochastic First- and Zeroth-order Methods for Nonconvex Stochastic Programming

In this paper, we introduce a new stochastic approximation (SA) type algorithm, namely the randomized stochastic gradient (RSG) method, for solving an important class of nonlinear (possibly nonconvex) stochastic programming (SP) problems.…

Optimization and Control · Mathematics 2015-10-27 Saeed Ghadimi, Guanghui Lan

Adaptive Stochastic Dual Coordinate Ascent for Conditional Random Fields

This work investigates the training of conditional random fields (CRFs) via the stochastic dual coordinate ascent (SDCA) algorithm of Shalev-Shwartz and Zhang (2016). SDCA enjoys a linear convergence rate and a strong empirical performance…

Machine Learning · Statistics 2018-07-11 Rémi Le Priol, Alexandre Piché, Simon Lacoste-Julien

Learning Theory of the SVRG: Generalization and Convergence Analysis

Variance reduction (VR) methods employ stochastic gradients with decreasing variance, and they have been widely applied to solve large-scale optimization problems in machine learning because of their efficiency. Existing theoretical studies…

Machine Learning · Computer Science 2026-05-28 Yunwen Lei, Zimeng Wang, Xiaoming Yuan

Faster One-Sample Stochastic Conditional Gradient Method for Composite Convex Minimization

We propose a stochastic conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms. Existing CGM variants for this template either suffer from slow convergence rates, or…

Machine Learning · Computer Science 2022-04-19 Gideon Dresdner, Maria-Luiza Vladarean, Gunnar Rätsch, Francesco Locatello, Volkan Cevher, Alp Yurtsever

On Biased Stochastic Gradient Estimation

We present a uniform analysis of biased stochastic gradient methods for minimizing convex, strongly convex, and non-convex composite objectives, and identify settings where bias is useful in stochastic gradient estimation. The framework we…

Optimization and Control · Mathematics 2020-02-28 Derek Driggs, Jingwei Liang, Carola-Bibiane Schönlieb

Convergence Analysis of Stochastic Accelerated Gradient Methods for Generalized Smooth Optimizations

We investigate the Randomized Stochastic Accelerated Gradient (RSAG) method, utilizing either constant or adaptive step sizes, for stochastic optimization problems with generalized smooth objective functions. Under relaxed affine variance…

Optimization and Control · Mathematics 2025-02-25 Chenhao Yu, Yusu Hong, Junhong Lin

Minimizing Finite Sums with the Stochastic Average Gradient

We propose the stochastic average gradient (SAG) method for optimizing the sum of a finite number of smooth convex functions. Like stochastic gradient (SG) methods, the SAG method's iteration cost is independent of the number of terms in…

Optimization and Control · Mathematics 2016-05-12 Mark Schmidt, Nicolas Le Roux, Francis Bach

Variance Reduction Methods Do Not Need to Compute Full Gradients: Improved Efficiency through Shuffling

Stochastic optimization algorithms are widely used for machine learning with large-scale data. However, their convergence often suffers from non-vanishing variance. Variance Reduction (VR) methods, such as SVRG and SARAH, address this issue…

Machine Learning · Computer Science 2026-01-12 Daniil Medyakov, Gleb Molodtsov, Savelii Chezhegov, Alexey Rebrikov, Aleksandr Beznosikov

An Adaptive Incremental Gradient Method With Support for Non-Euclidean Norms

Stochastic variance reduced methods have shown strong performance in solving finite-sum problems. However, these methods usually require the users to manually tune the step-size, which is time-consuming or even infeasible for some…

Optimization and Control · Mathematics 2023-10-10 Binghui Xie, Chenhan Jin, Kaiwen Zhou, James Cheng, Wei Meng

Stochastic smoothing accelerated gradient method for general constrained nonsmooth convex composite optimization

We propose a novel stochastic smoothing accelerated gradient (SSAG) method for general constrained nonsmooth convex composite optimization, and analyze the convergence rates. The SSAG method allows various smoothing techniques, and can deal…

Optimization and Control · Mathematics 2026-02-03 Ruyu Wang, Chao Zhang

Without-Replacement Sampling for Stochastic Gradient Methods: Convergence Results and Application to Distributed Optimization

Stochastic gradient methods for machine learning and optimization problems are usually analyzed assuming data points are sampled with replacement. In practice, however, sampling without replacement is very common, easier to…

Machine Learning · Computer Science 2016-10-18 Ohad Shamir

Sampling and Update Frequencies in Proximal Variance-Reduced Stochastic Gradient Methods

Variance-reduced stochastic gradient methods have gained popularity in recent times. Several variants exist with different strategies for the storing and sampling of gradients and this work concerns the interactions between these two…

Optimization and Control · Mathematics 2022-10-19 Martin Morin, Pontus Giselsson

Stochastic Reweighted Gradient Descent

Despite the strong theoretical guarantees that variance-reduced finite-sum optimization algorithms enjoy, their applicability remains limited to cases where the memory overhead they introduce (SAG/SAGA), or the periodic full gradient…

Optimization and Control · Mathematics 2021-03-24 Ayoub El Hanchi, David A. Stephens

CSG: A stochastic gradient method for a wide class of optimization problems appearing in a machine learning or data-driven context

A recent article introduced the continuous stochastic gradient method (CSG) for the efficient solution of a class of stochastic optimization problems. While the applicability of known stochastic gradient type methods is typically limited to…

Optimization and Control · Mathematics 2021-11-16 Lukas Pflug, Max Grieshammer, Andrian Uihlein, Michael Stingl

Efficient Learning of Sparse Conditional Random Fields for Supervised Sequence Labelling

Conditional Random Fields (CRFs) constitute a popular and efficient approach for supervised sequence labelling. CRFs can cope with large description spaces and can integrate some form of structural dependency between labels. In this…

Machine Learning · Computer Science 2015-05-14 Nataliya Sokolovska, Thomas Lavergne, Olivier Cappé, François Yvon

Variance-Reduced Stochastic Learning under Random Reshuffling

Several useful variance-reduced stochastic gradient algorithms, such as SVRG, SAGA, Finito, and SAG, have been proposed to minimize empirical risks with linear convergence properties to the exact minimizer. The existing convergence results…

Machine Learning · Computer Science 2018-02-19 Bicheng Ying, Kun Yuan, Ali H. Sayed