Related papers: Stochastic Gradient Descent for Linear Systems wit…

Stochastic Gradient Descent for Incomplete Tensor Linear Systems

Solving large tensor linear systems poses significant challenges due to the high volume of data stored, and it only becomes more challenging when some of the data is missing. Recently, Ma et al. showed that this problem can be tackled using…

Numerical Analysis · Mathematics 2026-05-28 Anna Ma , Deanna Needell , Alexander Xue

Block-missing data in linear systems: An unbiased stochastic gradient descent approach

Achieving accurate approximations to solutions of large linear systems is crucial, especially when those systems utilize real-world data. A consequence of using real-world data is that there will inevitably be missingness. Current…

Numerical Analysis · Mathematics 2023-10-17 Chelsea Huynh , Anna Ma , Michael Strand

Debiasing Stochastic Gradient Descent to handle missing values

Stochastic gradient algorithm is a key ingredient of many machine learning methods, particularly appropriate for large-scale learning.However, a major caveat of large data is their incompleteness.We propose an averaged stochastic gradient…

Statistics Theory · Mathematics 2020-06-09 Julie Josse , Aude Sportisse , Claire Boyer , Aymeric Dieuleveut

Increasing Missingness to Reduce Bias: Richardson-SGD with Missing Data

Stochastic gradient methods are central to modern large-scale learning, but their use with incomplete covariates remains delicate since imputation schemes generally introduce systematic gradient biases, as shown for linear models. In this…

Machine Learning · Statistics 2026-05-20 Ferdinand Genans , Erwan Scornet

Guided parallelized stochastic gradient descent for delay compensation

Stochastic gradient descent (SGD) algorithm and its variations have been effectively used to optimize neural network models. However, with the rapid growth of big data and deep learning, SGD is no longer the most suitable choice due to its…

Machine Learning · Computer Science 2024-02-13 Anuraganand Sharma

Asynchronous Parallel Stochastic Gradient Descent - A Numeric Core for Scalable Distributed Machine Learning Algorithms

The implementation of a vast majority of machine learning (ML) algorithms boils down to solving a numerical optimization problem. In this context, Stochastic Gradient Descent (SGD) methods have long proven to provide good results, both in…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-10-06 Janis Keuper , Franz-Josef Pfreundt

Stochastic Optimization of Large-Scale Parametrized Dynamical Systems

Many relevant problems in the area of systems and control, such as controller synthesis, observer design and model reduction, can be viewed as optimization problems involving dynamical systems: for instance, maximizing performance in the…

Optimization and Control · Mathematics 2023-11-15 Pascal Den Boef , Jos Maubach , Wil Schilders , Nathan van de Wouw

Fully Distributed and Asynchronized Stochastic Gradient Descent for Networked Systems

This paper considers a general data-fitting problem over a networked system, in which many computing nodes are connected by an undirected graph. This kind of problem can find many real-world applications and has been studied extensively in…

Machine Learning · Computer Science 2017-04-14 Ying Zhang

On the Convergence of A Data-Driven Regularized Stochastic Gradient Descent for Nonlinear Ill-Posed Problems

Stochastic gradient descent (SGD) is a promising method for solving large-scale inverse problems, due to its excellent scalability with respect to data size. In this work, we analyze a new data-driven regularized stochastic gradient descent…

Numerical Analysis · Mathematics 2024-09-30 Zehui Zhou

Sparse Linear Regression With Missing Data

This paper proposes a fast and accurate method for sparse regression in the presence of missing data. The underlying statistical model encapsulates the low-dimensional structure of the incomplete data matrix and the sparsity of the…

Machine Learning · Statistics 2015-03-31 Ravi Ganti , Rebecca M. Willett

Stochastic Doubly Robust Gradient

When training a machine learning model with observational data, it is often encountered that some values are systemically missing. Learning from the incomplete data in which the missingness depends on some covariates may lead to biased…

Machine Learning · Computer Science 2018-12-24 Kanghoon Lee , Jihye Choi , Moonsu Cha , Jung-Kwon Lee , Taeyoon Kim

When Does Stochastic Gradient Algorithm Work Well?

In this paper, we consider a general stochastic optimization problem which is often at the core of supervised learning, such as deep learning and linear classification. We consider a standard stochastic gradient descent (SGD) method with a…

Machine Learning · Statistics 2018-12-27 Lam M. Nguyen , Nam H. Nguyen , Dzung T. Phan , Jayant R. Kalagnanam , Katya Scheinberg

Distributed Stochastic Gradient Descent Using LDGM Codes

We consider a distributed learning problem in which the computation is carried out on a system consisting of a master node and multiple worker nodes. In such systems, the existence of slow-running machines called stragglers will cause a…

Information Theory · Computer Science 2019-01-16 Shunsuke Horii , Takahiro Yoshida , Manabu Kobayashi , Toshiyasu Matsushima

Statistical Learning and Inverse Problems: A Stochastic Gradient Approach

Inverse problems are paramount in Science and Engineering. In this paper, we consider the setup of Statistical Inverse Problem (SIP) and demonstrate how Stochastic Gradient Descent (SGD) algorithms can be used in the linear SIP setting. We…

Machine Learning · Statistics 2022-11-29 Yuri R. Fonseca , Yuri F. Saporito

Stochastic Gradient Descent Revisited

Stochastic gradient descent (SGD) has been a go-to algorithm for nonconvex stochastic optimization problems arising in machine learning. Its theory however often requires a strong framework to guarantee convergence properties. We hereby…

Optimization and Control · Mathematics 2025-03-11 Azar Louzi

Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate

Stochastic Gradient Descent (SGD) is a central tool in machine learning. We prove that SGD converges to zero loss, even with a fixed (non-vanishing) learning rate - in the special case of homogeneous linear classifiers with smooth monotone…

Machine Learning · Statistics 2022-04-19 Mor Shpigel Nacson , Nathan Srebro , Daniel Soudry

Statistical Inference for Model Parameters in Stochastic Gradient Descent

The stochastic gradient descent (SGD) algorithm has been widely used in statistical estimation for large-scale data due to its computational and memory efficiency. While most existing works focus on the convergence of the objective function…

Machine Learning · Statistics 2023-11-02 Xi Chen , Jason D. Lee , Xin T. Tong , Yichen Zhang

An Accelerated Doubly Stochastic Gradient Method with Faster Explicit Model Identification

Sparsity regularized loss minimization problems play an important role in various fields including machine learning, data mining, and modern statistics. Proximal gradient descent method and coordinate descent method are the most popular…

Machine Learning · Computer Science 2023-11-13 Runxue Bao , Bin Gu , Heng Huang

Efficient Distributed SGD with Variance Reduction

Stochastic Gradient Descent (SGD) has become one of the most popular optimization methods for training machine learning models on massive datasets. However, SGD suffers from two main drawbacks: (i) The noisy gradient updates have high…

Machine Learning · Computer Science 2017-04-10 Soham De , Tom Goldstein

Adaptive Sequential Optimization with Applications to Machine Learning

A framework is introduced for solving a sequence of slowly changing optimization problems, including those arising in regression and classification applications, using optimization algorithms such as stochastic gradient descent (SGD). The…

Machine Learning · Computer Science 2015-09-25 Craig Wilson , Venugopal V. Veeravalli