Related papers: Data Dependent Convergence for Distributed Stochas…

On Data Dependence in Distributed Stochastic Optimization

We study a distributed consensus-based stochastic gradient descent (SGD) algorithm and show that the rate of convergence involves the spectral properties of two matrices: the standard spectral gap of a weight matrix from the network…

Optimization and Control · Mathematics 2016-09-02 Avleen S. Bijral, Anand D. Sarwate, Nathan Srebro

Data Sampling Affects the Complexity of Online SGD over Dependent Data

Conventional machine learning applications typically assume that data samples are independently and identically distributed (i.i.d.). However, practical scenarios often involve a data-generating process that produces highly dependent data…

Machine Learning · Computer Science 2022-04-04 Shaocong Ma, Ziyi Chen, Yi Zhou, Kaiyi Ji, Yingbin Liang

Adaptive Sampling Distributed Stochastic Variance Reduced Gradient for Heterogeneous Distributed Datasets

We study distributed optimization algorithms for minimizing the average of heterogeneous functions distributed across several machines with a focus on communication efficiency. In such settings, naively using the classical stochastic…

Machine Learning · Computer Science 2020-11-18 Ilqar Ramazanli, Han Nguyen, Hai Pham, Sashank J. Reddi, Barnabas Poczos

Scaling up Stochastic Gradient Descent for Non-convex Optimisation

Stochastic gradient descent (SGD) is a widely adopted iterative method for optimizing differentiable objective functions. In this paper, we propose and discuss a novel approach to scale up SGD in applications involving non-convex functions…

Machine Learning · Statistics 2022-10-07 Saad Mohamad, Hamad Alamri, Abdelhamid Bouchachia

Gradient Diversity: a Key Ingredient for Scalable Distributed Learning

It has been experimentally observed that distributed implementations of mini-batch stochastic gradient descent (SGD) algorithms exhibit speedup saturation and decaying generalization ability beyond a particular batch-size. In this work, we…

Machine Learning · Computer Science 2018-01-09 Dong Yin, Ashwin Pananjady, Max Lam, Dimitris Papailiopoulos, Kannan Ramchandran, Peter Bartlett

Asynchronous Distributed Semi-Stochastic Gradient Optimization

With the recent proliferation of large-scale learning problems, there has been a lot of interest in distributed machine learning algorithms, particularly those that are based on stochastic gradient descent (SGD) and its variants. However,…

Machine Learning · Computer Science 2015-12-07 Ruiliang Zhang, Shuai Zheng, James T. Kwok

Cooperative SGD with Dynamic Mixing Matrices

One of the most common methods to train machine learning algorithms today is the stochastic gradient descent (SGD). In a distributed setting, SGD-based algorithms have been shown to converge theoretically under specific circumstances. A…

Machine Learning · Computer Science 2025-08-22 Soumya Sarkar, Shweta Jain

Learning from time-dependent streaming data with online stochastic algorithms

This paper addresses stochastic optimization in a streaming setting with time-dependent and biased gradient estimates. We analyze several first-order methods, including Stochastic Gradient Descent (SGD), mini-batch SGD, and time-varying…

Machine Learning · Computer Science 2023-07-20 Antoine Godichon-Baggioni, Nicklas Werge, Olivier Wintenberger

Faster Convergence with Less Communication: Broadcast-Based Subgraph Sampling for Decentralized Learning over Wireless Networks

Consensus-based decentralized stochastic gradient descent (D-SGD) is a widely adopted algorithm for decentralized training of machine learning models across networked agents. A crucial part of D-SGD is the consensus-based model averaging,…

Information Theory · Computer Science 2025-02-12 Daniel Pérez Herrera, Zheng Chen, Erik G. Larsson

Distributed Stochastic Gradient Descent with Staleness: A Stochastic Delay Differential Equation Based Framework

Distributed stochastic gradient descent (SGD) has attracted considerable recent attention due to its potential for scaling computational resources, reducing training time, and helping protect user privacy in machine learning. However, the…

Machine Learning · Computer Science 2025-02-27 Siyuan Yu, Wei Chen, H. Vincent Poor

Stochastic Subgradient Algorithms for Strongly Convex Optimization over Distributed Networks

We study diffusion and consensus based optimization of a sum of unknown convex objective functions over distributed networks. The only access to these functions is through stochastic gradient oracles, each of which is only available at a…

Numerical Analysis · Computer Science 2015-09-01 N. Denizcan Vanli, Muhammed O. Sayin, Suleyman S. Kozat

Stochastic versus Deterministic in Stochastic Gradient Descent

This paper theoretically reanalyzes the convergence of the mini-batch stochastic gradient descent (SGD) for a structured minimization problem involving a finite-sum function with its gradient being stochastically approximated, and an…

Optimization and Control · Mathematics 2026-04-07 Runze Li, Jintao Xu, Wenxun Xing

Sharper Convergence Guarantees for Asynchronous SGD for Distributed and Federated Learning

We study the asynchronous stochastic gradient descent algorithm for distributed training over $n$ workers which have varying computation and communication frequency over time. In this algorithm, workers compute stochastic gradients in…

Machine Learning · Computer Science 2022-06-17 Anastasia Koloskova, Sebastian U. Stich, Martin Jaggi

Distributed Delayed Stochastic Optimization

We analyze the convergence of gradient-based optimization algorithms that base their updates on delayed stochastic gradient information. The main application of our results is to the development of gradient-based distributed optimization…

Optimization and Control · Mathematics 2011-05-02 Alekh Agarwal, John C. Duchi

Data-Dependent Stability of Stochastic Gradient Descent

We establish a data-dependent notion of algorithmic stability for Stochastic Gradient Descent (SGD), and employ it to develop novel generalization bounds. This is in contrast to previous distribution-free algorithmic stability results for…

Machine Learning · Computer Science 2018-02-19 Ilja Kuzborskij, Christoph H. Lampert

Stochastic Gradient Descent with Adaptive Data

Stochastic gradient descent (SGD) is a powerful optimization technique that is particularly useful in online learning scenarios. Its convergence analysis is relatively well understood under the assumption that the data samples are…

Machine Learning · Computer Science 2024-10-03 Ethan Che, Jing Dong, Xin T. Tong

Guided parallelized stochastic gradient descent for delay compensation

Stochastic gradient descent (SGD) algorithm and its variations have been effectively used to optimize neural network models. However, with the rapid growth of big data and deep learning, SGD is no longer the most suitable choice due to its…

Machine Learning · Computer Science 2024-02-13 Anuraganand Sharma

Distributed Stochastic Optimization via Adaptive SGD

Stochastic convex optimization algorithms are the most popular way to train machine learning models on large-scale data. Scaling up the training process of these models is crucial, but the most popular algorithm, Stochastic Gradient Descent…

Machine Learning · Statistics 2018-10-30 Ashok Cutkosky, Robert Busa-Fekete

Online stochastic gradient descent on non-convex losses from high-dimensional inference

Stochastic gradient descent (SGD) is a popular algorithm for optimization problems arising in high-dimensional inference tasks. Here one produces an estimator of an unknown parameter from independent samples of data by iteratively…

Machine Learning · Statistics 2023-06-23 Gerard Ben Arous, Reza Gheissari, Aukosh Jagannath

Fully Distributed and Asynchronized Stochastic Gradient Descent for Networked Systems

This paper considers a general data-fitting problem over a networked system, in which many computing nodes are connected by an undirected graph. This kind of problem can find many real-world applications and has been studied extensively in…

Machine Learning · Computer Science 2017-04-14 Ying Zhang