Related papers: Asynchronous Distributed Optimization with Stochas…

Asynchronous Distributed Semi-Stochastic Gradient Optimization

With the recent proliferation of large-scale learning problems, there has been a lot of interest in distributed machine learning algorithms, particularly those that are based on stochastic gradient descent (SGD) and its variants. However,…

Machine Learning · Computer Science 2015-12-07 Ruiliang Zhang, Shuai Zheng, James T. Kwok

Distributed stochastic optimization with large delays

One of the most widely used methods for solving large-scale stochastic optimization problems is distributed asynchronous stochastic gradient descent (DASGD), a family of algorithms that result from parallelizing stochastic gradient descent…

Optimization and Control · Mathematics 2021-07-08 Zhengyuan Zhou, Panayotis Mertikopoulos, Nicholas Bambos, Peter W. Glynn, Yinyu Ye

Asynchronous Stochastic Optimization Robust to Arbitrary Delays

We consider stochastic optimization with delayed gradients where, at each time step $t$, the algorithm makes an update using a stale stochastic gradient from step $t - d_t$ for some arbitrary delay $d_t$. This setting abstracts asynchronous…

Optimization and Control · Mathematics 2021-11-16 Alon Cohen, Amit Daniely, Yoel Drori, Tomer Koren, Mariano Schain

Dual-Delayed Asynchronous SGD for Arbitrarily Heterogeneous Data

We consider the distributed learning problem with data dispersed across multiple workers under the orchestration of a central server. Asynchronous Stochastic Gradient Descent (SGD) has been widely explored in such a setting to reduce the…

Machine Learning · Computer Science 2024-05-28 Xiaolu Wang, Yuchang Sun, Hoi-To Wai, Jun Zhang

On the Convergence Analysis of Asynchronous SGD for Solving Consistent Linear Systems

In the realm of big data and machine learning, data-parallel, distributed stochastic algorithms have drawn significant attention in the present days. While the synchronous versions of these algorithms are well understood in terms of their…

Optimization and Control · Mathematics 2020-04-07 Atal Narayan Sahu, Aritra Dutta, Aashutosh Tiwari, Peter Richtárik

Asynchronous and Parallel Distributed Pose Graph Optimization

We present Asynchronous Stochastic Parallel Pose Graph Optimization (ASAPP), the first asynchronous algorithm for distributed pose graph optimization (PGO) in multi-robot simultaneous localization and mapping. By enabling robots to optimize…

Optimization and Control · Mathematics 2023-07-03 Yulun Tian, Alec Koppel, Amrit Singh Bedi, Jonathan P. How

Asynchronous Decentralized Parallel Stochastic Gradient Descent

Most commonly used distributed machine learning systems are either synchronous or centralized asynchronous. Synchronous algorithms like AllReduce-SGD perform poorly in a heterogeneous environment, while asynchronous algorithms using a…

Optimization and Control · Mathematics 2018-09-26 Xiangru Lian, Wei Zhang, Ce Zhang, Ji Liu

Delay-agnostic Asynchronous Distributed Optimization

Existing asynchronous distributed optimization algorithms often use diminishing step-sizes that cause slow practical convergence, or fixed step-sizes that depend on an assumed upper bound of delays. Not only is such a delay bound hard to…

Optimization and Control · Mathematics 2023-08-24 Xuyang Wu, Changxin Liu, Sindri Magnusson, Mikael Johansson

Asynchronous and Stochastic Distributed Resource Allocation

This work proposes and studies the distributed resource allocation problem in asynchronous and stochastic settings. We consider a distributed system with multiple workers and a coordinating server with heterogeneous computation and…

Optimization and Control · Mathematics 2025-09-03 Qiang Li, Michal Yemini, Hoi-To Wai

Tight Time Complexities in Parallel Stochastic Optimization with Arbitrary Computation Dynamics

In distributed stochastic optimization, where parallel and asynchronous methods are employed, we establish optimal time complexities under virtually any computation behavior of workers/devices/CPUs/GPUs, capturing potential disconnections…

Optimization and Control · Mathematics 2025-02-07 Alexander Tyurin

On the Optimal Time Complexities in Decentralized Stochastic Asynchronous Optimization

We consider the decentralized stochastic asynchronous optimization setup, where many workers asynchronously calculate stochastic gradients and asynchronously communicate with each other using edges in a multigraph. For both homogeneous and…

Optimization and Control · Mathematics 2024-11-05 Alexander Tyurin, Peter Richtárik

Convergence Analysis of Decentralized ASGD

Over the last decades, Stochastic Gradient Descent (SGD) has been intensively studied by the Machine Learning community. Despite its versatility and excellent performance, the optimization of large models via SGD still is a time-consuming…

Machine Learning · Computer Science 2025-12-01 Mauro DL Tosi, Martin Theobald

Rescaled Asynchronous SGD: Optimal Distributed Optimization under Data and System Heterogeneity

Asynchronous stochastic gradient descent (ASGD) is a standard way to exploit heterogeneous compute resources in distributed learning: instead of forcing fast workers to wait for slow ones, the server updates the model whenever a gradient…

Machine Learning · Computer Science 2026-05-14 Ammar Mahran, Artavazd Maranjyan, Peter Richtárik

On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants

We study optimization algorithms based on variance reduction for stochastic gradient descent (SGD). Remarkable recent progress has been made in this direction through development of algorithms like SAG, SVRG, SAGA. These algorithms have…

Machine Learning · Computer Science 2016-01-26 Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabás Póczos, Alex Smola

Advances in Asynchronous Parallel and Distributed Optimization

Motivated by large-scale optimization problems arising in the context of machine learning, there have been several advances in the study of asynchronous parallel and distributed optimization methods during the past decade. Asynchronous…

Machine Learning · Computer Science 2020-06-25 Mahmoud Assran, Arda Aytekin, Hamid Feyzmahdavian, Mikael Johansson, Michael Rabbat

Asynchronous Distributed Optimization with Delay-free Parameters

Existing asynchronous distributed optimization algorithms often use diminishing step-sizes that cause slow practical convergence, or use fixed step-sizes that depend on and decrease with an upper bound of the delays. Not only are such delay…

Optimization and Control · Mathematics 2024-11-08 Xuyang Wu, Changxin Liu, Sindri Magnusson, Mikael Johansson

On Unbounded Delays in Asynchronous Parallel Fixed-Point Algorithms

The need for scalable numerical solutions has motivated the development of asynchronous parallel algorithms, where a set of nodes run in parallel with little or no synchronization, thus computing with delayed information. This paper studies…

Optimization and Control · Mathematics 2017-08-18 Robert Hannah, Wotao Yin

Distributed Delayed Stochastic Optimization

We analyze the convergence of gradient-based optimization algorithms that base their updates on delayed stochastic gradient information. The main application of our results is to the development of gradient-based distributed optimization…

Optimization and Control · Mathematics 2011-05-02 Alekh Agarwal, John C. Duchi

Ringleader ASGD: The First Asynchronous SGD with Optimal Time Complexity under Data Heterogeneity

Asynchronous stochastic gradient methods are central to scalable distributed optimization, particularly when devices differ in computational capabilities. Such settings arise naturally in federated learning, where training takes place on…

Optimization and Control · Mathematics 2026-02-20 Artavazd Maranjyan, Peter Richtárik

AsGrad: A Sharp Unified Analysis of Asynchronous-SGD Algorithms

We analyze asynchronous-type algorithms for distributed SGD in the heterogeneous setting, where each worker has its own computation and communication speeds, as well as data distribution. In these algorithms, workers compute possibly stale…

Machine Learning · Computer Science 2023-11-01 Rustem Islamov, Mher Safaryan, Dan Alistarh