Related papers: Communication-Efficient Approximate Gradient Codin…

Leveraging partial stragglers within gradient coding

Within distributed learning, workers typically compute gradients on their assigned dataset chunks and send them to the parameter server (PS), which aggregates them to compute either an exact or approximate version of $\nabla L$ (gradient of…

Information Theory · Computer Science 2024-11-19 Aditya Ramamoorthy, Ruoyu Meng, Vrinda S. Girimaji

Approximate Gradient Coding with Optimal Decoding

In distributed optimization problems, a technique called gradient coding, which involves replicating data points, has been used to mitigate the effect of straggling machines. Recent work has studied approximate gradient coding, which…

Machine Learning · Statistics 2021-08-09 Margalit Glasgow, Mary Wootters

Communication-Efficient Approximate Gradient Coding for Distributed Learning in Heterogeneous Systems

We propose a communication-efficient optimally structured gradient coding scheme to jointly address straggler resilience and communication efficiency in heterogeneous distributed learning. By establishing a unified framework that…

Systems and Control · Electrical Eng. & Systems 2026-05-18 Heekang Song, Wan Choi

Communication-Efficient Gradient Coding for Straggler Mitigation in Distributed Learning

Distributed implementations of gradient-based methods, wherein a server distributes gradient computations across worker machines, need to overcome two limitations: delays caused by slow running machines called 'stragglers', and…

Information Theory · Computer Science 2020-05-15 Swanand Kadhe, O. Ozan Koyluoglu, Kannan Ramchandran

Gradient Coding with Clustering and Multi-message Communication

Gradient descent (GD) methods are commonly employed in machine learning problems to optimize the parameters of the model in an iterative fashion. For problems with massive datasets, computations are distributed to many parallel computing…

Information Theory · Computer Science 2019-03-06 Emre Ozfatura, Deniz Gunduz, Sennur Ulukus

Optimization-based Block Coordinate Gradient Coding for Mitigating Partial Stragglers in Distributed Learning

Gradient coding schemes effectively mitigate full stragglers in distributed learning by introducing identical redundancy in coded local partial derivatives corresponding to all model parameters. However, they are no longer effective for…

Information Theory · Computer Science 2023-04-26 Qi Wang, Ying Cui, Chenglin Li, Junni Zou, Hongkai Xiong

Gradient Coding from Cyclic MDS Codes and Expander Graphs

Gradient coding is a technique for straggler mitigation in distributed learning. In this paper we design novel gradient codes using tools from classical coding theory, namely, cyclic MDS codes, which compare favorably with existing…

Information Theory · Computer Science 2019-07-09 Netanel Raviv, Itzhak Tamo, Rashish Tandon, Alexandros G. Dimakis

Stochastic Gradient Coding for Straggler Mitigation in Distributed Learning

We consider distributed gradient descent in the presence of stragglers. Recent work on gradient coding and approximate gradient coding has shown how to add redundancy in distributed gradient descent to guarantee convergence…

Information Theory · Computer Science 2019-05-15 Rawad Bitar, Mary Wootters, Salim El Rouayheb

Gradient Coding Based on Block Designs for Mitigating Adversarial Stragglers

Distributed implementations of gradient-based methods, wherein a server distributes gradient computations across worker machines, suffer from slow running machines, called 'stragglers'. Gradient coding is a coding-theoretic framework to…

Information Theory · Computer Science 2019-05-01 Swanand Kadhe, O. Ozan Koyluoglu, Kannan Ramchandran

LAGC: Lazily Aggregated Gradient Coding for Straggler-Tolerant and Communication-Efficient Distributed Learning

Gradient-based distributed learning in Parameter Server (PS) computing architectures is subject to random delays due to straggling worker nodes, as well as to possible communication bottlenecks between PS and workers. Solutions have been…

Information Theory · Computer Science 2020-04-09 Jingjing Zhang, Osvaldo Simeone

Approximate Gradient Coding via Sparse Random Graphs

Distributed algorithms are often beset by the straggler effect, where the slowest compute nodes in the system dictate the overall running time. Coding-theoretic techniques have been recently proposed to mitigate stragglers via algorithmic…

Machine Learning · Statistics 2017-11-21 Zachary Charles, Dimitris Papailiopoulos, Jordan Ellenberg

Distributed Learning based on 1-Bit Gradient Coding in the Presence of Stragglers

This paper considers the problem of distributed learning (DL) in the presence of stragglers. For this problem, DL methods based on gradient coding have been widely investigated, which redundantly distribute the training data to the workers…

Machine Learning · Computer Science 2024-03-25 Chengxi Li, Mikael Skoglund

Communication-Computation Efficient Gradient Coding

This paper develops coding techniques to reduce the running time of distributed learning tasks. It characterizes the fundamental tradeoff to compute gradients (and more generally vector summations) in terms of three parameters: computation…

Machine Learning · Statistics 2018-02-13 Min Ye, Emmanuel Abbe

Fundamental Limits of Approximate Gradient Coding

It has been established that when the gradient coding problem is distributed among $n$ servers, the computation load (number of stored data partitions) of each worker is at least $s+1$ in order to resist $s$ stragglers. This scheme incurs…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-01-25 Sinong Wang, Jiashang Liu, Ness Shroff

Nested Gradient Codes for Straggler Mitigation in Distributed Machine Learning

We consider distributed learning in the presence of slow and unresponsive worker nodes, referred to as stragglers. In order to mitigate the effect of stragglers, gradient coding redundantly assigns partial computations to the worker such…

Information Theory · Computer Science 2022-12-19 Luis Maßny, Christoph Hofmeister, Maximilian Egger, Rawad Bitar, Antonia Wachter-Zeh

Optimal Communication-Computation Trade-Off in Heterogeneous Gradient Coding

Gradient coding allows a master node to derive the aggregate of the partial gradients, calculated by some worker nodes over the local data sets, with minimum communication cost, and in the presence of stragglers. In this paper, for gradient…

Information Theory · Computer Science 2021-03-03 Tayyebeh Jahani-Nezhad, Mohammad Ali Maddah-Ali

Approximate Gradient Coding for Distributed Learning with Heterogeneous Stragglers

In this paper, we propose an optimally structured gradient coding scheme to mitigate the straggler problem in distributed learning. Conventional gradient coding methods often assume homogeneous straggler models or rely on excessive data…

Systems and Control · Electrical Eng. & Systems 2025-10-28 Heekang Song, Wan Choi

Optimization-based Block Coordinate Gradient Coding

Existing gradient coding schemes introduce identical redundancy across the coordinates of gradients and hence cannot fully utilize the computation results from partial stragglers. This motivates the introduction of diverse redundancies…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-09-21 Qi Wang, Ying Cui, Chenglin Li, Junni Zou, Hongkai Xiong

Sequential Gradient Coding For Straggler Mitigation

In distributed computing, slower nodes (stragglers) usually become a bottleneck. Gradient Coding (GC), introduced by Tandon et al., is an efficient technique that uses principles of error-correcting codes to distribute gradient computation…

Machine Learning · Computer Science 2023-06-29 M. Nikhil Krishnan, MohammadReza Ebrahimi, Ashish Khisti

Communication-Efficient Distributed Learning via Sparse and Adaptive Stochastic Gradient

Gradient-based optimization methods implemented on distributed computing architectures are increasingly used to tackle large-scale machine learning applications. A key bottleneck in such distributed systems is the high communication…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-06-11 Xiaoge Deng, Dongsheng Li, Tao Sun, Xicheng Lu