English
Related papers

Related papers: Gradient Coding in Decentralized Learning for Evad…

200 papers

In this paper, we propose an optimally structured gradient coding scheme to mitigate the straggler problem in distributed learning. Conventional gradient coding methods often assume homogeneous straggler models or rely on excessive data…

Systems and Control · Electrical Eng. & Systems 2025-10-28 Heekang Song , Wan Choi

This paper considers the problem of distributed learning (DL) in the presence of stragglers. For this problem, DL methods based on gradient coding have been widely investigated, which redundantly distribute the training data to the workers…

Machine Learning · Computer Science 2024-03-25 Chengxi Li , Mikael Skoglund

We propose a novel coding theoretic framework for mitigating stragglers in distributed learning. We show how carefully replicating data blocks and coding across gradients can provide tolerance to failures and stragglers for Synchronous…

Machine Learning · Statistics 2017-03-09 Rashish Tandon , Qi Lei , Alexandros G. Dimakis , Nikos Karampatziakis

In distributed computing, slower nodes (stragglers) usually become a bottleneck. Gradient Coding (GC), introduced by Tandon et al., is an efficient technique that uses principles of error-correcting codes to distribute gradient computation…

Machine Learning · Computer Science 2023-06-29 M. Nikhil Krishnan , MohammadReza Ebrahimi , Ashish Khisti

We consider distributed gradient descent in the presence of stragglers. Recent work on \em gradient coding \em and \em approximate gradient coding \em have shown how to add redundancy in distributed gradient descent to guarantee convergence…

Information Theory · Computer Science 2019-05-15 Rawad Bitar , Mary Wootters , Salim El Rouayheb

Gradient coding schemes effectively mitigate full stragglers in distributed learning by introducing identical redundancy in coded local partial derivatives corresponding to all model parameters. However, they are no longer effective for…

Information Theory · Computer Science 2023-04-26 Qi Wang , Ying Cui , Chenglin Li , Junni Zou , Hongkai Xiong

In this paper, we consider a large network containing many regions such that each region is equipped with a worker with some data processing and communication capability. For such a network, some workers may become stragglers due to the…

Systems and Control · Electrical Eng. & Systems 2022-04-14 Elie Atallah , Nazanin Rahnavard , Qiyu Sun

Distributed implementations are crucial in speeding up large scale machine learning applications. Distributed gradient descent (GD) is widely employed to parallelize the learning task by distributing the dataset across multiple workers. A…

Information Theory · Computer Science 2021-03-02 Baturalp Buyukates , Emre Ozfatura , Sennur Ulukus , Deniz Gunduz

In distributed optimization problems, a technique called gradient coding, which involves replicating data points, has been used to mitigate the effect of straggling machines. Recent work has studied approximate gradient coding, which…

Machine Learning · Statistics 2021-08-09 Margalit Glasgow , Mary Wootters

In distributed synchronous gradient descent (GD) the main performance bottleneck for the per-iteration completion time is the slowest \textit{straggling} workers. To speed up GD iterations in the presence of stragglers, coded distributed…

Information Theory · Computer Science 2020-11-04 Baturalp Buyukates , Emre Ozfatura , Sennur Ulukus , Deniz Gunduz

Recent developments and emerging use cases, such as smart Internet of Things (IoT) and Edge AI, have sparked considerable interest in the training of neural networks over fully decentralized (serverless) networks. One of the major…

Machine Learning · Computer Science 2025-01-30 Eunjeong Jeong , Marios Kountouris

We consider a distributed learning problem in which the computation is carried out on a system consisting of a master node and multiple worker nodes. In such systems, the existence of slow-running machines called stragglers will cause a…

Information Theory · Computer Science 2019-01-16 Shunsuke Horii , Takahiro Yoshida , Manabu Kobayashi , Toshiyasu Matsushima

Gradient coding is a technique for straggler mitigation in distributed learning. In this paper we design novel gradient codes using tools from classical coding theory, namely, cyclic MDS codes, which compare favorably with existing…

Information Theory · Computer Science 2019-07-09 Netanel Raviv , Itzhak Tamo , Rashish Tandon , Alexandros G. Dimakis

Gradient descent algorithms are widely used in machine learning. In order to deal with huge volume of data, we consider the implementation of gradient descent algorithms in a distributed computing setting where multiple workers compute the…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-01-29 Haozhao Wang , Song Guo , Bin Tang , Ruixuan Li , Chengjie Li

We consider distributed learning in the presence of slow and unresponsive worker nodes, referred to as stragglers. In order to mitigate the effect of stragglers, gradient coding redundantly assigns partial computations to the worker such…

Information Theory · Computer Science 2022-12-19 Luis Maßny , Christoph Hofmeister , Maximilian Egger , Rawad Bitar , Antonia Wachter-Zeh

Gradient descent and its many variants, including mini-batch stochastic gradient descent, form the algorithmic foundation of modern large-scale machine learning. Due to the size and scale of modern data, gradient computations are often…

Machine Learning · Statistics 2018-05-29 Zachary Charles , Dimitris Papailiopoulos

Edge computing has recently emerged as a promising paradigm to boost the performance of distributed learning by leveraging the distributed resources at edge nodes. Architecturally, the introduction of edge nodes adds an additional…

Networking and Internet Architecture · Computer Science 2024-06-18 Weiheng Tang , Jingyi Li , Lin Chen , Xu Chen

Gradient descent (GD) methods are commonly employed in machine learning problems to optimize the parameters of the model in an iterative fashion. For problems with massive datasets, computations are distributed to many parallel computing…

Information Theory · Computer Science 2019-03-06 Emre Ozfatura , Deniz Gunduz , Sennur Ulukus

Distributed implementations of gradient-based methods, wherein a server distributes gradient computations across worker machines, need to overcome two limitations: delays caused by slow running machines called 'stragglers', and…

Information Theory · Computer Science 2020-05-15 Swanand Kadhe , O. Ozan Koyluoglu , Kannan Ramchandran

This paper focuses on mitigating the impact of stragglers in distributed learning system. Unlike the existing results designed for a fixed number of stragglers, we developed a new scheme called Adaptive Gradient Coding(AGC) with flexible…

Information Theory · Computer Science 2021-10-20 Hankun Cao , Qifa Yan , Xiaohu Tang , Guojun Han
‹ Prev 1 2 3 10 Next ›