Related papers: NeRCC: Nested-Regression Coded Computing for Resil…

Polynomially Coded Regression: Optimal Straggler Mitigation via Data Encoding

We consider the problem of training a least-squares regression model on a large dataset using gradient descent. The computation is carried out on a distributed system consisting of a master node and multiple worker nodes. Such distributed…

Information Theory · Computer Science 2018-05-28 Songze Li, Seyed Mohammadreza Mousavi Kalan, Qian Yu, Mahdi Soltanolkotabi, A. Salman Avestimehr

Berrut Approximated Coded Computing: Straggler Resistance Beyond Polynomial Computing

One of the major challenges in using distributed learning to train complicated models with large data sets is to deal with the straggler effect. As a solution, coded computation has been recently proposed to efficiently add redundancy to the…

Information Theory · Computer Science 2021-11-02 Tayyebeh Jahani-Nezhad, Mohammad Ali Maddah-Ali

General Coded Computing in a Probabilistic Straggler Regime

Coded computing has demonstrated promising results in addressing straggler resiliency in distributed computing systems. However, most coded computing schemes are designed for exact computation, requiring the number of responding servers to…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-26 Parsa Moradi, Mohammad Ali Maddah-Ali

Nested Gradient Codes for Straggler Mitigation in Distributed Machine Learning

We consider distributed learning in the presence of slow and unresponsive worker nodes, referred to as stragglers. In order to mitigate the effect of stragglers, gradient coding redundantly assigns partial computations to the worker such…

Information Theory · Computer Science 2022-12-19 Luis Maßny, Christoph Hofmeister, Maximilian Egger, Rawad Bitar, Antonia Wachter-Zeh

Flexible Coded Distributed Convolution Computing for Enhanced Straggler Resilience and Numerical Stability in Distributed CNNs

Deploying Convolutional Neural Networks (CNNs) on resource-constrained devices necessitates efficient management of computational resources, often via distributed environments susceptible to latency from straggler nodes. This paper…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-09 Shuo Tan, Rui Liu, Xuesong Han, XianLei Long, Kai Wan, Linqi Song, Yong Li

Hierarchical Coded Computation

Coded computation is a method to mitigate "stragglers" in distributed computing systems through the use of error correction coding that has lately received significant attention. First used in vector-matrix multiplication, the range of…

Information Theory · Computer Science 2018-06-28 Nuwan Ferdinand, Stark Draper

Coded Computing for Resilient Distributed Computing: A Learning-Theoretic Framework

Coded computing has emerged as a promising framework for tackling significant challenges in large-scale distributed computing, including the presence of slow, faulty, or compromised servers. In this approach, each worker node processes a…

Machine Learning · Computer Science 2026-03-26 Parsa Moradi, Behrooz Tahmasebi, Mohammad Ali Maddah-Ali

Rank-N-Contrast: Learning Continuous Representations for Regression

Deep regression models typically learn in an end-to-end fashion without explicitly emphasizing a regression-aware representation. Consequently, the learned representations exhibit fragmentation and fail to capture the continuous nature of…

Machine Learning · Computer Science 2023-10-11 Kaiwen Zha, Peng Cao, Jeany Son, Yuzhe Yang, Dina Katabi

Workload Distribution with Rateless Encoding: A Low-Latency Computation Offloading Method within Edge Networks

This paper introduces REDC, a comprehensive strategy for offloading computational tasks within mobile Edge Networks (EN) to Distributed Computing (DC) after Rateless Encoding (RE). Despite the efficiency, reliability, and scalability…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-09-24 Zhongfu Guo, Xinsheng Ji, Wei You, Yu Zhao, Bai Yi, Lingwei Wang

Evaluating Rank-N-Contrast: Continuous and Robust Representations for Regression

This document is an evaluation of the original "Rank-N-Contrast" (arXiv:2210.01189v2) paper published in 2023. This evaluation is done for academic purposes. Deep regression models often fail to capture the continuous nature of sample…

Machine Learning · Computer Science 2025-06-24 Valentin Six, Alexandre Chidiac, Arkin Worlikar

Hierarchical Coded Matrix Multiplication

In distributed computing systems slow working nodes, known as stragglers, can greatly extend finishing times. Coded computing is a technique that enables straggler-resistant computation. Most coded computing techniques presented to date…

Information Theory · Computer Science 2021-02-02 Shahrzad Kiani, Nuwan Ferdinand, Stark C. Draper

NURD: Negative-Unlabeled Learning for Online Datacenter Straggler Prediction

Datacenters execute large computational jobs, which are composed of smaller tasks. A job completes when all its tasks finish, so stragglers -- rare, yet extremely slow tasks -- are a major impediment to datacenter performance. Accurately…

Machine Learning · Computer Science 2022-08-16 Yi Ding, Avinash Rao, Hyebin Song, Rebecca Willett, Henry Hoffmann

Stochastic Gradient Coding for Straggler Mitigation in Distributed Learning

We consider distributed gradient descent in the presence of stragglers. Recent work on gradient coding and approximate gradient coding have shown how to add redundancy in distributed gradient descent to guarantee convergence…

Information Theory · Computer Science 2019-05-15 Rawad Bitar, Mary Wootters, Salim El Rouayheb

Sequential Gradient Coding For Straggler Mitigation

In distributed computing, slower nodes (stragglers) usually become a bottleneck. Gradient Coding (GC), introduced by Tandon et al., is an efficient technique that uses principles of error-correcting codes to distribute gradient computation…

Machine Learning · Computer Science 2023-06-29 M. Nikhil Krishnan, MohammadReza Ebrahimi, Ashish Khisti

Efficient and Robust Distributed Matrix Computations via Convolutional Coding

Distributed matrix computations -- matrix-matrix or matrix-vector multiplications -- are well-recognized to suffer from the problem of stragglers (slow or failed worker nodes). Much of prior work in this area is (i) either sub-optimal in…

Information Theory · Computer Science 2020-06-03 Anindya B. Das, Aditya Ramamoorthy, Namrata Vaswani

Barycentric Coded Distributed Computing with Flexible Recovery Threshold for Collaborative Mobile Edge Computing

Collaborative mobile edge computing (MEC) has emerged as a promising paradigm to enable low-capability edge nodes to cooperatively execute computation-intensive tasks. However, straggling edge nodes (stragglers) significantly degrade the…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-12 Houming Qiu, Kun Zhu, Dusit Niyato, Nguyen Cong Luong, Changyan Yi, Chen Dai

Near-Optimal Straggler Mitigation for Distributed Gradient Methods

Modern learning algorithms use gradient descent updates to train inferential models that best explain data. Scaling these approaches to massive data sizes requires proper distributed gradient descent schemes where distributed worker nodes…

Information Theory · Computer Science 2017-10-30 Songze Li, Seyed Mohammadreza Mousavi Kalan, A. Salman Avestimehr, Mahdi Soltanolkotabi

Approximate Gradient Coding with Optimal Decoding

In distributed optimization problems, a technique called gradient coding, which involves replicating data points, has been used to mitigate the effect of straggling machines. Recent work has studied approximate gradient coding, which…

Machine Learning · Statistics 2021-08-09 Margalit Glasgow, Mary Wootters

Redundancy Techniques for Straggler Mitigation in Distributed Optimization and Learning

Performance of distributed optimization and learning systems is bottlenecked by "straggler" nodes and slow communication links, which significantly delay computation. We propose a distributed optimization framework where the dataset is…

Machine Learning · Statistics 2018-03-15 Can Karakus, Yifan Sun, Suhas Diggavi, Wotao Yin

Straggler-resistant distributed matrix computation via coding theory

The current BigData era routinely requires the processing of large scale data on massive distributed computing clusters. Such large scale clusters often suffer from the problem of "stragglers", which are defined as slow or failed nodes. The…

Information Theory · Computer Science 2020-02-11 Aditya Ramamoorthy, Anindya Bijoy Das, Li Tang