Related papers: Coded Distributed Computing: Performance Limits an…
We consider the problem of coded distributed computing where a large linear computational job, such as a matrix multiplication, is divided into $k$ smaller tasks, encoded using an $(n,k)$ linear code, and performed over $n$ distributed…
Modern computationally-heavy applications are often time-sensitive, demanding distributed strategies to accelerate them. On the other hand, distributed computing suffers from the bottleneck of slow workers in practice. Distributed coded…
Coded computation is a framework which provides redundancy in distributed computing systems to speed up largescale tasks. Although most existing works assume an error-free scenarios in a master-worker setup, the link failures are common in…
Coded computing has emerged as a promising framework for tackling significant challenges in large-scale distributed computing, including the presence of slow, faulty, or compromised servers. In this approach, each worker node processes a…
To improve the utility of learning applications and render machine learning solutions feasible for complex applications, a substantial amount of heavy computations is needed. Thus, it is essential to delegate the computations among several…
We consider the problem of coded distributed computing using polar codes. The average execution time of a coded computing system is related to the error probability for transmission over the binary erasure channel in recent work by…
Distributed computation is a framework used to break down a complex computational task into smaller tasks and distributing them among computational nodes. Erasure correction codes have recently been introduced and have become a popular…
Coded computing is a distributed paradigm that uses coding theory to introduce \textit{redundancy} and overcome bottlenecks in large-scale systems. In the same vein, randomized numerical linear algebra employs probabilistic methods to…
This paper formulates a distributed computation problem, where a master asks $N$ distributed workers to compute a linearly separable function. The task function can be expressed as $K_c$ linear combinations of $K$ messages, where each…
We present a novel distributed computing framework that is robust to slow compute nodes, and is capable of both approximate and exact computation of linear operations. The proposed mechanism integrates the concepts of randomized sketching…
The emerging large-scale and data-hungry algorithms require the computations to be delegated from a central server to several worker nodes. One major challenge in the distributed computations is to tackle delays and failures caused by the…
Coded distributed computing introduced by Li et al. in 2015 is an efficient approach to trade computing power to reduce the communication load in general distributed computing frameworks such as MapReduce. In particular, Li et al. show that…
This work aims to jointly optimize the coding and node selection to minimize the processing time for distributed computing tasks over wireless edge networks. Since the joint optimization problem formulation is NP-hard and nonlinear, we…
Coded distributed computing has been considered as a promising technique which makes large-scale systems robust to the "straggler" workers. Yet, practical system models for distributed computing have not been available that reflect the…
Distributed computing enables large-scale computation tasks to be processed over multiple workers in parallel. However, the randomness of communication and computation delays across workers causes the straggler effect, which may degrade the…
Large scale clusters leveraging distributed computing frameworks such as MapReduce routinely process data that are on the orders of petabytes or more. The sheer size of the data precludes the processing of the data on a single computer. The…
We introduce an error resilient distributed computing method based on an extension of the channel polarization phenomenon to distributed algorithms. The method leverages an algorithmic split operation that transforms two identical compute…
Linear computation coding is concerned with the compression of multidimensional linear functions, i.e. with reducing the computational effort of multiplying an arbitrary vector to an arbitrary, but known, constant matrix. This paper…
Coded distributed computing framework enables large-scale machine learning (ML) models to be trained efficiently in a distributed manner, while mitigating the straggler effect. In this work, we consider a multi-task assignment problem in a…
Today's data centers have an abundance of computing resources, hosting server clusters consisting of as many as tens or hundreds of thousands of machines. To execute a complex computing task over a data center, it is natural to distribute…