Related papers: Algebraic Geometric Rook Codes for Coded Distribut…
We construct optimal secure coded distributed schemes that extend the known optimal constructions over fields of characteristic 0 to all fields. A serendipitous result is that we can encode \emph{all} functions over finite fields with a…
In this paper, we introduce distributed matrix multiplication (DMM)-friendly algebraic function fields for polynomial codes and Matdot codes, and present several constructions for such function fields through extensions of the rational…
We show that polynomial codes (and some related codes) used for distributed matrix multiplication are interleaved Reed-Solomon codes and, hence, can be collaboratively decoded. We consider a fault-tolerant setup where $t$ worker nodes…
Code-based Distributed Matrix Multiplication (DMM) has been extensively studied in distributed computing for efficiently performing large-scale matrix multiplication using coding theoretic techniques. The communication cost and recovery…
We consider a large-scale matrix multiplication problem where the computation is carried out using a distributed system with a master node and multiple worker nodes, where each worker can store parts of the input matrices. We propose a…
We consider the problem of coded distributed computing where a large linear computational job, such as a matrix multiplication, is divided into $k$ smaller tasks, encoded using an $(n,k)$ linear code, and performed over $n$ distributed…
Tensor operations, such as matrix multiplication, are central to large-scale machine learning applications. For user-driven tasks these operations can be carried out on a distributed computing platform with a master server at the user side…
Fault tolerance is a major concern in distributed computational settings. In the classic master-worker setting, a server (the master) needs to perform some heavy computation which it may distribute to $m$ other machines (workers) in order…
We consider the problem of coded distributed computing where a large linear computational job, such as a matrix multiplication, is divided into $k$ smaller tasks, encoded using an $(n,k)$ linear code, and performed over $n$ distributed…
Matrix multiplication is a fundamental building block for large-scale computations arising in various applications, including machine learning. There has been significant recent interest in using coding to speed up distributed matrix…
We provide novel coded computation strategies for distributed matrix-matrix products that outperform the recent "Polynomial code" constructions in recovery threshold, i.e., the required number of successful workers. When an $m$-th fraction of…
This paper considers the problem of computing the product of two massive matrices $\mathbf{A}$ and $\mathbf{B}$ in a distributed manner. We provide a modulo technique that can be applied to coded distributed matrix multiplication…
We propose two coding schemes for distributed matrix multiplication in the presence of stragglers. These coding schemes are adaptations of LT codes and Raptor codes to distributed matrix multiplication and are termed \emph{factored LT (FLT)…
Coded computation techniques provide robustness against straggling workers in distributed computing. However, most of the existing schemes require exact provisioning of the straggling behaviour and ignore the computations carried out by…
This paper addresses the gradient coding and coded matrix multiplication problems in distributed optimization and coded computing. We present a numerically stable binary coding method which overcomes the drawbacks of the \textit{Fractional…
Supporting multiple partial computations efficiently at each of the workers is a keystone in distributed coded computing in order to speed up computations and to fully exploit the resources of heterogeneous workers in terms of…
Coded computation is an emerging research area that leverages concepts from erasure coding to mitigate the effect of stragglers (slow nodes) in distributed computation clusters, especially for matrix computation problems. In this work, we…
The problem of straggler mitigation in distributed matrix multiplication (DMM) is considered for a large number of worker nodes and a fixed small finite field. Polynomial codes and matdot codes are generalized by making use of algebraic…
Motivated by its practical importance, we consider the coded distributed matrix multiplication problem of computing $AA^\top$ in a distributed computing system with $N$ worker nodes and a master node, where the…
Distributed matrix multiplication is widely used in several scientific domains. It is well recognized that computation times on distributed clusters are often dominated by the slowest workers (called stragglers). Recent work has…