Related papers: Distributed Matrix-Vector Multiplication: A Convol…
Distributed matrix computations -- matrix-matrix or matrix-vector multiplications -- are well-recognized to suffer from the problem of stragglers (slow or failed worker nodes). Much of prior work in this area is (i) either sub-optimal in…
Coded computation is an emerging research area that leverages concepts from erasure coding to mitigate the effect of stragglers (slow nodes) in distributed computation clusters, especially for matrix computation problems. In this work, we…
Straggler nodes are well-known bottlenecks of distributed matrix computations which induce reductions in computation/communication speeds. A common strategy for mitigating such stragglers is to incorporate Reed-Solomon based MDS (maximum…
Distributed matrix multiplication is widely used in several scientific domains. It is well recognized that computation times on distributed clusters are often dominated by the slowest workers (called stragglers). Recent work has…
In distributed computing systems, it is well recognized that worker nodes that are slow (called stragglers) tend to dominate the overall job execution time. Coded computation utilizes concepts from erasure coding to mitigate the effect of…
Distributed matrix computations over large clusters can suffer from the problem of slow or failed worker nodes (called stragglers) which can dominate the overall job execution time. Coded computation utilizes concepts from erasure coding to…
We study the problem of computing matrix chain multiplications in a distributed computing cluster. In such systems, performance is often limited by the straggler problem, where the slowest worker dominates the overall computation latency.…
Building on the previous work of Lee et al. and Ferdinand et al. on coded computation, we propose a sequential approximation framework for solving optimization problems in a distributed manner. In a distributed computation system, latency…
The current BigData era routinely requires the processing of large scale data on massive distributed computing clusters. Such large scale clusters often suffer from the problem of "stragglers", which are defined as slow or failed nodes. The…
Coded computation is a method to mitigate "stragglers" in distributed computing systems through the use of error correction coding that has lately received significant attention. First used in vector-matrix multiplication, the range of…
Coded computation techniques provide robustness against straggling workers in distributed computing. However, most of the existing schemes require exact provisioning of the straggling behaviour and ignore the computations carried out by…
This paper considers the problem of calculating the matrix multiplication of two massive matrices $\mathbf{A}$ and $\mathbf{B}$ distributedly. We provide a modulo technique that can be applied to coded distributed matrix multiplication…
The overall execution time of distributed matrix computations is often dominated by slow worker nodes (stragglers) within the clusters. Recently, different coding techniques have been utilized to mitigate the effect of stragglers where…
We consider the problem of massive matrix multiplication, which underlies many data analytic applications, in a large-scale distributed system comprising a group of worker nodes. We target the stragglers' delay performance bottleneck, which…
Coded computing is an effective technique to mitigate "stragglers" in large-scale and distributed matrix multiplication. In particular, univariate polynomial codes have been shown to be effective in straggler mitigation by making the…
In a large-scale and distributed matrix multiplication problem $C=A^{\intercal}B$, where $C\in\mathbb{R}^{r\times t}$, the coded computation plays an important role to effectively deal with "stragglers" (distributed computations that may…
We consider the problem of computing the convolution of two long vectors using parallel processing units in the presence of "stragglers". Stragglers refer to the small fraction of faulty or slow processors that delays the entire computation…
Slow working nodes, known as stragglers, can greatly reduce the speed of distributed computation. Coded matrix multiplication is a recently introduced technique that enables straggler-resistant distributed multiplication of large matrices.…
The problem of distributed matrix multiplication with straggler tolerance over finite fields is considered, focusing on field sizes for which previous solutions were not applicable (for instance, the field of two elements). We employ…
Today's massively-sized datasets have made it necessary to often perform computations on them in a distributed manner. In principle, a computational task is divided into subtasks which are distributed over a cluster operated by a…