Related papers: Coded Distributed Computing with Partial Recovery

Coded computation techniques provide robustness against straggling servers in distributed computing, with the following limitations: First, they increase decoding complexity. Second, they ignore computations carried out by straggling…

Machine Learning · Computer Science 2018-11-29 Emre Ozfatura, Sennur Ulukus, Deniz Gunduz

We consider the problem of training a least-squares regression model on a large dataset using gradient descent. The computation is carried out on a distributed system consisting of a master node and multiple worker nodes. Such distributed…

Information Theory · Computer Science 2018-05-28 Songze Li, Seyed Mohammadreza Mousavi Kalan, Qian Yu, Mahdi Soltanolkotabi, A. Salman Avestimehr

Coded computation can be used to speed up distributed learning in the presence of straggling workers. Partial recovery of the gradient vector can further reduce the computation time at each iteration; however, this can result in biased…

Information Theory · Computer Science 2020-06-03 Emre Ozfatura, Baturalp Buyukates, Deniz Gunduz, Sennur Ulukus

In a large-scale and distributed matrix multiplication problem $C=A^{\intercal}B$, where $C\in\mathbb{R}^{r\times t}$, the coded computation plays an important role to effectively deal with "stragglers" (distributed computations that may…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-04-19 Sinong Wang, Jiashang Liu, Ness Shroff

Distributed computing enables large-scale computation tasks to be processed over multiple workers in parallel. However, the randomness of communication and computation delays across workers causes the straggler effect, which may degrade the…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-07-20 Yuxuan Sun, Fan Zhang, Junlin Zhao, Sheng Zhou, Zhisheng Niu, Deniz Gündüz

Coded computation is a method to mitigate "stragglers" in distributed computing systems through the use of error correction coding that has lately received significant attention. First used in vector-matrix multiplication, the range of…

Information Theory · Computer Science 2018-06-28 Nuwan Ferdinand, Stark Draper

Coded distributed computing framework enables large-scale machine learning (ML) models to be trained efficiently in a distributed manner, while mitigating the straggler effect. In this work, we consider a multi-task assignment problem in a…

Information Theory · Computer Science 2019-05-21 Yuxuan Sun, Junlin Zhao, Sheng Zhou, Deniz Gündüz

The current BigData era routinely requires the processing of large scale data on massive distributed computing clusters. Such large scale clusters often suffer from the problem of "stragglers", which are defined as slow or failed nodes. The…

Information Theory · Computer Science 2020-02-11 Aditya Ramamoorthy, Anindya Bijoy Das, Li Tang

Distributed computing systems are well-known to suffer from the problem of slow or failed nodes; these are referred to as stragglers. Straggler mitigation (for distributed matrix computations) has recently been investigated from the…

Information Theory · Computer Science 2024-12-20 Anindya Bijoy Das, Aditya Ramamoorthy

In distributed computing systems, it is well recognized that worker nodes that are slow (called stragglers) tend to dominate the overall job execution time. Coded computation utilizes concepts from erasure coding to mitigate the effect of…

Information Theory · Computer Science 2018-09-18 Anindya B. Das, Li Tang, Aditya Ramamoorthy

Coded computing is an effective technique to mitigate "stragglers" in large-scale and distributed matrix multiplication. In particular, univariate polynomial codes have been shown to be effective in straggler mitigation by making the…

Information Theory · Computer Science 2021-08-19 Burak Hasircioglu, Jesus Gomez-Vilardebo, Deniz Gunduz

Distributed matrix computations over large clusters can suffer from the problem of slow or failed worker nodes (called stragglers) which can dominate the overall job execution time. Coded computation utilizes concepts from erasure coding to…

Information Theory · Computer Science 2021-09-27 Anindya Bijoy Das, Aditya Ramamoorthy

Building on the previous work of Lee et al. and Ferdinand et al. on coded computation, we propose a sequential approximation framework for solving optimization problems in a distributed manner. In a distributed computation system, latency…

Information Theory · Computer Science 2017-10-26 Jingge Zhu, Ye Pu, Vipul Gupta, Claire Tomlin, Kannan Ramchandran

Distributed matrix multiplication is widely used in several scientific domains. It is well recognized that computation times on distributed clusters are often dominated by the slowest workers (called stragglers). Recent work has…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-11-08 Li Tang, Konstantinos Konstantinidis, Aditya Ramamoorthy

Matrix multiplication is a fundamental building block for large scale computations arising in various applications, including machine learning. There has been significant recent interest in using coding to speed up distributed matrix…

Information Theory · Computer Science 2019-05-17 Wei-Ting Chang, Ravi Tandon

In large scale distributed linear transform problems, coded computation plays an important role to effectively deal with "stragglers" (distributed computations that may get delayed due to few slow or faulty processors). We propose a coded…

Information Theory · Computer Science 2018-04-27 Sinong Wang, Jiashang Liu, Ness Shroff, Pengyu Yang

Coded computing is a method for mitigating straggling workers in a centralized computing network, by using erasure-coding techniques. Federated learning is a decentralized model for training data distributed across client devices. In this…

Information Theory · Computer Science 2023-09-06 Neophytos Charalambides, Mert Pilanci, Alfred Hero

In cloud computing systems slow processing nodes, often referred to as "stragglers", can significantly extend the computation time. Recent results have shown that error correction coding can be used to reduce the effect of stragglers. In…

Information Theory · Computer Science 2018-06-28 Shahrzad Kiani, Nuwan Ferdinand, Stark C. Draper

We consider the problem of massive matrix multiplication, which underlies many data analytic applications, in a large-scale distributed system comprising a group of worker nodes. We target the stragglers' delay performance bottleneck, which…

Information Theory · Computer Science 2020-04-10 Qian Yu, Mohammad Ali Maddah-Ali, A. Salman Avestimehr

Coded distributed computing has been considered as a promising technique which makes large-scale systems robust to the "straggler" workers. Yet, practical system models for distributed computing have not been available that reflect the…

Information Theory · Computer Science 2019-01-17 Muah Kim, Jy-yong Sohn, Jaekyun Moon