Related papers: Block-Diagonal Coding for Distributed Computing Wi…

200 papers

We propose two coded schemes for the distributed computing problem of multiplying a matrix by a set of vectors. The first scheme is based on partitioning the matrix into submatrices and applying maximum distance separable (MDS) codes to…

Information Theory · Computer Science 2018-10-22 Albin Severinson, Alexandre Graell i Amat, Eirik Rosnes

We propose a unified coded framework for distributed computing with straggling servers, by introducing a tradeoff between "latency of computation" and "load of communication" for some linear computation tasks. We show that the coded scheme…

Information Theory · Computer Science 2016-10-26 Songze Li, Mohammad Ali Maddah-Ali, A. Salman Avestimehr

Distributed implementations of gradient-based methods, wherein a server distributes gradient computations across worker machines, need to overcome two limitations: delays caused by slow running machines called 'stragglers', and…

Information Theory · Computer Science 2020-05-15 Swanand Kadhe, O. Ozan Koyluoglu, Kannan Ramchandran

Coded distributed computing has been considered as a promising technique which makes large-scale systems robust to the "straggler" workers. Yet, practical system models for distributed computing have not been available that reflect the…

Information Theory · Computer Science 2019-01-17 Muah Kim, Jy-yong Sohn, Jaekyun Moon

Distributed multi-task learning (DMTL) effectively improves model generalization performance through the collaborative training of multiple related models. However, in large-scale learning scenarios, communication bottlenecks severely limit…

Information Theory · Computer Science 2025-07-25 Minquan Cheng, Yongkang Wang, Lingyu Zhang, Youlong Wu

Inexpensive cloud services, such as serverless computing, are often vulnerable to straggling nodes that increase end-to-end latency for distributed computation. We propose and implement simple yet principled approaches for straggler…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-01-22 Vipul Gupta, Dominic Carrano, Yaoqing Yang, Vaishaal Shankar, Thomas Courtade, Kannan Ramchandran

Coding for distributed computing supports low-latency computation by relieving the burden of straggling workers. While most existing works assume a simple master-worker model, we consider a hierarchical computational structure consisting of…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-01-16 Hyegyeong Park, Kangwook Lee, Jy-yong Sohn, Changho Suh, Jaekyun Moon

Distributed computing enables large-scale computation tasks to be processed over multiple workers in parallel. However, the randomness of communication and computation delays across workers causes the straggler effect, which may degrade the…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-07-20 Yuxuan Sun, Fan Zhang, Junlin Zhao, Sheng Zhou, Zhisheng Niu, Deniz Gündüz

The emerging large-scale and data-hungry algorithms require the computations to be delegated from a central server to several worker nodes. One major challenge in the distributed computations is to tackle delays and failures caused by the…

Information Theory · Computer Science 2021-03-03 Alejandro Cohen, Guillaume Thiran, Homa Esfahanizadeh, Muriel Médard

Large-scale distributed computing systems face two major bottlenecks that limit their scalability: straggler delay caused by the variability of computation times at different worker nodes and communication bottlenecks caused by shuffling…

Information Theory · Computer Science 2017-07-04 Amirhossein Reisizadeh, Ramtin Pedarsani

Building on the previous work of Lee et al. and Ferdinand et al. on coded computation, we propose a sequential approximation framework for solving optimization problems in a distributed manner. In a distributed computation system, latency…

Information Theory · Computer Science 2017-10-26 Jingge Zhu, Ye Pu, Vipul Gupta, Claire Tomlin, Kannan Ramchandran

Distributed computation is a framework used to break down a complex computational task into smaller tasks and distribute them among computational nodes. Erasure correction codes have recently been introduced and have become a popular…

Information Theory · Computer Science 2021-08-17 Royee Yosibash, Ram Zamir

Distributed linearly separable computation, where a user asks some distributed servers to compute a linearly separable function, was recently formulated by the same authors and aims to alleviate the bottlenecks of stragglers and…

Information Theory · Computer Science 2021-02-02 Kai Wan, Hua Sun, Mingyue Ji, Giuseppe Caire

Codes are widely used in many engineering applications to offer robustness against noise. In large-scale systems there are several types of noise that can affect the performance of distributed machine learning algorithms -- straggler nodes,…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-01-30 Kangwook Lee, Maximilian Lam, Ramtin Pedarsani, Dimitris Papailiopoulos, Kannan Ramchandran

Straggler nodes are well-known bottlenecks of distributed matrix computations which induce reductions in computation/communication speeds. A common strategy for mitigating such stragglers is to incorporate Reed-Solomon based MDS (maximum…

Information Theory · Computer Science 2023-08-24 Anindya Bijoy Das, Aditya Ramamoorthy, David J. Love, Christopher G. Brinton

Matrix computations are a fundamental building-block of edge computing systems, with a major recent uptick in demand due to their use in AI/ML training and inference procedures. Existing approaches for distributing matrix computations…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-08-12 Anindya Bijoy Das, Aditya Ramamoorthy, David J. Love, Christopher G. Brinton

In a distributed computing system operating according to the map-shuffle-reduce framework, coding data prior to storage can be useful both to reduce the latency caused by straggling servers and to decrease the inter-server communication…

Information Theory · Computer Science 2018-08-22 Jingjing Zhang, Osvaldo Simeone

Coded matrix multiplication is a technique to enable straggler-resistant multiplication of large matrices in distributed computing systems. In this paper, we first present a conceptual framework to represent the division of work amongst…

Information Theory · Computer Science 2019-07-23 Shahrzad Kiani, Nuwan Ferdinand, Stark C. Draper

Coded distributed computing framework enables large-scale machine learning (ML) models to be trained efficiently in a distributed manner, while mitigating the straggler effect. In this work, we consider a multi-task assignment problem in a…

Information Theory · Computer Science 2019-05-21 Yuxuan Sun, Junlin Zhao, Sheng Zhou, Deniz Gündüz

Coded computing is an effective technique to mitigate "stragglers" in large-scale and distributed matrix multiplication. In particular, univariate polynomial codes have been shown to be effective in straggler mitigation by making the…

Information Theory · Computer Science 2021-08-19 Burak Hasircioglu, Jesus Gomez-Vilardebo, Deniz Gunduz