Related papers: Frame Codes For Distributed Coded Computation

200 papers

Coded computation is a method to mitigate "stragglers" in distributed computing systems through the use of error correction coding that has lately received significant attention. First used in vector-matrix multiplication, the range of…

Information Theory · Computer Science 2018-06-28 Nuwan Ferdinand, Stark Draper

Coded computing has emerged as a promising framework for tackling significant challenges in large-scale distributed computing, including the presence of slow, faulty, or compromised servers. In this approach, each worker node processes a…

Machine Learning · Computer Science 2026-03-26 Parsa Moradi, Behrooz Tahmasebi, Mohammad Ali Maddah-Ali

We consider the problem of coded distributed computing where a large linear computational job, such as a matrix multiplication, is divided into $k$ smaller tasks, encoded using an $(n,k)$ linear code, and performed over $n$ distributed…

Information Theory · Computer Science 2019-06-25 Mohammad Vahid Jamali, Mahdi Soleymani, Hessam Mahdavifar

Coded computation is a framework which provides redundancy in distributed computing systems to speed up large-scale tasks. Although most existing works assume an error-free scenario in a master-worker setup, link failures are common in…

Information Theory · Computer Science 2019-01-14 Dong-Jun Han, Jy-yong Sohn, Jaekyun Moon

Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes. Application scenarios include data centers, peer-to-peer storage systems, and storage in wireless networks. Storing…

Networking and Internet Architecture · Computer Science 2008-03-06 Alexandros G. Dimakis, P. Brighten Godfrey, Yunnan Wu, Martin J. Wainwright, Kannan Ramchandran

Erasure codes are an efficient means of storing data across a network in comparison to data replication, as they tend to reduce the amount of data stored in the network and offer increased resilience in the presence of node failures. The…

Information Theory · Computer Science 2016-11-17 K. V. Rashmi, Nihar B. Shah, P. Vijay Kumar

The emerging large-scale and data-hungry algorithms require the computations to be delegated from a central server to several worker nodes. One major challenge in the distributed computations is to tackle delays and failures caused by the…

Information Theory · Computer Science 2021-03-03 Alejandro Cohen, Guillaume Thiran, Homa Esfahanizadeh, Muriel Médard

Coded computing is a distributed paradigm that uses coding theory to introduce redundancy and overcome bottlenecks in large-scale systems. In the same vein, randomized numerical linear algebra employs probabilistic methods to…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-19 Neophytos Charalambides, Arya Mazumdar

Distributed matrix computations over large clusters can suffer from the problem of slow or failed worker nodes (called stragglers) which can dominate the overall job execution time. Coded computation utilizes concepts from erasure coding to…

Information Theory · Computer Science 2021-09-27 Anindya Bijoy Das, Aditya Ramamoorthy

Codes are widely used in many engineering applications to offer robustness against noise. In large-scale systems there are several types of noise that can affect the performance of distributed machine learning algorithms -- straggler nodes,…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-01-30 Kangwook Lee, Maximilian Lam, Ramtin Pedarsani, Dimitris Papailiopoulos, Kannan Ramchandran

In distributed computing systems, it is well recognized that worker nodes that are slow (called stragglers) tend to dominate the overall job execution time. Coded computation utilizes concepts from erasure coding to mitigate the effect of…

Information Theory · Computer Science 2018-09-18 Anindya B. Das, Li Tang, Aditya Ramamoorthy

We consider the problem of coded distributed computing where a large linear computational job, such as a matrix multiplication, is divided into $k$ smaller tasks, encoded using an $(n,k)$ linear code, and performed over $n$ distributed…

Information Theory · Computer Science 2021-10-06 Mahdi Soleymani, Mohammad Vahid Jamali, Hessam Mahdavifar

We present a novel distributed computing framework that is robust to slow compute nodes, and is capable of both approximate and exact computation of linear operations. The proposed mechanism integrates the concepts of randomized sketching…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-09-06 Burak Bartan, Mert Pilanci

Distributed storage systems often introduce redundancy to increase reliability. When coding is used, the repair problem arises: if a node storing encoded information fails, in order to maintain the same level of reliability we need to…

Information Theory · Computer Science 2010-04-27 Alexandros G. Dimakis, Kannan Ramchandran, Yunnan Wu, Changho Suh

Distributed computing systems are well-known to suffer from the problem of slow or failed nodes; these are referred to as stragglers. Straggler mitigation (for distributed matrix computations) has recently been investigated from the…

Information Theory · Computer Science 2024-12-20 Anindya Bijoy Das, Aditya Ramamoorthy

Matrix computations are a fundamental building-block of edge computing systems, with a major recent uptick in demand due to their use in AI/ML training and inference procedures. Existing approaches for distributing matrix computations…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-08-12 Anindya Bijoy Das, Aditya Ramamoorthy, David J. Love, Christopher G. Brinton

Straggler nodes are well-known bottlenecks of distributed matrix computations which induce reductions in computation/communication speeds. A common strategy for mitigating such stragglers is to incorporate Reed-Solomon based MDS (maximum…

Information Theory · Computer Science 2023-08-24 Anindya Bijoy Das, Aditya Ramamoorthy, David J. Love, Christopher G. Brinton

The current BigData era routinely requires the processing of large scale data on massive distributed computing clusters. Such large scale clusters often suffer from the problem of "stragglers", which are defined as slow or failed nodes. The…

Information Theory · Computer Science 2020-02-11 Aditya Ramamoorthy, Anindya Bijoy Das, Li Tang

Performance of distributed optimization and learning systems is bottlenecked by "straggler" nodes and slow communication links, which significantly delay computation. We propose a distributed optimization framework where the dataset is…

Machine Learning · Statistics 2018-03-15 Can Karakus, Yifan Sun, Suhas Diggavi, Wotao Yin

Distributed computing has become a common approach for large-scale computation of tasks due to benefits such as high reliability, scalability, computation speed, and cost-effectiveness. However, distributed computing faces critical issues…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-08-21 Jer Shyuan Ng, Wei Yang Bryan Lim, Nguyen Cong Luong, Zehui Xiong, Alia Asheralieva, Dusit Niyato, Cyril Leung, Chunyan Miao