Related papers: Incentive Mechanism Design for Distributed Coded M…

Heterogeneous Coded Computation across Heterogeneous Workers

Coded distributed computing framework enables large-scale machine learning (ML) models to be trained efficiently in a distributed manner, while mitigating the straggler effect. In this work, we consider a multi-task assignment problem in a…

Information Theory · Computer Science 2019-05-21 Yuxuan Sun , Junlin Zhao , Sheng Zhou , Deniz Gündüz

Stream Iterative Distributed Coded Computing for Learning Applications in Heterogeneous Systems

To improve the utility of learning applications and render machine learning solutions feasible for complex applications, a substantial amount of heavy computations is needed. Thus, it is essential to delegate the computations among several…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-04-29 Homa Esfahanizadeh , Alejandro Cohen , Muriel Medard

Coded Computation across Shared Heterogeneous Workers with Communication Delay

Distributed computing enables large-scale computation tasks to be processed over multiple workers in parallel. However, the randomness of communication and computation delays across workers causes the straggler effect, which may degrade the…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-07-20 Yuxuan Sun , Fan Zhang , Junlin Zhao , Sheng Zhou , Zhisheng Niu , Deniz Gündüz

Incentive Mechanism Design for Distributed Ensemble Learning

Distributed ensemble learning (DEL) involves training multiple models at distributed learners, and then combining their predictions to improve performance. Existing related studies focus on DEL algorithm design and optimization but ignore…

Computer Science and Game Theory · Computer Science 2023-10-16 Chao Huang , Pengchao Han , Jianwei Huang

Hierarchical Coded Matrix Multiplication

Slow working nodes, known as stragglers, can greatly reduce the speed of distributed computation. Coded matrix multiplication is a recently introduced technique that enables straggler-resistant distributed multiplication of large matrices.…

Information Theory · Computer Science 2019-07-23 Shahrzad Kiani , Nuwan Ferdinand , Stark C. Draper

Optimal Load Allocation for Coded Distributed Computation in Heterogeneous Clusters

Recently, coding has been a useful technique to mitigate the effect of stragglers in distributed computing. However, coding in this context has been mainly explored under the assumption of homogeneous workers, although the real-world…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-02-18 DaeJin Kim , Hyegyeong Park , Junkyun Choi

Combating Computational Heterogeneity in Large-Scale Distributed Computing via Work Exchange

Owing to data-intensive large-scale applications, distributed computation systems have gained significant recent interest, due to their ability of running such tasks over a large number of commodity nodes in a time efficient manner. One of…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-11-23 Mohamed A. Attia , Ravi Tandon

Design and Optimization of Hierarchical Gradient Coding for Distributed Learning at Edge Devices

Edge computing has recently emerged as a promising paradigm to boost the performance of distributed learning by leveraging the distributed resources at edge nodes. Architecturally, the introduction of edge nodes adds an additional…

Networking and Internet Architecture · Computer Science 2024-06-18 Weiheng Tang , Jingyi Li , Lin Chen , Xu Chen

Coded Distributed Computing with Partial Recovery

Coded computation techniques provide robustness against straggling workers in distributed computing. However, most of the existing schemes require exact provisioning of the straggling behaviour and ignore the computations carried out by…

Information Theory · Computer Science 2021-12-07 Emre Ozfatura , Sennur Ulukus , Deniz Gunduz

Stream Distributed Coded Computing

The emerging large-scale and data-hungry algorithms require the computations to be delegated from a central server to several worker nodes. One major challenge in the distributed computations is to tackle delays and failures caused by the…

Information Theory · Computer Science 2021-03-03 Alejandro Cohen , Guillaume Thiran , Homa Esfahanizadeh , Muriel Médard

Motivating Workers in Federated Learning: A Stackelberg Game Perspective

Due to the large size of the training data, distributed learning approaches such as federated learning have gained attention recently. However, the convergence rate of distributed learning suffers from heterogeneous worker performance. In…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-08-09 Yunus Sarikaya , Ozgur Ercetin

Distributed Optimization using Heterogeneous Compute Systems

Hardware compute power has been growing at an unprecedented rate in recent years. The utilization of such advancements plays a key role in producing better results in less time -- both in academia and industry. However, merging the existing…

Machine Learning · Computer Science 2021-10-19 Vineeth S

Hierarchical Coded Matrix Multiplication

In distributed computing systems slow working nodes, known as stragglers, can greatly extend finishing times. Coded computing is a technique that enables straggler-resistant computation. Most coded computing techniques presented to date…

Information Theory · Computer Science 2021-02-02 Shahrzad Kiani , Nuwan Ferdinand , Stark C. Draper

Distributed Linearly Separable Computation

This paper formulates a distributed computation problem, where a master asks $N$ distributed workers to compute a linearly separable function. The task function can be expressed as $K_c$ linear combinations of $K$ messages, where each…

Information Theory · Computer Science 2021-10-26 Kai Wan , Hua Sun , Mingyue Ji , Giuseppe Caire

Optimal Prizes for All-Pay Contests in Heterogeneous Crowdsourcing

Incentives are key to the success of crowdsourcing which heavily depends on the level of user participation. This paper designs an incentive mechanism to motivate a heterogeneous crowd of users to actively participate in crowdsourcing…

Multiagent Systems · Computer Science 2018-12-13 Tie Luo , Salil S. Kanhere , Sajal K. Das , Hwee-Pink Tan

Heterogeneity-aware Gradient Coding for Straggler Tolerance

Gradient descent algorithms are widely used in machine learning. In order to deal with huge volume of data, we consider the implementation of gradient descent algorithms in a distributed computing setting where multiple workers compute the…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-01-29 Haozhao Wang , Song Guo , Bin Tang , Ruixuan Li , Chengjie Li

Hierarchical Coded Computation

Coded computation is a method to mitigate "stragglers" in distributed computing systems through the use of error correction coding that has lately received significant attention. First used in vector-matrix multiplication, the range of…

Information Theory · Computer Science 2018-06-28 Nuwan Ferdinand , Stark Draper

Fast and Straggler-Tolerant Distributed SGD with Reduced Computation Load

In distributed machine learning, a central node outsources computationally expensive calculations to external worker nodes. The properties of optimization procedures like stochastic gradient descent (SGD) can be leveraged to mitigate the…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-04-19 Maximilian Egger , Serge Kas Hanna , Rawad Bitar

Straggler Mitigation in Distributed Matrix Multiplication: Fundamental Limits and Optimal Coding

We consider the problem of massive matrix multiplication, which underlies many data analytic applications, in a large-scale distributed system comprising a group of worker nodes. We target the stragglers' delay performance bottleneck, which…

Information Theory · Computer Science 2020-04-10 Qian Yu , Mohammad Ali Maddah-Ali , A. Salman Avestimehr

Coded Computation over Heterogeneous Clusters

In large-scale distributed computing clusters, such as Amazon EC2, there are several types of "system noise" that can result in major degradation of performance: bottlenecks due to limited communication bandwidth, latency due to straggler…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-06-21 Amirhossein Reisizadeh , Saurav Prakash , Ramtin Pedarsani , Amir Salman Avestimehr