Related papers: Coded Computing for Master-Aided Distributed Compu…
Today's data centers have an abundance of computing resources, hosting server clusters consisting of as many as tens or hundreds of thousands of machines. To execute a complex computing task over a data center, it is natural to distribute…
Coded distributed computing (CDC), proposed by Li \emph{et al.}, offers significant potential for reducing the communication load in MapReduce computing systems. In cascaded CDC with $K$ nodes, $N$ input files, and $Q$ output functions,…
Coded distributed computing (CDC) introduced by Li et. al. is an effective technique to trade computation load for communication load in a MapReduce framework. CDC achieves an optimal trade-off by duplicating map computations at $r$…
Coded distributed computing (CDC) is a new technique proposed with the purpose of decreasing the intense data exchange required for parallelizing distributed computing systems. Under the famous MapReduce paradigm, this coded approach has…
Distributed computing frameworks such as MapReduce and Spark are often used to process large-scale data computing jobs. In wireless scenarios, exchanging data among distributed nodes would seriously suffer from the communication bottleneck…
This work explores a distributed computing setting where $K$ nodes are assigned fractions (subtasks) of a computational task in order to perform the computation in parallel. In this setting, a well-known main bottleneck has been the…
A central issue of distributed computing systems is how to optimally allocate computing and storage resources and design data shuffling strategies such that the total execution time for computing and data shuffling is minimized. This is…
Coding theoretic approached have been developed to significantly reduce the communication load in modern distributed computing system. In particular, coded distributed computing (CDC) introduced by Li et al. can efficiently trade…
Coded Distributed Computing (CDC) introduced by Li et al. in 2015 offers an efficient approach to trade computing power to reduce the communication load in general distributed computing frameworks such as MapReduce and Spark. In particular,…
Coded distributed computing introduced by Li et al. in 2015 is an efficient approach to trade computing power to reduce the communication load in general distributed computing frameworks such as MapReduce. In particular, Li et al. show that…
Coded distributed computing (CDC) introduced by Li et al. in 2015 offers an efficient approach to trade computing power to reduce the communication load in general distributed computing frameworks such as MapReduce. For the more general…
MapReduce is a commonly used framework for executing data-intensive jobs on distributed server clusters. We introduce a variant implementation of MapReduce, namely "Coded MapReduce", to substantially reduce the inter-server communication…
Distributed learning platforms for processing large scale data-sets are becoming increasingly prevalent. In typical distributed implementations, a centralized master node breaks the data-set into smaller batches for parallel processing…
Consider a distributed computing system in which the worker nodes are connected over a shared wireless channel. Nodes can store a fraction of the data set over which computation needs to be carried out, and a Map-Shuffle-Reduce protocol is…
This paper studies the computation-communication tradeoff in a heterogeneous MapReduce computing system where each distributed node is equipped with different computation capability. We first obtain an achievable communication load for any…
Large scale clusters leveraging distributed computing frameworks such as MapReduce routinely process data that are on the orders of petabytes or more. The sheer size of the data precludes the processing of the data on a single computer. The…
Communication overhead is one of the major performance bottlenecks in large-scale distributed computing systems, in particular for machine learning applications. Conventionally, compression techniques are used to reduce the load of…
We consider a coded distributed computing problem in a ring-based communication network, where $N$ computing nodes are arranged in a ring topology and each node can only communicate with its neighbors within a constant distance $d$. To…
We consider a distributed computing framework where the distributed nodes have different communication capabilities, motivated by the heterogeneous networks in data centers and mobile edge computing systems. Following the structure of…
Data shuffling between distributed cluster of nodes is one of the critical steps in implementing large-scale learning algorithms. Randomly shuffling the data-set among a cluster of workers allows different nodes to obtain fresh data…