Related papers: Resolvable Designs for Speeding up Distributed Com…

Leveraging Coding Techniques for Speeding up Distributed Computing

Large-scale clusters leveraging distributed computing frameworks such as MapReduce routinely process data on the order of petabytes or more. The sheer size of the data precludes processing it on a single computer. The…

Information Theory · Computer Science 2018-02-12 Konstantinos Konstantinidis, Aditya Ramamoorthy

Heterogeneous Coded Distributed Computing: Joint Design of File Allocation and Function Assignment

This paper studies the computation-communication tradeoff in a heterogeneous MapReduce computing system where each distributed node is equipped with different computation capability. We first obtain an achievable communication load for any…

Information Theory · Computer Science 2019-08-20 Fan Xu, Meixia Tao

Coded MapReduce

MapReduce is a commonly used framework for executing data-intensive jobs on distributed server clusters. We introduce a variant implementation of MapReduce, namely "Coded MapReduce", to substantially reduce the inter-server communication…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-12-08 Songze Li, Mohammad Ali Maddah-Ali, A. Salman Avestimehr

How to Optimally Allocate Resources for Coded Distributed Computing?

Today's data centers have an abundance of computing resources, hosting server clusters consisting of as many as tens or hundreds of thousands of machines. To execute a complex computing task over a data center, it is natural to distribute…

Information Theory · Computer Science 2017-02-24 Qian Yu, Songze Li, Mohammad Ali Maddah-Ali, A. Salman Avestimehr

Stream Distributed Coded Computing

The emerging large-scale and data-hungry algorithms require the computations to be delegated from a central server to several worker nodes. One major challenge in the distributed computations is to tackle delays and failures caused by the…

Information Theory · Computer Science 2021-03-03 Alejandro Cohen, Guillaume Thiran, Homa Esfahanizadeh, Muriel Médard

Stream Iterative Distributed Coded Computing for Learning Applications in Heterogeneous Systems

To improve the utility of learning applications and render machine learning solutions feasible for complex applications, a substantial amount of heavy computations is needed. Thus, it is essential to delegate the computations among several…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-04-29 Homa Esfahanizadeh, Alejandro Cohen, Muriel Médard

A New Combinatorial Design of Coded Distributed Computing

Coded distributed computing introduced by Li et al. in 2015 is an efficient approach to trade computing power to reduce the communication load in general distributed computing frameworks such as MapReduce. In particular, Li et al. show that…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-02-13 Nicholas Woolsey, Rong-Rong Chen, Mingyue Ji

Wireless MapReduce Arrays for Coded Distributed Computing

We consider a wireless distributed computing system based on the MapReduce framework, which consists of three phases: Map, Shuffle, and Reduce. The system consists of a set of distributed nodes assigned to compute…

Information Theory · Computer Science 2024-06-25 Elizabath Peter, K. K. Krishnan Namboodiri, B. Sundar Rajan

Distributed Computing with Heterogeneous Communication Constraints: The Worst-Case Computation Load and Proof by Contradiction

We consider a distributed computing framework where the distributed nodes have different communication capabilities, motivated by the heterogeneous networks in data centers and mobile edge computing systems. Following the structure of…

Information Theory · Computer Science 2019-08-20 Nishant Shakya, Fan Li, Jinyuan Chen

Distributed Computations with Layered Resolution

Modern computationally-heavy applications are often time-sensitive, demanding distributed strategies to accelerate them. On the other hand, distributed computing suffers from the bottleneck of slow workers in practice. Distributed coded…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-08-03 Homa Esfahanizadeh, Alejandro Cohen, Muriel Médard, Shlomo Shamai

Locality-Aware Hybrid Coded MapReduce for Server-Rack Architecture

MapReduce is a widely used framework for distributed computing. Data shuffling between the Map phase and Reduce phase of a job involves a large amount of data transfer across servers, which in turn accounts for increase in job completion…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-09-06 Sneh Gupta, V. Lalitha

CAMR: Coded Aggregated MapReduce

Many big data algorithms executed on MapReduce-like systems have a shuffle phase that often dominates the overall job execution time. Recent work has demonstrated schemes where the communication load in the shuffle phase can be traded off…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-11-02 Konstantinos Konstantinidis, Aditya Ramamoorthy

Speed Scaling On Parallel Servers with MapReduce Type Precedence Constraints

A multiple server setting is considered, where each server has tunable speed, and increasing the speed incurs an energy cost. Jobs arrive to a single queue, and each job has two types of sub-tasks, map and reduce, and a precedence…

Data Structures and Algorithms · Computer Science 2021-05-20 Rahul Vaze, Jayakrishnan Nair

Efficient Task Replication for Fast Response Times in Parallel Computation

One typical use case of large-scale distributed computing in data centers is to decompose a computation job into many independent tasks and run them in parallel on different machines, sometimes known as the "embarrassingly parallel"…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-04-07 Da Wang, Gauri Joshi, Gregory Wornell

Low Complexity Distributed Computing via Binary Matrices with Extension to Stragglers

We consider the distributed computing framework of MapReduce, which consists of three phases, the Map phase, the Shuffle phase and the Reduce phase. For this framework, we propose the use of binary matrices (with $0,1$ entries) called…

Information Theory · Computer Science 2020-02-03 Shailja Agrawal, Prasad Krishnan

Wireless MapReduce Distributed Computing

Motivated by mobile edge computing and wireless data centers, we study a wireless distributed computing framework where the distributed nodes exchange information over a wireless interference network. Our framework follows the structure of…

Information Theory · Computer Science 2018-10-19 Fan Li, Jinyuan Chen, Zhiying Wang

Balanced Nonadaptive Redundancy Scheduling

Distributed computing systems implement redundancy to reduce the job completion time and variability. Despite a large body of work about computing redundancy, the analytical performance evaluation of redundancy techniques in queuing systems…

Information Theory · Computer Science 2022-01-05 Amir Behrouzi-Far, Emina Soljanin

Optimizing MapReduce for Highly Distributed Environments

MapReduce, the popular programming paradigm for large-scale data processing, has traditionally been deployed over tightly-coupled clusters where the data is already locally available. The assumption that the data and compute resources are…

Distributed, Parallel, and Cluster Computing · Computer Science 2012-07-31 Benjamin Heintz, Abhishek Chandra, Ramesh K. Sitaraman

Diversity/Parallelism Trade-off in Distributed Systems with Redundancy

As numerous machine learning and other algorithms increase in complexity and data requirements, distributed computing becomes necessary to satisfy the growing computational and storage demands, because it enables parallel execution of…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-12-21 Pei Peng, Emina Soljanin, Philip Whiting

Scheduling MapReduce Jobs and Data Shuffle on Unrelated Processors

We propose constant approximation algorithms for generalizations of the Flexible Flow Shop (FFS) problem which form a realistic model for non-preemptive scheduling in MapReduce systems. Our results concern the minimization of the total…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-06-25 Dimitrios Fotakis, Ioannis Milis, Emmanouil Zampetakis, Georgios Zois