Related papers: CAMR: Coded Aggregated MapReduce

200 papers

This paper studies the computation-communication tradeoff in a heterogeneous MapReduce computing system where each distributed node is equipped with different computation capability. We first obtain an achievable communication load for any…

Information Theory · Computer Science 2019-08-20 Fan Xu, Meixia Tao

Large scale clusters leveraging distributed computing frameworks such as MapReduce routinely process data that are on the orders of petabytes or more. The sheer size of the data precludes the processing of the data on a single computer. The…

Information Theory · Computer Science 2018-02-12 Konstantinos Konstantinidis, Aditya Ramamoorthy

MapReduce is a commonly used framework for executing data-intensive jobs on distributed server clusters. We introduce a variant implementation of MapReduce, namely "Coded MapReduce", to substantially reduce the inter-server communication…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-12-08 Songze Li, Mohammad Ali Maddah-Ali, A. Salman Avestimehr

Distributed computing frameworks such as MapReduce are often used to process large computational jobs. They operate by partitioning each job into smaller tasks executed on different servers. The servers also need to exchange intermediate…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-04-20 Konstantinos Konstantinidis, Aditya Ramamoorthy

Coded distributed computing introduced by Li et al. in 2015 is an efficient approach to trade computing power to reduce the communication load in general distributed computing frameworks such as MapReduce. In particular, Li et al. show that…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-02-13 Nicholas Woolsey, Rong-Rong Chen, Mingyue Ji

MapReduce is a widely used framework for distributed computing. Data shuffling between the Map phase and Reduce phase of a job involves a large amount of data transfer across servers, which in turn accounts for increase in job completion…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-09-06 Sneh Gupta, V. Lalitha

Load balance is important for MapReduce to reduce job duration, increase parallel efficiency, etc. Previous work focuses on coarse-grained scheduling. This study concerns fine-grained scheduling on MapReduce operations. Each operation…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-04-15 Liya Fan, Bo Gao, Xi Sun, Fa Zhang, Zhiyong Liu

We consider a wireless distributed computing system based on the MapReduce framework, which consists of three phases: Map, Shuffle, and Reduce. The system consists of a set of distributed nodes assigned to compute…

Information Theory · Computer Science 2024-06-25 Elizabath Peter, K. K. Krishnan Namboodiri, B. Sundar Rajan

Today's data centers have an abundance of computing resources, hosting server clusters consisting of as many as tens or hundreds of thousands of machines. To execute a complex computing task over a data center, it is natural to distribute…

Information Theory · Computer Science 2017-02-24 Qian Yu, Songze Li, Mohammad Ali Maddah-Ali, A. Salman Avestimehr

We consider the distributed computing framework of MapReduce, which consists of three phases, the Map phase, the Shuffle phase and the Reduce phase. For this framework, we propose the use of binary matrices (with $0,1$ entries) called…

Information Theory · Computer Science 2020-02-03 Shailja Agrawal, Prasad Krishnan

Undoubtedly, the MapReduce is the most powerful programming paradigm in distributed computing. The enhancement of the MapReduce is essential and it can lead the computing faster. Therefore, here are many scheduling algorithms to discuss…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-04-11 Rajdeep Das, Rohit Pratap Singh, Ripon Patgiri

Motivated by mobile edge computing and wireless data centers, we study a wireless distributed computing framework where the distributed nodes exchange information over a wireless interference network. Our framework follows the structure of…

Information Theory · Computer Science 2018-10-19 Fan Li, Jinyuan Chen, Zhiying Wang

We consider a distributed computing framework where the distributed nodes have different communication capabilities, motivated by the heterogeneous networks in data centers and mobile edge computing systems. Following the structure of…

Information Theory · Computer Science 2019-08-20 Nishant Shakya, Fan Li, Jinyuan Chen

Coded distributed computing framework enables large-scale machine learning (ML) models to be trained efficiently in a distributed manner, while mitigating the straggler effect. In this work, we consider a multi-task assignment problem in a…

Information Theory · Computer Science 2019-05-21 Yuxuan Sun, Junlin Zhao, Sheng Zhou, Deniz Gündüz

We consider a MapReduce-type task running in a distributed computing model which consists of $K$ edge computing nodes distributed across the edge of the network and a Master node that assists the edge nodes to compute output functions.…

Information Theory · Computer Science 2020-10-22 Haoning Chen, Youlong Wu

Coded distributed computing (CDC) introduced by Li et al. is an effective technique to trade computation load for communication load in a MapReduce framework. CDC achieves an optimal trade-off by duplicating map computations at $r$…

Information Theory · Computer Science 2019-03-04 Nicholas Woolsey, Rong-Rong Chen, Mingyue Ji

In this paper, we present a coded computation (CC) scheme for distributed computation of the inference phase of machine learning (ML) tasks, specifically, the task of image classification. Building upon Agrawal et al. 2022, the proposed…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-07-12 Jiepeng Tang, Navneet Agrawal, Slawomir Stanczak, Jingge Zhu

MapReduce has proven to be one of the most useful paradigms in the revolution of distributed computing, where cloud services and cluster computing become the standard venue for computing. The federation of cloud and big data activities is…

Databases · Computer Science 2016-07-29 Foto Afrati, Shlomi Dolev, Shantanu Sharma, Jeffrey D. Ullman

Since its introduction in 2004, the MapReduce framework has become one of the standard approaches in massive distributed and parallel computation. In contrast to its intensive use in practise, theoretical footing is still limited and only…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-12-19 Gero Greiner, Riko Jacob

Distributed learning platforms for processing large scale data-sets are becoming increasingly prevalent. In typical distributed implementations, a centralized master node breaks the data-set into smaller batches for parallel processing…

Information Theory · Computer Science 2016-10-03 Mohamed Attia, Ravi Tandon