Related papers: Connecting MapReduce Computations to Realistic Mac…

Semi-MapReduce Meets Congested Clique

Graph problems are troublesome when it comes to MapReduce. Typically, to be able to design algorithms that make use of the advantages of MapReduce, assumptions beyond what the model imposes, such as the density of the input graph, are…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-05-15 Soheil Behnezhad , Mahsa Derakhshan , MohammadTaghi Hajiaghayi

Simulating Parallel Algorithms in the MapReduce Framework with Applications to Parallel Computational Geometry

In this paper, we describe efficient MapReduce simulations of parallel algorithms specified in the BSP and PRAM models. We also provide some applications of these simulation results to problems in parallel computational geometry for the…

Data Structures and Algorithms · Computer Science 2015-03-14 Michael T. Goodrich

Sorting, Searching, and Simulation in the MapReduce Framework

In this paper, we study the MapReduce framework from an algorithmic standpoint and demonstrate the usefulness of our approach by designing and analyzing efficient MapReduce algorithms for fundamental sorting, searching, and simulation…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-01-11 Michael T. Goodrich , Nodari Sitchinava , Qin Zhang

Round Compression for Parallel Graph Algorithms in Strongly Sublinear Space

The Massive Parallel Computation (MPC) model is a theoretical framework for popular parallel and distributed platforms such as MapReduce, Hadoop, or Spark. We consider the task of computing a large matching or small vertex cover in this…

Data Structures and Algorithms · Computer Science 2018-07-24 Krzysztof Onak

Space-Round Tradeoffs for MapReduce Computations

This work explores fundamental modeling and algorithmic issues arising in the well-established MapReduce framework. First, we formally specify a computational model for MapReduce which captures the functional flavor of the paradigm by…

Data Structures and Algorithms · Computer Science 2013-06-13 Andrea Pietracaprina , Geppino Pucci , Matteo Riondato , Francesco Silvestri , Eli Upfal

The Efficiency of MapReduce in Parallel External Memory

Since its introduction in 2004, the MapReduce framework has become one of the standard approaches in massive distributed and parallel computation. In contrast to its intensive use in practise, theoretical footing is still limited and only…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-12-19 Gero Greiner , Riko Jacob

BSP vs MapReduce

The MapReduce framework has been generating a lot of interest in a wide range of areas. It has been widely adopted in industry and has been used to solve a number of non-trivial problems in academia. Putting MapReduce on strong theoretical…

Distributed, Parallel, and Cluster Computing · Computer Science 2012-06-19 Matthew Felice Pace

Massively Parallel Algorithms for Small Subgraph Counting

Over the last two decades, frameworks for distributed-memory parallel computation, such as MapReduce, Hadoop, Spark and Dryad, have gained significant popularity with the growing prevalence of large network datasets. The Massively Parallel…

Data Structures and Algorithms · Computer Science 2022-07-19 Amartya Shankha Biswas , Talya Eden , Quanquan C. Liu , Slobodan Mitrović , Ronitt Rubinfeld

Efficient Circuit Simulation in MapReduce

The MapReduce framework has firmly established itself as one of the most widely used parallel computing platforms for processing big data on tera- and peta-byte scale. Approaching it from a theoretical standpoint has proved to be…

Computational Complexity · Computer Science 2019-12-30 Fabian Frei , Koichi Wada

Parallel Graph Algorithms in Constant Adaptive Rounds: Theory meets Practice

We study fundamental graph problems such as graph connectivity, minimum spanning forest (MSF), and approximate maximum (weight) matching in a distributed setting. In particular, we focus on the Adaptive Massively Parallel Computation (AMPC)…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-09-25 Soheil Behnezhad , Laxman Dhulipala , Hossein Esfandiari , Jakub Łącki , Vahab Mirrokni , Warren Schudy

MapReduce Meets Fine-Grained Complexity: MapReduce Algorithms for APSP, Matrix Multiplication, 3-SUM, and Beyond

Distributed processing frameworks, such as MapReduce, Hadoop, and Spark are popular systems for processing large amounts of data. The design of efficient algorithms in these frameworks is a challenging problem, as the systems both require…

Data Structures and Algorithms · Computer Science 2019-05-07 MohammadTaghi Hajiaghayi , Silvio Lattanzi , Saeed Seddighin , Cliff Stein

Parallel Processing of Large Graphs

More and more large data collections are gathered worldwide in various IT systems. Many of them possess the networked nature and need to be processed and analysed as graph structures. Due to their size they require very often usage of…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-06-04 Tomasz Kajdanowicz , Przemyslaw Kazienko , Wojciech Indyk

On the Computational Complexity of MapReduce

In this paper we study MapReduce computations from a complexity-theoretic perspective. First, we formulate a uniform version of the MRC model of Karloff et al. (2010). We then show that the class of regular languages, and moreover all of…

Computational Complexity · Computer Science 2015-10-07 Benjamin Fish , Jeremy Kun , Ádám Dániel Lelkes , Lev Reyzin , György Turán

Analyzing Large-Scale, Distributed and Uncertain Data

The exponential growth of data in current times and the demand to gain information and knowledge from the data present new challenges for database researchers. Known database systems and algorithms are no longer capable of effectively…

Databases · Computer Science 2017-12-06 Yaron Gonen

The Family of MapReduce and Large Scale Data Processing Systems

In the last two decades, the continuous increase of computational power has produced an overwhelming flow of data which has called for a paradigm shift in the computing architecture and large scale data processing mechanisms. MapReduce is a…

Databases · Computer Science 2013-02-14 Sherif Sakr , Anna Liu , Ayman G. Fayoumi

LLMapReduce: Multi-Level Map-Reduce for High Performance Data Analysis

The map-reduce parallel programming model has become extremely popular in the big data community. Many big data workloads can benefit from the enhanced performance offered by supercomputers. LLMapReduce provides the familiar map-reduce…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-12-13 Chansup Byun , Jeremy Kepner , William Arcand , David Bestor , Bill Bergeron , Vijay Gadepally , Matthew Hubbell , Peter Michaleas , Julie Mullen , Andrew Prout , Antonio Rosa , Charles Yee , Albert Reuther

An Alternative C++ based HPC system for Hadoop MapReduce

MapReduce is a technique used to vastly improve distributed processing of data and can massively speed up computation. Hadoop and its MapReduce relies on JVM and Java which is expensive on memory. High Performance Computing based MapReduce…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-06-29 Vignesh S. , Muthumanikandan V. , Siddarth S. , Sainath G

Communication vs Distributed Computation: an alternative trade-off curve

In this paper, we revisit the communication vs. distributed computing trade-off, studied within the framework of MapReduce in [1]. An implicit assumption in the aforementioned work is that each server performs all possible computations on…

Information Theory · Computer Science 2017-05-26 Yahya H. Ezzeldin , Mohammed Karmoose , Christina Fragouli

Massively Parallel Computation via Remote Memory Access

We introduce the Adaptive Massively Parallel Computation (AMPC) model, which is an extension of the Massively Parallel Computation (MPC) model. At a high level, the AMPC model strengthens the MPC model by storing all messages sent within a…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-05-21 Soheil Behnezhad , Laxman Dhulipala , Hossein Esfandiari , Jakub Łącki , Warren Schudy , Vahab Mirrokni

Round Compression for Parallel Matching Algorithms

For over a decade now we have been witnessing the success of {\em massive parallel computation} (MPC) frameworks, such as MapReduce, Hadoop, Dryad, or Spark. One of the reasons for their success is the fact that these frameworks are able to…

Data Structures and Algorithms · Computer Science 2018-02-02 Artur Czumaj , Jakub Łącki , Aleksander Mądry , Slobodan Mitrović , Krzysztof Onak , Piotr Sankowski