Connecting MapReduce Computations to Realistic Machine Models

Data Structures and Algorithms 2020-02-19 v1

Abstract

We explain how the popular, highly abstract MapReduce model of parallel computation (MRC) can be rooted in reality by showing how it can be simulated on realistic distributed-memory parallel machine models like BSP. We first refine the model (MRC$^+$) to include parameters for total work $w$, bottleneck work $\hat{w}$, data volume $m$, and maximum object size $\hat{m}$. We then show matching upper and lower bounds for executing a MapReduce calculation on the distributed-memory machine: $\Theta(w/p+\hat{w}+\log p)$ work and $\Theta(m/p+\hat{m}+\log p)$ bottleneck communication volume using $p$ processors.
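To make the shape of these bounds concrete, the following minimal Python sketch evaluates the two expressions inside the Theta terms for given parameter values. It is an illustration only, not code from the paper: the names `MRCPlusParams` and `bound_terms` are hypothetical, the logarithm base (2) is an assumption, and the constant factors hidden by the Theta notation are ignored.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MRCPlusParams:
    """Hypothetical container for the four parameters named in the abstract."""
    w: float      # total work
    w_hat: float  # bottleneck work
    m: float      # total data volume
    m_hat: float  # maximum object size


def bound_terms(params: MRCPlusParams, p: int) -> tuple[float, float]:
    """Return (work term, communication term) for p processors.

    Evaluates w/p + w_hat + log p and m/p + m_hat + log p.
    Constant factors are omitted and log base 2 is assumed.
    """
    if p < 1:
        raise ValueError("p must be a positive number of processors")
    log_p = math.log2(p)
    work = params.w / p + params.w_hat + log_p
    comm = params.m / p + params.m_hat + log_p
    return work, comm


if __name__ == "__main__":
    # Made-up example values, chosen only to show how the terms scale with p.
    example = MRCPlusParams(w=1_000_000, w_hat=500, m=200_000, m_hat=100)
    for p in (1, 64, 4096):
        work, comm = bound_terms(example, p)
        print(f"p={p:5d}  work term={work:12.1f}  comm term={comm:12.1f}")
```

With these made-up values, the terms $w/p$ and $m/p$ shrink as $p$ grows, while $\hat{w}$, $\hat{m}$, and $\log p$ do not. This is the sense in which the bottleneck parameters limit how far adding processors can help.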

Cite

@article{arxiv.2002.07553,
  title  = {Connecting MapReduce Computations to Realistic Machine Models},
  author = {Peter Sanders},
  journal= {arXiv preprint arXiv:2002.07553},
  year   = {2020}
}