A parallel pattern for iterative stencil + reduce

Distributed, Parallel, and Cluster Computing 2016-09-16 v1

Abstract

We advocate the Loop-of-stencil-reduce pattern as a means of simplifying the implementation of data-parallel programs on heterogeneous multi-core platforms. Loop-of-stencil-reduce is general enough to subsume map, reduce, map-reduce, stencil, and stencil-reduce and, crucially, their use inside a loop, in data-parallel applications, in streaming applications, or in a combination of the two. The pattern makes it possible to deploy a single stencil computation kernel on different GPUs. We discuss the implementation of Loop-of-stencil-reduce in FastFlow, a framework for implementing applications based on parallel patterns. Experiments are presented to illustrate the use of Loop-of-stencil-reduce in developing data-parallel kernels running on heterogeneous systems.
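To make the shape of the pattern concrete, the following is a minimal sequential sketch in Python. It is not the FastFlow API: the function `loop_of_stencil_reduce` and its parameters (`stencil`, `measure`, `combine`, `identity`, `keep_going`, `max_iters`) are hypothetical names chosen for illustration, and the parallel, GPU-targeted execution that the paper is about is deliberately left out. The sketch only shows how map-reduce and an iterative stencil-reduce can both be written as instances of one loop.

```python
from functools import reduce as fold
from operator import add


def loop_of_stencil_reduce(data, stencil, measure, combine, identity,
                           keep_going, max_iters=1000):
    """Hypothetical sequential sketch: repeat (stencil, then reduce)
    until keep_going(reduced) is false or max_iters is reached."""
    current = list(data)
    reduced = identity
    iterations = 0
    while iterations < max_iters:
        # Stencil phase: every output element is computed from the
        # read-only array of the previous iteration.
        nxt = [stencil(current, i) for i in range(len(current))]
        # Reduce phase: fold a per-element measure into a single value.
        reduced = fold(combine,
                       (measure(old, new) for old, new in zip(current, nxt)),
                       identity)
        current = nxt
        iterations += 1
        # Loop condition: decided from the reduced value.
        if not keep_going(reduced):
            break
    return current, reduced, iterations


# Map-reduce as a degenerate case: the stencil reads only element i,
# and the loop body runs exactly once.
doubled, total, iters = loop_of_stencil_reduce(
    [1, 2, 3],
    stencil=lambda a, i: 2 * a[i],
    measure=lambda old, new: new,
    combine=add,
    identity=0,
    keep_going=lambda r: False,
)
# doubled == [2, 4, 6], total == 12, iters == 1


def average(a, i):
    """1-D Jacobi step: fixed boundaries, interior = mean of neighbours."""
    if i == 0 or i == len(a) - 1:
        return a[i]
    return 0.5 * (a[i - 1] + a[i + 1])


# Iterative stencil-reduce: iterate until the largest per-element
# change drops to the tolerance or below.
ramp, change, steps = loop_of_stencil_reduce(
    [0.0, 0.0, 0.0, 0.0, 100.0],
    stencil=average,
    measure=lambda old, new: abs(new - old),
    combine=max,
    identity=0.0,
    keep_going=lambda r: r > 1e-6,
)
```

In the second call the array relaxes towards a linear ramp between the two fixed boundary values. In a parallel setting the list comprehension and the fold are the two phases that would be offloaded to an accelerator, while the loop condition is evaluated on the reduced value between iterations.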

Cite

@article{arxiv.1609.04567,
  title  = {A parallel pattern for iterative stencil + reduce},
  author = {M. Aldinucci and M. Danelutto and M. Drocco and P. Kilpatrick and C. Misale and G. Peretti Pezzi and M. Torquati},
  journal= {arXiv preprint arXiv:1609.04567},
  year   = {2016}
}