Efficient multicore-aware parallelization strategies for iterative stencil computations

Performance 2012-03-01 v1 Distributed, Parallel, and Cluster Computing

Abstract

Stencil computations account for a major part of the runtime in many scientific simulation codes. As prototypes for this class of algorithms we consider the iterative Jacobi and Gauss-Seidel smoothers and aim at highly efficient parallel implementations for cache-based multicore architectures. Temporal cache blocking is a known advanced optimization technique that can significantly reduce the pressure on the memory bus. We apply and refine this optimization for a recently presented temporal blocking strategy designed to explicitly utilize multicore characteristics. For Gauss-Seidel smoothers in particular, we show that simultaneous multithreading (SMT) can yield substantial performance improvements for our optimized algorithm.
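To make the two smoothers named in the abstract concrete, here is a minimal sketch of one sweep of each. It is not the authors' implementation and does not include temporal blocking or any parallelization. It assumes a 2D five-point stencil with fixed boundary values, and the function names (`make_grid`, `jacobi_sweep`, `gauss_seidel_sweep`) are hypothetical, chosen only for illustration.

```python
def make_grid(n):
    """Build an n x n grid of zeros whose top boundary row is 1.0 (assumed test setup)."""
    grid = [[0.0] * n for _ in range(n)]
    grid[0] = [1.0] * n
    return grid


def jacobi_sweep(src):
    """One Jacobi sweep: every interior point is computed from the OLD grid only.

    Reads from src and writes to a separate destination grid, so the
    updates are independent of each other and of the traversal order.
    """
    rows, cols = len(src), len(src[0])
    dst = [row[:] for row in src]  # boundary values are carried over unchanged
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            dst[i][j] = 0.25 * (src[i - 1][j] + src[i + 1][j]
                                + src[i][j - 1] + src[i][j + 1])
    return dst


def gauss_seidel_sweep(grid):
    """One Gauss-Seidel sweep: updates in place in lexicographic order.

    The upper and left neighbours have already been updated in this sweep,
    which creates the data dependency that makes parallelization harder
    than for Jacobi.
    """
    rows, cols = len(grid), len(grid[0])
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            grid[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                 + grid[i][j - 1] + grid[i][j + 1])
    return grid


if __name__ == "__main__":
    print(jacobi_sweep(make_grid(4))[1])
    print(gauss_seidel_sweep(make_grid(4))[1])
```

In this sketch the Jacobi sweep needs two grids but has no dependencies between point updates, whereas the Gauss-Seidel sweep works on a single grid and propagates new values within the same sweep. Both read each grid point from memory once per sweep, which is the traffic that temporal blocking aims to reduce by performing several sweeps on data held in cache.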

Cite

@article{arxiv.1004.1741,
  title  = {Efficient multicore-aware parallelization strategies for iterative stencil computations},
  author = {Jan Treibig and Gerhard Wellein and Georg Hager},
  journal= {arXiv preprint arXiv:1004.1741},
  year   = {2012}
}

Comments

15 pages, 10 figures
