Inductive Loop Analysis for Practical HPC Application Optimization

Distributed, Parallel, and Cluster Computing 2025-11-11 v1 Performance

Abstract

Scientific computing applications rely heavily on multi-level loop nests operating on multidimensional arrays. This presents multiple optimization opportunities, from exploiting parallelism to reducing data movement through prefetching and improved register usage. HPC frameworks often delegate fine-grained data movement optimization to compilers, but their low-level representations hamper analysis of common patterns, such as strided data accesses and loop-carried dependencies. In this paper, we introduce symbolic, inductive loop optimization (SILO), a novel technique that models data accesses and dependencies as functions of loop nest strides. This abstraction enables the automatic parallelization of sequentially dependent loops, as well as data movement optimizations including software prefetching and pointer incrementation to reduce register spills. We demonstrate SILO on fundamental kernels from scientific applications with a focus on atmospheric models and numerical solvers, achieving up to 12× speedup over the state of the art.
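To make the idea of pointer incrementation concrete, the following is a minimal hand-written sketch in C. It is not taken from the paper and is not SILO's implementation. It only illustrates the general pattern of treating a strided address as an inductive function of the loop variable. The function names `column_sum_indexed` and `column_sum_incremented`, and the row-major layout with a `row_stride` parameter, are assumptions chosen for illustration.

```c
#include <stddef.h>

/* Baseline: the full offset i * row_stride + col is recomputed in every
 * iteration, which needs a multiply and keeps several values live. */
double column_sum_indexed(const double *a, size_t rows,
                          size_t row_stride, size_t col)
{
    double sum = 0.0;
    for (size_t i = 0; i < rows; ++i)
        sum += a[i * row_stride + col];
    return sum;
}

/* Pointer incrementation: the address in iteration i equals the address in
 * iteration i - 1 plus row_stride, so a single pointer is advanced instead.
 * The pointer is only advanced when another element follows, so it never
 * moves past the end of the array. */
double column_sum_incremented(const double *a, size_t rows,
                              size_t row_stride, size_t col)
{
    if (rows == 0)
        return 0.0;

    const double *p = a + col;   /* address of element (0, col) */
    double sum = *p;
    for (size_t i = 1; i < rows; ++i) {
        p += row_stride;         /* step to element (i, col) */
        sum += *p;
    }
    return sum;
}
```

Both functions add the same elements in the same order, so they return identical results. The second form simply expresses the access as a base address plus a constant per-iteration step, which is the kind of stride-based description the abstract refers to.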

Cite

@article{arxiv.2511.06052,
  title  = {Inductive Loop Analysis for Practical HPC Application Optimization},
  author = {Philipp Schaad and Tal Ben-Nun and Patrick Iff and Torsten Hoefler},
  journal= {arXiv preprint arXiv:2511.06052},
  year   = {2025}
}