
Introducing a Performance Model for Bandwidth-Limited Loop Kernels

Performance 2009-05-07 v1 Hardware Architecture

Abstract

We present a performance model for bandwidth-limited loop kernels that is founded on the analysis of modern cache-based microarchitectures. The model allows accurate performance prediction and evaluation for existing instruction codes, and it provides an in-depth understanding of how the performance observed at each level of the memory hierarchy is composed. To demonstrate the capabilities of the model, the performance of raw memory load, store, and copy operations and of a stream vector triad is analyzed and benchmarked on three modern x86-type quad-core architectures.
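For readers unfamiliar with the kernels named in the abstract, the following is a minimal C sketch of what load, store, copy, and triad loops commonly look like. The function names are hypothetical, and the triad is assumed here to take the STREAM form `a[i] = b[i] + s * c[i]`; the abstract does not spell out the exact formulation, and these loops are not taken from the paper's benchmark code.

```c
#include <stddef.h>

/* Load: read every element once. The running sum only exists so that
 * the loads cannot be optimized away. */
double kernel_load(const double *a, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += a[i];
    return sum;
}

/* Store: write a constant to every element. */
void kernel_store(double *a, double value, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        a[i] = value;
}

/* Copy: one load stream and one store stream. */
void kernel_copy(double *a, const double *b, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        a[i] = b[i];
}

/* Triad (assumed STREAM form): two load streams and one store stream. */
void kernel_triad(double *a, const double *b, const double *c,
                  double s, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        a[i] = b[i] + s * c[i];
}
```

Each loop does little or no arithmetic per element moved, so for arrays larger than the caches its runtime is likely to be dominated by data transfer rather than by computation.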

Cite

@article{arxiv.0905.0792,
  title  = {Introducing a Performance Model for Bandwidth-Limited Loop Kernels},
  author = {Jan Treibig and Georg Hager},
  journal= {arXiv preprint arXiv:0905.0792},
  year   = {2009}
}

Comments

8 pages
