Indirection Stream Semantic Register Architecture for Efficient Sparse-Dense Linear Algebra

Paul Scheffler; Florian Zaruba; Fabian Schuiki; Torsten Hoefler; Luca Benini

Hardware Architecture 2020-12-15 v2

Authors: Paul Scheffler, Florian Zaruba, Fabian Schuiki, Torsten Hoefler, Luca Benini

Abstract

Sparse-dense linear algebra is crucial in many domains, but challenging to handle efficiently on CPUs, GPUs, and accelerators alike; multiplications with sparse formats like CSR and CSF require indirect memory lookups. In this work, we enhance a memory-streaming RISC-V ISA extension to accelerate sparse-dense products through streaming indirection. We present efficient dot, matrix-vector, and matrix-matrix product kernels using our hardware, enabling single-core FPU utilizations of up to 80% and speedups of up to 7.2x over an optimized baseline without extensions. A matrix-vector implementation on a multi-core cluster is up to 5.8x faster and 2.7x more energy-efficient with our kernels than an optimized baseline. We propose further uses for our indirection hardware, such as scatter-gather operations and codebook decoding, and compare our work to state-of-the-art CPU, GPU, and accelerator approaches, measuring a 2.8x higher peak FP64 utilization in CSR matrix-vector multiplication than a GTX 1080 Ti GPU running a cuSPARSE kernel.
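As a minimal illustration of the indirect memory lookups mentioned in the abstract, the following Python sketch multiplies a CSR matrix by a dense vector in plain software. The function name csr_matvec and the array names row_ptr, col_idx, and values are illustrative assumptions, not taken from the paper. The sketch shows only the generic access pattern, not the authors' RISC-V extension or their kernels.

```python
def csr_matvec(row_ptr, col_idx, values, x):
    """Multiply a CSR-encoded sparse matrix by a dense vector x.

    row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i;
    col_idx[k] is the column of the k-th nonzero, values[k] its value.
    (Names are hypothetical, chosen for this sketch.)
    """
    num_rows = len(row_ptr) - 1
    y = [0.0] * num_rows
    for row in range(num_rows):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # Indirect lookup: the position read from x is itself
            # loaded from the index array col_idx.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Example matrix (3x3):
# [[1, 0, 2],
#  [0, 3, 0],
#  [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]

y = csr_matvec(row_ptr, col_idx, values, x)
print(y)  # [7.0, 6.0, 19.0]
```

The load x[col_idx[k]] is the indirection: its address depends on a value read from another array, so each multiply-accumulate needs an extra index load and address computation. Going by the abstract, this is the kind of access that the proposed streaming indirection is meant to handle.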

Keywords

sparse matrix multiplication; computer architecture; GPU computing

Cite

@article{arxiv.2011.08070,
  title  = {Indirection Stream Semantic Register Architecture for Efficient Sparse-Dense Linear Algebra},
  author = {Paul Scheffler and Florian Zaruba and Fabian Schuiki and Torsten Hoefler and Luca Benini},
  journal= {arXiv preprint arXiv:2011.08070},
  year   = {2020}
}

Comments

6 pages, 4 figures. Submitted to DATE 2021. Camera-ready version
