Staging Blocked Evaluation over Structured Sparse Matrices
Abstract
The matrices used in many computational settings are naturally sparse, holding a small percentage of nonzero elements. Storing such matrices in specialized sparse formats enables algorithms that avoid wasting computation on zeros, significantly accelerating common matrix computations like sparse matrix-vector multiplication (SpMV) and sparse matrix-matrix multiplication (SpMM). In many real-world sparse matrices, however, nonzero elements are densely clustered in subregions of the matrix. For matrices that feature this sort of structured sparsity, hybrid formats can further improve performance by representing these subregions as dense blocks. Existing hybrid formats either fix the dimensions of dense blocks, padding irregular regions with zeros and wasting computation, or incur run-time overhead when iterating over variable-sized blocks. This paper presents SABLE, a framework for accelerating structured sparse matrix computations by using staging to achieve the best of both of these approaches. Ahead of execution, SABLE inspects the matrix to identify variable-sized dense subregions, which it stores using a new hybrid format. It then eliminates the overhead typically associated with variable-sized blocks by using staging to generate specialized code that is amenable to vectorization. We evaluate SABLE on SpMV and SpMM kernels using matrices from the popular SuiteSparse data set. SABLE outperforms the best available SpMV baseline by 10% on average, and SpMM baselines by 20%. When parallelized with 8 threads, SABLE achieves further speedups on SpMV and SpMM over the best fully-sparse baseline.
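To make the staging idea concrete, the following is a minimal sketch, not SABLE's actual format or code generator. It assumes a hypothetical representation in which dense regions are given as (row0, col0, height, width) tuples with their values packed row-major into one flat list, and all remaining nonzeros are listed as (row, col, value) triples. The hypothetical helper `stage_spmv` emits source code for an SpMV kernel in which every block's bounds and offsets are constants, then compiles it. Python is used only for illustration; a real system would emit lower-level code that a compiler can vectorize.

```python
# Illustrative sketch only: the block tuples, the triple-list remainder, and
# the helper name stage_spmv are assumptions, not SABLE's format or API.

def stage_spmv(blocks, rest):
    """Generate and compile an SpMV kernel specialized to one matrix.

    blocks: list of (row0, col0, height, width) dense regions; their values
            are stored row-major, back to back, in one flat `vals` list.
    rest:   list of (row, col, value) nonzeros that lie outside every block.
    Returns the compiled kernel and its generated source text.
    """
    lines = ["def spmv(vals, x, y):"]
    offset = 0
    for (r0, c0, h, w) in blocks:
        # Block bounds and value offsets are baked in as constants, so the
        # generated loops have fixed trip counts and no per-block lookups.
        lines.append(f"    for i in range({h}):")
        lines.append(f"        base = {offset} + i * {w}")
        lines.append("        acc = 0.0")
        lines.append(f"        for j in range({w}):")
        lines.append(f"            acc += vals[base + j] * x[{c0} + j]")
        lines.append(f"        y[{r0} + i] += acc")
        offset += h * w
    for (r, c, v) in rest:
        # Nonzeros outside the blocks become straight-line updates.
        lines.append(f"    y[{r}] += {v!r} * x[{c}]")
    lines.append("    return y")
    source = "\n".join(lines)
    namespace = {}
    exec(source, namespace)  # compile the generated kernel
    return namespace["spmv"], source


# A 4x4 matrix with a 2x2 block, a 1x2 block, and one stray nonzero:
#   [1 2 0 0]
#   [3 4 0 0]
#   [0 0 5 6]
#   [0 0 0 7]
blocks = [(0, 0, 2, 2), (2, 2, 1, 2)]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
rest = [(3, 3, 7.0)]

spmv, source = stage_spmv(blocks, rest)
y = spmv(vals, [1.0, 2.0, 3.0, 4.0], [0.0] * 4)
```

The inspection cost is paid once per matrix, ahead of execution. The generated kernel never reads block metadata at run time, which is the overhead that variable-sized blocks would otherwise incur.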
Cite
@article{arxiv.2407.00829,
title = {Staging Blocked Evaluation over Structured Sparse Matrices},
author = {Pratyush Das and Amirhossein Basareh and Adhitha Dias and Artem Pelenitsyn and Kirshanthan Sundararajah and Milind Kulkarni and Ben Delaware},
journal = {arXiv preprint arXiv:2407.00829},
year = {2026}
}