Related papers: Sparse Matrix Multiplication On An Associative Pro…
Linear-scaling electronic-structure techniques, also called O(N) techniques, rely heavily on the multiplication of sparse matrices, where the sparsity arises from spatial cut-offs. In order to treat very large systems, the calculations must…
Multiplication of a sparse matrix with another (dense or sparse) matrix is a fundamental operation that captures the computational patterns of many data science applications, including but not limited to graph algorithms, sparsely connected…
Generalized sparse matrix-matrix multiplication is a key primitive for many high performance graph algorithms as well as some linear solvers such as multigrid. We present the first parallel algorithms that achieve increasing speedups for an…
We consider the problem of sparse matrix multiplication by the column row method in a distributed setting where the matrix product is not necessarily sparse. We present a surprisingly simple method for "consistent" parallel processing of…
This paper describes REAP, a software-hardware approach that enables high performance sparse linear algebra computations on a cooperative CPU-FPGA platform. REAP carefully separates the task of organizing the matrix elements from the…
Sparse Matrix-Matrix multiplication is a key kernel that has applications in several domains such as scientific computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this paper, we…
We propose a novel approach to iterated sparse matrix dense matrix multiplication, a fundamental computational kernel in scientific computing and graph neural network training. In cases where matrix sizes exceed the memory of a single…
Accelerators for sparse matrix multiplication are important components in emerging systems. In this paper, we study the main challenges of accelerating Sparse Matrix Multiplication (SpMM). For the situations that data is not stored in the…
We develop and implement in this paper a fast sparse assembly algorithm, the fundamental operation which creates a compressed matrix from raw index data. Since it is often a quite demanding and sometimes critical operation, it is of…
We present an algorithm to reduce the computational effort for the multiplication of a given matrix with an unknown column vector. The algorithm decomposes the given matrix into a product of matrices whose entries are either zero or integer…
We introduce an algorithm for efficiently representing convolution with zero-padding and stride as a sparse transformation matrix, applied to a vectorized input through sparse matrix-vector multiplication (SpMV). We provide a theoretical…
This paper presents an efficient technique for matrix-vector and vector-transpose-matrix multiplication in distributed-memory parallel computing environments, where the matrices are unstructured, sparse, and have a substantially larger…
In recent years, the fervent demand for computational power across various domains has prompted hardware manufacturers to introduce specialized computing hardware aimed at enhancing computational capabilities. Particularly, the utilization…
We design and develop a work-efficient multithreaded algorithm for sparse matrix-sparse vector multiplication (SpMSpV) where the matrix, the input vector, and the output vector are all sparse. SpMSpV is an important primitive in the…
Achieving high efficiency with numerical kernels for sparse matrices is of utmost importance, since they are part of many simulation codes and tend to use most of the available compute time and resources. In addition, especially in large…
Dense and sparse tensors allow the representation of most bulk data structures in computational science applications. We show that sparse tensor algebra can also be used to express many of the transformations on these datasets, especially…
Generalized sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many high performance graph algorithms as well as for some linear solvers, such as algebraic multigrid. Here we show that SpGEMM also yields efficient…
The sparse matrix-vector (SpMV) multiplication is an important computational kernel, but it is notoriously difficult to execute efficiently. This paper investigates algorithm performance for unstructured sparse matrices, which are more…
Tensor accelerators have gained popularity because they provide a cheap and efficient solution for speeding up computational-expensive tasks in Deep Learning and, more recently, in other Scientific Computing applications. However, since…
Scaling up the sparse matrix-vector multiplication kernel on modern Graphics Processing Units (GPU) has been at the heart of numerous studies in both academia and industry. In this article we present a novel non-parametric, self-tunable,…