Related papers: Optimizing Sparse Linear Algebra Through Automatic…
Sparse matrices and linear algebra are at the heart of scientific simulations. More than 70 sparse matrix storage formats have been developed over the years, targeting a wide range of hardware architectures and matrix types. Each format is…
Sparse matrix vector multiplication (SpMV) is an important kernel in scientific and engineering applications. The previous optimizations are sparse matrix format specific and expose the choice of the best format to application programmers.…
Sparse matrices and linear algebra are at the heart of scientific simulations. Over the years, more than 70 sparse matrix storage formats have been developed, targeting a wide range of hardware architectures and matrix types, each of which…
In this paper, a run-time auto-tuning method for performance parameters according to input matrices is proposed. RAO-SS (Run-time Auto-tuning Optimizer for Sparse Solvers), which is a prototype of auto-tuning software using the proposed…
Heterogeneous computing, which combines devices with different architectures, is rising in popularity, and promises increased performance combined with reduced energy consumption. OpenCL has been proposed as a standard for programing such…
Sparse matrix-matrix multiplication (SpGEMM) is a critical operation in numerous fields, including scientific computing, graph analytics, and deep learning. These applications exploit the sparsity of matrices to reduce storage and…
Least absolute shrinkage and selection operator (Lasso), a popular method for high-dimensional regression, is now used widely for estimating high-dimensional time series models such as the vector autoregression (VAR). Selecting its tuning…
Sparse matrix ordering is a vital optimization technique often employed for solving large-scale sparse matrices. Its goal is to minimize the matrix bandwidth by reorganizing its rows and columns, thus enhancing efficiency. Conventional…
Sparse linear algebra is central to many scientific programs, yet compilers fail to optimize it well. High-performance libraries are available, but adoption costs are significant. Moreover, libraries tie programs into vendor-specific…
Sparse storage formats are techniques for storing and processing the sparse matrix data efficiently. The performance of these storage formats depend upon the distribution of non-zeros, within the matrix in different dimensions. In order to…
MLtuner automatically tunes settings for training tunables (such as the learning rate, the momentum, the mini-batch size, and the data staleness bound) that have a significant impact on large-scale machine learning (ML) performance.…
Sparse models, including sparse Mixture-of-Experts (MoE) models, have emerged as an effective approach for scaling Transformer models. However, they often suffer from computational inefficiency since a significant number of parameters are…
Sparse linear iterative solvers are essential for many large-scale simulations. Much of the runtime of these solvers is often spent in the implicit evaluation of matrix polynomials via a sequence of sparse matrix-vector products. A variety…
Sparse matrix-vector multiplication (SpMV) is an essential linear algebra operation that dominates the computing cost in many scientific applications. Due to providing massive parallelism and high memory bandwidth, GPUs are commonly used to…
Large-scale machine learning (ML) models are increasingly being used in critical domains like education, lending, recruitment, healthcare, criminal justice, etc. However, the training, deployment, and utilization of these models demand…
Accelerators for sparse matrix multiplication are important components in emerging systems. In this paper, we study the main challenges of accelerating Sparse Matrix Multiplication (SpMM). For the situations that data is not stored in the…
Modern Machine Learning (ML) applications often benefit from structured sparsity, a technique that efficiently reduces model complexity and simplifies handling of sparse data in hardware. Sparse systolic tensor arrays - specifically…
Sparse coding--that is, modelling data vectors as sparse linear combinations of basis elements--is widely used in machine learning, neuroscience, signal processing, and statistics. This paper focuses on the large-scale matrix factorization…
Sparse linear algebra kernels play a critical role in numerous applications, covering from exascale scientific simulation to large-scale data analytics. Offloading linear algebra kernels on one GPU will no longer be viable in these…
In this paper, we propose an optimization selection methodology for the ubiquitous sparse matrix-vector multiplication (SpMV) kernel. We propose two models that attempt to identify the major performance bottleneck of the kernel for every…