Related papers: GPU Semiring Primitives for Sparse Neighborhood Me…

Sparse GPU Kernels for Deep Learning

Scientific workloads have traditionally exploited high levels of sparsity to accelerate computation and reduce memory requirements. While deep neural networks can be made sparse, achieving practical speedups on GPUs is difficult because…

Machine Learning · Computer Science 2020-09-02 Trevor Gale, Matei Zaharia, Cliff Young, Erich Elsen

Design Principles for Sparse Matrix Multiplication on the GPU

We implement two novel algorithms for sparse-matrix dense-matrix multiplication (SpMM) on the GPU. Our algorithms expect the sparse input in the popular compressed-sparse-row (CSR) format and thus do not require expensive format conversion.…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-06-13 Carl Yang, Aydin Buluc, John D. Owens

Sparse within Sparse Gaussian Processes using Neighbor Information

Approximations to Gaussian processes based on inducing variables, combined with variational inference techniques, enable state-of-the-art sparse approaches to infer GPs at scale through mini batch-based learning. In this work, we address…

Machine Learning · Statistics 2021-07-21 Gia-Lac Tran, Dimitrios Milios, Pietro Michiardi, Maurizio Filippone

A New Sparse Matrix Vector Multiplication GPU Algorithm Designed for Finite Element Problems

Recently, graphics processors (GPUs) have been increasingly leveraged in a variety of scientific computing applications. However, architectural differences between CPUs and GPUs necessitate the development of algorithms that take advantage…

Mathematical Software · Computer Science 2015-01-05 Jonathan Wong, Ellen Kuhl, Eric Darve

Instant Neural Graphics Primitives with a Multiresolution Hash Encoding

Neural graphics primitives, parameterized by fully connected neural networks, can be costly to train and evaluate. We reduce this cost with a versatile new input encoding that permits the use of a smaller network without sacrificing…

Computer Vision and Pattern Recognition · Computer Science 2022-05-05 Thomas Müller, Alex Evans, Christoph Schied, Alexander Keller

SigGPDE: Scaling Sparse Gaussian Processes on Sequential Data

Making predictions and quantifying their uncertainty when the input data is sequential is a fundamental learning challenge, recently attracting increasing attention. We develop SigGPDE, a new scalable sparse variational inference framework…

Machine Learning · Statistics 2021-10-13 Maud Lemercier, Cristopher Salvi, Thomas Cass, Edwin V. Bonilla, Theodoros Damoulas, Terry Lyons

Speeding Up Mixed-Integer Programming Solvers with Sparse Learning for Branching

Machine learning is increasingly used to improve decisions within branch-and-bound algorithms for mixed-integer programming. Many existing approaches rely on deep learning, which often requires very large training datasets and substantial…

Machine Learning · Computer Science 2026-04-02 Selin Bayramoğlu, George L Nemhauser, Nikolaos V Sahinidis

Accelerating Sparse Deep Neural Networks

As neural network model sizes have dramatically increased, so has the interest in various techniques to reduce their parameter counts and accelerate their execution. An active area of research in this field is sparsity - encouraging zero…

Machine Learning · Computer Science 2021-04-20 Asit Mishra, Jorge Albericio Latorre, Jeff Pool, Darko Stosic, Dusan Stosic, Ganesh Venkatesh, Chong Yu, Paulius Micikevicius

GPU-Accelerated Forward-Backward algorithm with Application to Lattice-Free MMI

We propose to express the forward-backward algorithm in terms of operations between sparse matrices in a specific semiring. This new perspective naturally leads to a GPU-friendly algorithm which is easy to implement in Julia or any…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-12-02 Lucas Ondel, Léa-Marie Lam-Yee-Mui, Martin Kocour, Caio Filippo Corro, Lukáš Burget

Variable noise and dimensionality reduction for sparse Gaussian processes

The sparse pseudo-input Gaussian process (SPGP) is a new approximation method for speeding up GP regression in the case of a large number of data points N. The approximation is controlled by the gradient optimization of a small set of M…

Machine Learning · Computer Science 2012-07-02 Edward Snelson, Zoubin Ghahramani

Timing and Memory Telemetry on GPUs for AI Governance

The rapid expansion of GPU-accelerated computing has enabled major advances in large-scale artificial intelligence (AI), while heightening concerns about how accelerators are observed or governed once deployed. Governance is essential to…

Cryptography and Security · Computer Science 2026-02-13 Saleh K. Monfared, Fatemeh Ganji, Dan Holcomb, Shahin Tajik

A Fully Sparse Implementation of a Primal-Dual Interior-Point Potential Reduction Method for Semidefinite Programming

In this paper, we show a way to exploit sparsity in the problem data in a primal-dual potential reduction method for solving a class of semidefinite programs. When the problem data is sparse, the dual variable is also sparse, but the primal…

Numerical Analysis · Mathematics 2025-10-20 Gun Srijuntongsiri, Stephen A. Vavasis

Highly Parallel Sparse Matrix-Matrix Multiplication

Generalized sparse matrix-matrix multiplication is a key primitive for many high performance graph algorithms as well as some linear solvers such as multigrid. We present the first parallel algorithms that achieve increasing speedups for an…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-08-09 Aydın Buluç, John R. Gilbert

Parallel GPU-Enabled Algorithms for SpGEMM on Arbitrary Semirings with Hybrid Communication

Sparse General Matrix Multiply (SpGEMM) is key for various High-Performance Computing (HPC) applications such as genomics and graph analytics. Using the semiring abstraction, many algorithms can be formulated as SpGEMM, allowing…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-12-23 Thomas McFarland, Julian Bellavita, Giulia Guidi

Efficient Marginalization of Discrete and Structured Latent Variables via Sparsity

Training neural network models with discrete (categorical or structured) latent variables can be computationally challenging, due to the need for marginalization over large or combinatorial sets. To circumvent this issue, one typically…

Machine Learning · Computer Science 2020-12-29 Gonçalo M. Correia, Vlad Niculae, Wilker Aziz, André F. T. Martins

Actually Sparse Variational Gaussian Processes

Gaussian processes (GPs) are typically criticised for their unfavourable scaling in both computational and memory requirements. For large datasets, sparse GPs reduce these demands by conditioning on a small set of inducing variables…

Machine Learning · Statistics 2023-04-12 Harry Jake Cunningham, Daniel Augusto de Souza, So Takao, Mark van der Wilk, Marc Peter Deisenroth

Effective implementation of the High Performance Conjugate Gradient benchmark on GraphBLAS

Applications in High-Performance Computing (HPC) environments face challenges due to increasing complexity. Among them, the increasing usage of sparse data pushes the limits of data structures and programming models and hampers the…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-08-26 Alberto Scolari, Albert-Jan Yzelman

Pre-Defined Sparse Neural Networks with Hardware Acceleration

Neural networks have proven to be extremely powerful tools for modern artificial intelligence applications, but computational and storage complexity remain limiting factors. This paper presents two compatible contributions towards reducing…

Machine Learning · Computer Science 2024-10-30 Sourya Dey, Kuan-Wen Huang, Peter A. Beerel, Keith M. Chugg

Sparse Techniques for Regression in Deep Gaussian Processes

Gaussian processes (GPs) have gained popularity as flexible machine learning models for regression and function approximation with an in-built method for uncertainty quantification. However, GPs suffer when the amount of training data is…

Machine Learning · Statistics 2025-11-26 Jonas Latz, Aretha L. Teckentrup, Simon Urbainczyk

MemGS: Memory-Efficient Gaussian Splatting for Real-Time SLAM

Recent advancements in 3D Gaussian Splatting (3DGS) have made a significant impact on rendering and reconstruction techniques. Current research predominantly focuses on improving rendering performance and reconstruction quality using…

Computer Vision and Pattern Recognition · Computer Science 2025-09-18 Yinlong Bai, Hongxin Zhang, Sheng Zhong, Junkai Niu, Hai Li, Yijia He, Yi Zhou