Related papers: Sparsity-Specific Code Optimization using Expressi…

200 papers

Sparse Triangular Solve (SpTRSV) is an important computational kernel used in the solution of sparse linear algebra systems in many scientific and engineering applications. It is difficult to parallelize SpTRSV on today's architectures. The…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-03-23 Buse Yilmaz

Leveraging spatial sparsity has become a popular approach to accelerate 3D computer graphics applications. Spatially sparse data structures and efficient sparse kernels (such as parallel stencil operations on active voxels) are key to…

Programming Languages · Computer Science 2021-06-23 Yuanming Hu, Mingkuan Xu, Ye Kuang, Frédo Durand

We present parallel algorithms for constructing and traversing sparse octrees on graphics processing units (GPUs). The algorithms are based on parallel-scan and sort methods. To test the performance and feasibility, we implemented them in…

Instrumentation and Methods for Astrophysics · Physics 2012-04-11 Jeroen Bédorf, Evghenii Gaburov, Simon Portegies Zwart

Registers are the fastest memory components within the GPU's complex memory hierarchy, accessed by names rather than addresses. They are managed entirely by the compiler through a process called register allocation, during which the…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-01-28 Deniz Elbek, Kamer Kaya

Programming high-performance sparse GPU kernels is notoriously difficult, requiring both substantial effort and deep expertise. Sparse compilers aim to simplify this process, but existing systems fall short in two key ways. First, they are…

Programming Languages · Computer Science 2025-10-21 Jaeyeon Won, Willow Ahrens, Joel S. Emer, Saman Amarasinghe

Bringing high-level machine learning models to efficient and well-suited machine implementations often invokes a bunch of tools, e.g., code generators, compilers, and optimizers. Along such tool chains, abstractions have to be applied. This…

Machine Learning · Computer Science 2024-04-11 Daniel Biebert, Christian Hakert, Kuan-Hsun Chen, Jian-Jia Chen

Sparse graphs are ubiquitous in real and virtual worlds. With the phenomenal growth in semi-structured and unstructured data, sizes of the underlying graphs have witnessed a rapid growth over the years. Analyzing such large structures…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-01-08 Ashwina Kumar, M. Venkata Krishna, Prasanna Bartakke, Rahul Kumar, Rajesh Pandian M, Nibedita Behera, Rupesh Nasre

Scientific workloads have traditionally exploited high levels of sparsity to accelerate computation and reduce memory requirements. While deep neural networks can be made sparse, achieving practical speedups on GPUs is difficult because…

Machine Learning · Computer Science 2020-09-02 Trevor Gale, Matei Zaharia, Cliff Young, Erich Elsen

As neural network model sizes have dramatically increased, so has the interest in various techniques to reduce their parameter counts and accelerate their execution. An active area of research in this field is sparsity - encouraging zero…

Distributed algorithms are often beset by the straggler effect, where the slowest compute nodes in the system dictate the overall running time. Coding-theoretic techniques have been recently proposed to mitigate stragglers via algorithmic…

Machine Learning · Statistics 2017-11-21 Zachary Charles, Dimitris Papailiopoulos, Jordan Ellenberg

Future computing systems, from handhelds to supercomputers, will undoubtedly be more parallel and heterogeneous than today's systems to provide more performance and energy efficiency. Thus, GPUs are increasingly being used to accelerate…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-10-18 Saeed Taheri, Apan Qasem, Martin Burtscher

This work deals with the optimization of computer programs targeting Graphics Processing Units (GPUs). The goal is to lift, from programmers to optimizing compilers, the heavy burden of determining program details that are dependent on the…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-01-16 Xiaohui Chen, Marc Moreno-Maza, Jeeva Paudel, Ning Xie

Artificial intelligence workloads, especially transformer models, exhibit emergent sparsity in which computations perform selective sparse access to dense data. The workloads are inefficient on hardware designed for dense computations and…

Data Structures and Algorithms · Computer Science 2024-02-23 Brian Wheatman, Meghana Madhyastha, Randal Burns

Recurrence equations lie at the heart of many computational paradigms including dynamic programming, graph analysis, and linear solvers. These equations are often expensive to compute and much work has gone into optimizing them for…

Programming Languages · Computer Science 2023-09-12 Shiv Sundram, Muhammad Usman Tariq, Fredrik Kjolstad

Generation of optimal codes is a well-known problem in coding theory. Many computational approaches exist in the literature for finding record-breaking codes. However, generating codes with long lengths $n$ using serial algorithms is…

Information Theory · Computer Science 2015-07-21 Srajan Paliwal, Saurabh Tiwary, Bhaskar Chaudhury, Manish K. Gupta

This paper shows how to generate efficient tensor algebra code that computes on dynamic sparse tensors, which have sparsity structures that evolve over time. We propose a language for precisely specifying recursive, pointer-based data…

Mathematical Software · Computer Science 2021-12-03 Stephen Chou, Saman Amarasinghe

This paper addresses spatial programming of sparse matrix computations for productive performance. The challenge is how to express an irregular computation and its optimizations in a regular way. A sparse matrix has (non-zero) values and a…

Mathematical Software · Computer Science 2018-10-18 Hongbo Rong

Automatic code optimization is a complex process that typically involves the application of multiple discrete algorithms that modify the program structure irreversibly. However, the design of these algorithms is often monolithic, and they…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-09-18 Kazuaki Matsumura, Simon Garcia De Gonzalo, Antonio J. Peña

Sparse codes in neuroscience have been suggested to offer certain computational advantages over other neural representations of sensory data. To explore this viewpoint, a sparse code is used to represent natural images in an optimal control…

Machine Learning · Computer Science 2021-01-08 Peter N. Loxley

We investigate the energy efficiency of a library designed for parallel computations with sparse matrices. The library leverages high-performance, energy-efficient Graphics Processing Unit (GPU) accelerators to enable large-scale scientific…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-16 Massimo Bernaschi, Alessandro Celestini, Pasqua D'Ambra, Giorgio Richelli