Related papers: Task-based, GPU-accelerated and Robust Library for…

Introduction to StarNEig -- A Task-based Library for Solving Nonsymmetric Eigenvalue Problems

In this paper, we present the StarNEig library for solving dense non-symmetric (generalized) eigenvalue problems. The library is built on top of the StarPU runtime system and targets both shared and distributed memory machines. Some…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-03-23 Mirko Myllykoski, Carl Christian Kjelgaard Mikkelsen

TensorNEAT: A GPU-accelerated Library for NeuroEvolution of Augmenting Topologies

The NeuroEvolution of Augmenting Topologies (NEAT) algorithm has received considerable recognition in the field of neuroevolution. Its effectiveness is derived from initiating with simple networks and incrementally evolving both their…

Neural and Evolutionary Computing · Computer Science 2025-04-14 Lishuang Wang, Mengfei Zhao, Enyu Liu, Kebin Sun, Ran Cheng

Parallel Robust Computation of Generalized Eigenvectors of Matrix Pencils

In this paper we consider the problem of computing generalized eigenvectors of a matrix pencil in real Schur form. In exact arithmetic, this problem can be solved using substitution. In practice, substitution is vulnerable to floating-point…

Mathematical Software · Computer Science 2020-03-23 Carl Christian Kjelgaard Mikkelsen, Mirko Myllykoski

NEP: a module for the parallel solution of nonlinear eigenvalue problems in SLEPc

SLEPc is a parallel library for the solution of various types of large-scale eigenvalue problems. In recent years we have been developing a module within SLEPc, called NEP, that is intended for solving nonlinear eigenvalue problems. These…

Mathematical Software · Computer Science 2021-06-29 Carmen Campos, Jose E. Roman

Porting a sparse linear algebra math library to Intel GPUs

With the announcement that the Aurora Supercomputer will be composed of general purpose Intel CPUs complemented by discrete high performance Intel GPUs, and the deployment of the oneAPI ecosystem, Intel has committed to enter the arena of…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-03-19 Yuhsiang M. Tsai, Terry Cojean, Hartwig Anzt

High Performance Solution of Skew-symmetric Eigenvalue Problems with Applications in Solving the Bethe-Salpeter Eigenvalue Problem

We present a high-performance solver for dense skew-symmetric matrix eigenvalue problems. Our work is motivated by applications in computational quantum physics, where one solution approach to solve the so-called Bethe-Salpeter equation…

Numerical Analysis · Mathematics 2020-06-05 Carolin Penke, Andreas Marek, Christian Vorwerk, Claudia Draxl, Peter Benner

On the energy efficiency of sparse matrix computations on multi-GPU clusters

We investigate the energy efficiency of a library designed for parallel computations with sparse matrices. The library leverages high-performance, energy-efficient Graphics Processing Unit (GPU) accelerators to enable large-scale scientific…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-16 Massimo Bernaschi, Alessandro Celestini, Pasqua D'Ambra, Giorgio Richelli

DBCSR: A Library for Dense Matrix Multiplications on Distributed GPU-Accelerated Systems

Most, if not all, modern scientific simulation packages utilize matrix algebra operations. Among the operations of linear algebra, one of the most important kernels is the multiplication of matrices, dense and sparse. Examples of…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-10-14 Ilia Sivkov, Alfio Lazzaro, Juerg Hutter

Advancing the distributed Multi-GPU ChASE library through algorithm optimization and NCCL library

As supercomputers become larger with powerful Graphics Processing Units (GPUs), traditional direct eigensolvers struggle to keep up with the hardware evolution and scale efficiently due to communication and synchronization demands.…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-09-28 Xinzhe Wu, Edoardo Di Napoli

dMath: A Scalable Linear Algebra and Math Library for Heterogeneous GP-GPU Architectures

A new scalable parallel math library, dMath, is presented in this paper that demonstrates leading scaling when using intranode, or internode, hybrid-parallelism for deep-learning. dMath provides easy-to-use distributed base primitives and a…

Neural and Evolutionary Computing · Computer Science 2016-04-07 Steven Eliuk, Cameron Upright, Anthony Skjellum

JAXMg: A multi-GPU linear solver in JAX

Solving large dense linear systems and eigenvalue problems is a core requirement in many areas of scientific computing, but scaling these operations beyond a single GPU remains challenging within modern programming frameworks. While highly…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-01-22 Roeland Wiersema

CLBlast: A Tuned OpenCL BLAS Library

This work introduces CLBlast, an open-source BLAS library providing optimized OpenCL routines to accelerate dense linear algebra for a wide variety of devices. It is targeted at machine learning and HPC applications and thus provides a fast…

Mathematical Software · Computer Science 2018-04-30 Cedric Nugteren

Enabling GPU Accelerated Computing in the SUNDIALS Time Integration Library

As part of the Exascale Computing Project (ECP), a recent focus of development efforts for the SUite of Nonlinear and DIfferential/ALgebraic equation Solvers (SUNDIALS) has been to enable GPU-accelerated time integration in scientific…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-01-04 Cody J. Balos, David J. Gardner, Carol S. Woodward, Daniel R. Reynolds

AGILE: Lightweight and Efficient Asynchronous GPU-SSD Integration

GPUs are critical for compute-intensive applications, yet emerging workloads such as recommender systems, graph analytics, and data analytics often exceed GPU memory capacity. Existing solutions allow GPUs to use CPU DRAM or SSDs as…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-08-27 Zhuoping Yang, Jinming Zhuang, Xingzhen Chen, Alex K. Jones, Peipei Zhou

A Mixed Precision, Multi-GPU Design for Large-scale Top-K Sparse Eigenproblems

Graph analytics techniques based on spectral methods process extremely large sparse matrices with millions or even billions of non-zero values. Behind these algorithms lies the Top-K sparse eigenproblem, the computation of the largest…

Hardware Architecture · Computer Science 2022-01-20 Francesco Sgherzi, Alberto Parravicini, Marco Domenico Santambrogio

NEP-PACK: A Julia package for nonlinear eigenproblems - v0.2

We present NEP-PACK, a novel open-source library for the solution of nonlinear eigenvalue problems (NEPs). The package provides a framework to represent NEPs, as well as efficient implementations of many state-of-the-art algorithms. The…

Numerical Analysis · Mathematics 2018-11-26 Elias Jarlebring, Max Bennedich, Giampaolo Mele, Emil Ringh, Parikshit Upadhyaya

Deep Graph Library Optimizations for Intel(R) x86 Architecture

The Deep Graph Library (DGL) was designed as a tool to enable structure learning from graphs, by supporting a core abstraction for graphs, including the popular Graph Neural Networks (GNN). DGL contains implementations of all core graph…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-07-14 Sasikanth Avancha, Vasimuddin Md, Sanchit Misra, Ramanarayan Mohanty

Hybrid programming-model strategies for GPU offloading of electronic structure calculation kernels

To address the challenge of performance portability, and facilitate the implementation of electronic structure solvers, we developed the Basic Matrix Library (BML) and Parallel, Rapid O(N) and Graph-based Recursive Electronic Structure…

Computational Physics · Physics 2024-01-26 Jean-Luc Fattebert, Christian F. A. Negre, Joshua Finkelstein, Jamaludin Mohd-Yusof, Daniel Osei-Kuffuor, Michael E. Wall, Yu Zhang, Nicolas Bock, Susan M. Mniszewski

An adaptive finite element multigrid solver using GPU acceleration

Adaptive finite elements combined with geometric multigrid solvers are one of the most efficient numerical methods for problems such as the instationary Navier-Stokes equations. Yet despite their efficiency, computations remain expensive…

Numerical Analysis · Mathematics 2025-12-23 Manuel Liebchen, Robert Jendersie, Utku Kaya, Christian Lessig, Thomas Richter

PSCToolkit: solving sparse linear systems with a large number of GPUs

In this chapter, we describe the Parallel Sparse Computation Toolkit (PSCToolkit), a suite of libraries for solving large-scale linear algebra problems in an HPC environment. In particular, we focus on the tools provided for the solution of…

Numerical Analysis · Mathematics 2025-01-09 Pasqua D'Ambra, Fabio Durastante, Salvatore Filippone