Related papers: Multi-GPU aggregation-based AMG preconditioner for…
Hybrid CPU-GPU algorithms for Algebraic Multigrid methods (AMG) to efficiently utilize both CPU and GPU resources are presented. In particular, hybrid AMG framework focusing on minimal utilization of GPU memory with performance on par with…
We describe main issues and design principles of an efficient implementation, tailored to recent generations of Nvidia Graphics Processing Units (GPUs), of an Algebraic Multigrid (AMG) preconditioner previously proposed by one of the…
This research introduce our work on developing Krylov subspace and AMG solvers on NVIDIA GPUs. As SpMV is a crucial part for these iterative methods, SpMV algorithms for single GPU and multiple GPUs are implemented. A HEC matrix format and…
We present an efficient, robust and fully GPU-accelerated aggregation-based algebraic multigrid preconditioning technique for the solution of large sparse linear systems. These linear systems arise from the discretization of elliptic PDEs.…
Linear solvers for large and sparse systems are a key element of scientific applications, and their efficient implementation is necessary to harness the computational power of current computers. Algebraic MultiGrid (AMG) preconditioners are…
The numerical simulation of structural mechanics applications via finite elements usually requires the solution of large-size and ill-conditioned linear systems, especially when accurate results are sought for derived variables interpolated…
Isogeometric analysis (IgA) offers enhanced approximation capabilities for the discretization of elliptic boundary-value problems, yet it results in large, sparse, and increasingly ill-conditioned linear systems due to higher…
Multigrid solvers are the standard in modern scientific computing simulations. Domain Decomposition Aggregation-Based Algebraic Multigrid, also known as the DD-$\alpha$AMG solver, is a successful realization of an algebraic multigrid solver…
The paper presents AMGCL -- an opensource C++ library implementing the algebraic multigrid method (AMG) for solution of large sparse linear systems of equations, usually arising from discretization of partial differential equations on an…
Large-scale numerical simulations often come at the expense of daunting computations. High-Performance Computing has enhanced the process, but adapting legacy codes to leverage parallel GPU computations remains challenging. Meanwhile,…
Sparse linear system solvers are computationally expensive kernels that lie at the heart of numerous applications. This paper proposes a flexible preconditioning framework to substantially reduce the time and energy requirements of this…
We present the design and optimization of a linear solver on General Purpose GPUs for the efficient and high-throughput evaluation of the marginalized graph kernel between pairs of labeled graphs. The solver implements a preconditioned…
Solving large dense linear systems and eigenvalue problems is a core requirement in many areas of scientific computing, but scaling these operations beyond a single GPU remains challenging within modern programming frameworks. While highly…
We describe the GPU implementation of shifted or multimass iterative solvers for sparse linear systems of the sort encountered in lattice gauge theory. We provide a generic tool that can be used by those without GPU programming experience…
In this paper, we develop a new parallel auxiliary grid algebraic multigrid (AMG) method to leverage the power of graphic processing units (GPUs). In the construction of the hierarchical coarse grid, we use a simple and fixed coarsening…
The present work describes the development of heterogeneous GPGPU implicit CFD coupled solvers, encompassing both density- and pressure- based approaches. In this setup, the assembled linear matrix is offloaded onto multiple GPUs using…
This paper presents a parallel preconditioning approach based on incomplete LU (ILU) factorizations in the framework of Domain Decomposition (DD) for general sparse linear systems. We focus on distributed memory parallel architectures,…
The paper proposes a combination of the subdomain deflation method and local algebraic multigrid as a scalable distributed memory preconditioner that is able to solve large linear systems of equations. The implementation of the algorithm is…
Support for lower precision computation is becoming more common in accelerator hardware due to lower power usage, reduced data movement and increased computational performance. However, computational science and engineering (CSE) problems…
This paper presents our work on designing scalable linear solvers for large-scale reservoir simulations. The main objective is to support implementation of parallel reservoir simulators on distributed-memory parallel systems, where MPI…