Related papers: Exponential Integrators on Graphic Processing Unit…

200 papers

Stencil computations are widely used in HPC applications. Today, many HPC platforms use GPUs as accelerators. As a result, understanding how to perform stencil computations fast on GPUs is important. While implementation strategies for…

Distributed, Parallel, and Cluster Computing · Computer Science · 2020-09-16 · Ryuichi Sai, John Mellor-Crummey, Xiaozhu Meng, Mauricio Araya-Polo, Jie Meng

We present an open-source CUDA-based package that consists of a compilation of exponential integrators where the action of the matrix exponential or the $\varphi_l$ functions on a vector is approximated using the method of polynomial…

Numerical Analysis · Mathematics · 2024-10-23 · Pranab J Deka, Alexander Moriggl, Lukas Einkemmer

The edge computing paradigm has emerged to handle cloud computing issues such as scalability, security and low response time among others. This new computing trend heavily relies on ubiquitous embedded systems on the edge. Performance and…

Distributed, Parallel, and Cluster Computing · Computer Science · 2019-01-28 · Mohammad Hosseinabady, Mohd Amiruddin Bin Zainol, Jose Nunez-Yanez

Over the last ten years, graphics processors have become the de facto accelerator for data-parallel tasks in various branches of high-performance computing, including machine learning and computational sciences. However, with the recent…

Distributed, Parallel, and Cluster Computing · Computer Science · 2025-05-28 · Johannes Pekkilä, Oskar Lappi, Fredrik Robertsén, Maarit J. Korpi-Lagg

In this thesis we develop techniques to efficiently solve numerical Partial Differential Equations (PDEs) using Graphical Processing Units (GPUs). The focus is on both the performance and the reusability of the methods developed; to this end a…

Numerical Analysis · Mathematics · 2021-01-19 · Andrew Gloster

Accelerated computing is widely used in high-performance computing. Therefore, it is crucial to experiment and discover how to better utilize the latest generations of GPGPUs on relevant applications. In this paper, we present results and share…

Distributed, Parallel, and Cluster Computing · Computer Science · 2024-08-13 · Baodi Shan, Mauricio Araya-Polo

Exponential Runge-Kutta methods are a well-established tool for the numerical integration of parabolic evolution equations. However, these schemes are typically developed under the assumption of homogeneous boundary conditions. In this…

Numerical Analysis · Mathematics · 2025-10-27 · Carlos Arranz-Simón, Alexander Ostermann

The simulation of the two-dimensional Ising model is used as a benchmark to show the computational capabilities of Graphic Processing Units (GPUs). The rich programming environment now available on GPUs and flexible hardware capabilities…

Distributed, Parallel, and Cluster Computing · Computer Science · 2020-08-26 · Joshua Romero, Mauro Bisson, Massimiliano Fatica, Massimo Bernaschi

Block iterative methods are extremely important as smoothers for multigrid methods, as preconditioners for Krylov methods, and as solvers for diagonally dominant linear systems. Developing robust and efficient algorithms suitable for…

Distributed, Parallel, and Cluster Computing · Computer Science · 2019-07-16 · Manuel Birke, Bobby Philip, Zhen Wang, Mark Berrill

Stencils represent a class of computational patterns where an output grid point depends on a fixed shape of neighboring points in an input grid. Stencil computations are prevalent in scientific applications engaging a significant portion of…

Distributed, Parallel, and Cluster Computing · Computer Science · 2021-03-24 · Jesmin Jahan Tithi, Fabrizio Petrini, Hongbo Rong, Andrei Valentin, Carl Ebeling

In this paper, we propose a $\mu$-mode integrator for computing the solution of stiff evolution equations. The integrator is based on a $d$-dimensional splitting approach and uses exact (usually precomputed) one-dimensional matrix…

Numerical Analysis · Mathematics · 2022-02-09 · Marco Caliari, Fabio Cassini, Lukas Einkemmer, Alexander Ostermann, Franco Zivcovich

Matrix-accelerated stencil computation is a hot research topic, yet its application to three-dimensional (3D) high-order stencils and HPC remains underexplored. With the emergence of matrix units on multicore CPUs, we analyze matrix-based…

Distributed, Parallel, and Cluster Computing · Computer Science · 2025-07-16 · Yinuo Wang, Tianqi Mao, Lin Gan, Wubing Wan, Zeyu Song, Jiayu Fu, Lanke He, Wenqiang Wang, Zekun Yin, Wei Xue, Guangwen Yang

Linear solvers are major computational bottlenecks in a wide range of decision support and optimization computations. The challenges become even more pronounced on heterogeneous hardware, where traditional sparse numerical linear algebra…

Computational Engineering, Finance, and Science · Computer Science · 2024-01-26 · Kasia Świrydowicz, Nicholson Koukpaizan, Maksudul Alam, Shaked Regev, Michael Saunders, Slaven Peleš

The paper considers the problem of implementation on graphics processors of numerical integration routines for higher order finite element approximations. The design of suitable GPU kernels is investigated in the context of general purpose…

Mathematical Software · Computer Science · 2014-03-03 · Krzysztof Banaś, Przemysław Płaszewski, Paweł Macioł

This paper proposes a versatile high-performance execution model, inspired by systolic arrays, for memory-bound regular kernels running on CUDA-enabled GPUs. We formulate a systolic model that shifts partial sums by CUDA warp primitives for…

Distributed, Parallel, and Cluster Computing · Computer Science · 2019-09-09 · Peng Chen, Mohamed Wahib, Shinichiro Takizawa, Ryousei Takano, Satoshi Matsuoka

Recent developments in High Level Synthesis tools have attracted software programmers to accelerate their high-performance computing applications on FPGAs. Even though it has been shown that FPGAs can compete with GPUs in terms of…

Distributed, Parallel, and Cluster Computing · Computer Science · 2019-10-16 · Hamid Reza Zohouri, Artur Podobas, Satoshi Matsuoka

Optimizing the performance of stencil algorithms has been the subject of intense research over the last two decades. Since many stencil schemes have low arithmetic intensity, most optimizations focus on increasing the temporal data access…

Distributed, Parallel, and Cluster Computing · Computer Science · 2015-10-19 · Tareq Malas, Georg Hager, Hatem Ltaief, David Keyes

We present a new adaptive parallel algorithm for the challenging problem of multi-dimensional numerical integration on massively parallel architectures. Adaptive algorithms have demonstrated the best performance, but efficient many-core…

Distributed, Parallel, and Cluster Computing · Computer Science · 2021-06-24 · Ioannis Sakiotis, Kamesh Arumugam, Marc Paterno, Desh Ranjan, Balša Terzić, Mohammad Zubair

Iterative stencils are used widely across the spectrum of High Performance Computing (HPC) applications. Many efforts have been put into optimizing stencil GPU kernels, given the prevalence of GPU-accelerated supercomputers. To improve the…

Distributed, Parallel, and Cluster Computing · Computer Science · 2023-05-15 · Lingqi Zhang, Mohamed Wahib, Peng Chen, Jintao Meng, Xiao Wang, Toshio Endo, Satoshi Matsuoka

We present new algorithms for the parallelization of Eulerian-Lagrangian interaction operations in the immersed boundary method. Our algorithms rely on two well-studied parallel primitives: key-value sort and segmented reduce. The use of…

Distributed, Parallel, and Cluster Computing · Computer Science · 2021-11-02 · Andrew Kassen, Varun Shankar, Aaron L Fogelson