Related papers: The N-shaped partition method: A novel parallel im…

A Batched GPU Methodology for Numerical Solutions of Partial Differential Equations

In this paper we present a methodology for data accesses when solving batches of Tridiagonal and Pentadiagonal matrices that all share the same left-hand-side (LHS) matrix. The intended application is to the numerical solution of Partial…

Computational Physics · Physics 2021-07-13 Enda Carroll, Andrew Gloster, Miguel D. Bustamante, Lennon Ó Náraigh

GPU-Accelerated Cholesky Factorization of Block Tridiagonal Matrices

This paper presents a GPU-accelerated framework for solving block tridiagonal linear systems that arise naturally in numerous real-time applications across engineering and scientific computing. Through a multi-stage permutation strategy…

Optimization and Control · Mathematics 2026-01-08 Roland Schwan, Daniel Kuhn, Colin N. Jones

Scalable Edge Partitioning

Edge-centric distributed computations have appeared as a recent technique to improve the shortcomings of think-like-a-vertex algorithms on large scale-free networks. In order to increase parallelism on this model, edge partitioning -…

Data Structures and Algorithms · Computer Science 2018-10-12 Sebastian Schlag, Christian Schulz, Daniel Seemaier, Darren Strash

An overlapping domain decomposition splitting algorithm for stochastic nonlinear Schroedinger equation

A novel overlapping domain decomposition splitting algorithm based on a Crank-Nicolson method is developed for the stochastic nonlinear Schroedinger equation driven by a multiplicative noise with non-periodic boundary conditions. The…

Numerical Analysis · Mathematics 2023-09-08 Lihai Ji

Harnessing Batched BLAS/LAPACK Kernels on GPUs for Parallel Solutions of Block Tridiagonal Systems

Block-tridiagonal systems are prevalent in state estimation and optimal control, and solving these systems is often the computational bottleneck. Improving the underlying solvers therefore has a direct impact on the real-time performance of…

Mathematical Software · Computer Science 2025-12-05 David Jin, Alexis Montoison, Sungho Shin

Distributed Edge Partitioning for Trillion-edge Graphs

We propose Distributed Neighbor Expansion (Distributed NE), a parallel and distributed graph partitioning method that can scale to trillion-edge graphs while providing high partitioning quality. Distributed NE is based on a new heuristic,…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-09-24 Masatoshi Hanai, Toyotaro Suzumura, Wen Jun Tan, Elvis Liu, Georgios Theodoropoulos, Wentong Cai

A Highly Scalable TDMA for GPUs and Its Application to Flow Solver Optimization

A tridiagonal matrix algorithm (TDMA), Pipelined-TDMA, is developed for multi-GPU systems to resolve the scalability bottlenecks caused by the sequential structure of conventional divide-and-conquer TDMA. The proposed method pipelines…

Computational Physics · Physics 2025-09-05 Seungchan Kim, Jihoo Kim, Sanghyun Ha, Donghyun You

New features of parallel implementation of N-body problems on GPU

This paper focuses on the parallel implementation of a direct $N$-body method (particle-particle algorithm) and the application of multiple GPUs for galactic dynamics simulations. Application of a hybrid OpenMP-CUDA technology is considered…

Computational Physics · Physics 2018-03-06 S. S. Khrapov, S. A. Khoperskov, A. V. Khoperskov

A Distributed Algorithm for Multi-scale Multi-stage Stochastic Programs with Application to Electricity Capacity Expansion

This paper applies the N-block PCPM algorithm to solve multi-scale multi-stage stochastic programs, with the application to electricity capacity expansion models. Numerical results show that the proposed simplified N-block PCPM algorithm,…

Optimization and Control · Mathematics 2021-03-29 Run Chen, Andrew L. Liu

A splitting algorithm for constrained optimization problems with parabolic equations

In this paper, an efficient parallel splitting method is proposed for the optimal control problem with parabolic equation constraints. The linear finite element is used to approximate the state variable and the control variable in spatial…

Optimization and Control · Mathematics 2023-02-21 Haiming Song, Jiachuan Zhang, Yongle Hao

A time-optimal algorithm for solving (block-)tridiagonal linear systems of dimension N on a distributed computer of N nodes

We are concerned with the fastest possible direct numerical solution algorithm for a thin-banded or tridiagonal linear system of dimension $N$ on a distributed computing network of $N$ nodes that is connected in a binary communication tree.…

Numerical Analysis · Mathematics 2018-02-02 Martin Neuenhofen

GPU Methodologies for Numerical Partial Differential Equations

In this thesis we develop techniques to efficiently solve numerical Partial Differential Equations (PDEs) using Graphical Processing Units (GPUs). Focus is put on both performance and re-usability of the methods developed, to this end a…

Numerical Analysis · Mathematics 2021-01-19 Andrew Gloster

Hypergraph Partitioning on GPU with Distinct Incident Hyperedges and Size Constraints

Hypergraph partitioning is a recurring NP-hard problem in engineering; its efficient solution at scale hinges on parallelism. This work proposes a GPU-centric algorithm for multi-level hypergraph partitioning aimed at a specific set of…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-21 Marco Ronzani, Cristina Silvano

Parallel algorithms in linear algebra

This report provides an introduction to algorithms for fundamental linear algebra problems on various parallel computer architectures, with the emphasis on distributed-memory MIMD machines. To illustrate the basic concepts and key issues,…

Data Structures and Algorithms · Computer Science 2015-03-17 Richard P. Brent

Fractional Crank-Nicolson-Galerkin finite element methods for nonlinear time fractional parabolic problems with time delay

A linearized numerical scheme is proposed to solve the nonlinear time fractional parabolic problems with time delay. The scheme is based on the standard Galerkin finite element method in the spatial direction, the fractional Crank-Nicolson…

Numerical Analysis · Mathematics 2021-09-10 Lili Li, Mianfu She, Yuanling Niu

ML-Based Optimum Number of CUDA Streams for the GPU Implementation of the Tridiagonal Partition Method

This paper presents a heuristic for finding the optimum number of CUDA streams by using tools common to the modern AI-oriented approaches and applied to the parallel partition algorithm. A time complexity model for the GPU realization of…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-22 Milena Veneva, Toshiyuki Imamura

A GPU-accelerated Cartesian grid method is proposed for solving the heat, wave, and Schrodinger equations on irregular domains

This paper introduces a second-order method for solving general elliptic partial differential equations (PDEs) on irregular domains using GPU acceleration, based on Ying's kernel-free boundary integral (KFBI) method. The method addresses…

Numerical Analysis · Mathematics 2024-04-24 Liwei Tan, Minsheng Huang, Wenjun Ying

Parallel algorithms for problems of cluster analysis with very large amount of data

In this paper we solve on GPUs massive problems with large amounts of data, which are not appropriate for solution with the SIMD technology. For the given problem we consider a three-level parallelization. The multithreading of CPU is used…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-02-18 Natalya Litvinenko

A general-purpose hierarchical mesh partitioning method with node balancing strategies for large-scale numerical simulations

Large-scale parallel numerical simulations are essential for a wide range of engineering problems that involve complex, coupled physical processes interacting across a broad range of spatial and temporal scales. The data structures involved…

Mathematical Software · Computer Science 2018-10-11 Fande Kong, Roy H. Stogner, Derek R. Gaston, John W. Peterson, Cody J. Permann, Andrew E. Slaughter, Richard C. Martineau

An initial investigation of the performance of GPU-based swept time-space decomposition

Simulations of physical phenomena are essential to the expedient design of precision components in aerospace and other high-tech industries. These phenomena are often described by mathematical models involving partial differential equations…

Computational Physics · Physics 2017-01-05 Daniel Magee, Kyle E Niemeyer