Related papers: A Bit-Compatible Shared Memory Parallelization for…
This article introduces HYLU, a hybrid parallel LU factorization-based general-purpose solver designed for efficiently solving sparse linear systems (Ax=b) on multi-core shared-memory architectures. The key technical feature of HYLU is the…
This paper presents a parallel preconditioning approach based on incomplete LU (ILU) factorizations in the framework of Domain Decomposition (DD) for general sparse linear systems. We focus on distributed memory parallel architectures,…
There are variety of computational algorithms need sequential sweeping; sweeping based on specific order; on a structured grid, e.g., preconditioning (smoothing) by SOR or ILU methods and solution of eikonal equation by fast sweeping…
The ILU-based preconditioning methods in previous work have their own parameters to improve their performances. Although the parameters may degrade the performance, their determination is left to users. Thus, these previous methods are not…
This research investigates the implementation mechanism of block-wise ILU(k) preconditioner on GPU. The block-wise ILU(k) algorithm requires both the level k and the block size to be designed as variables. A decoupled ILU(k) algorithm…
Incomplete factorization is a powerful preconditioner for Krylov subspace methods for solving large-scale sparse linear systems. Existing incomplete factorization techniques, including incomplete Cholesky and incomplete LU factorizations,…
In this work, we present a new scalable incomplete LU factorization framework called Javelin to be used as a preconditioner for solving sparse linear systems with iterative methods. Javelin allows for improved parallel factorization on…
In-memory computing (IMC) has been shown to be a promising approach for solving binary optimization problems while significantly reducing energy and latency. Building on the advantages of parallel computation, we propose an IMC-compatible…
This paper presents a hybrid CPU-GPU framework for solving combinatorial scheduling problems formulated as Integer Linear Programming (ILP). While scheduling underpins many optimization tasks in computing systems, solving these problems…
Incomplete factorization is a widely used preconditioning technique for Krylov subspace methods for solving large-scale sparse linear systems. Its multilevel variants, such as ILUPACK, are more robust for many symmetric or unsymmetric…
We present a study of the effectiveness of asynchronous incomplete LU factorization preconditioners for the time-implicit solution of compressible flow problems while exploiting thread-parallelism within a compute node. A block variant of…
Preconditioning for overdetermined least-squares problems has received comparatively little attention, and designing methods that are both effective and memory-efficient remains challenging. We propose a class of ILU-based preconditioners…
We describe an efficient parallel implementation of the selected inversion algorithm for distributed memory computer systems, which we call \texttt{PSelInv}. The \texttt{PSelInv} method computes selected elements of a general sparse matrix…
The Simplex tableau has been broadly used and investigated in the industry and academia. With the advent of the big data era, ever larger problems are posed to be solved in ever larger machines whose architecture type did not exist in the…
This system paper documents the technical foundations for the extension of the Topology ToolKit (TTK) to distributed-memory parallelism with the Message Passing Interface (MPI). While several recent papers introduced topology-based…
In recent decades, High Performance Computing (HPC) has undergone significant enhancements, particularly in the realm of hardware platforms, aimed at delivering increased processing power while keeping power consumption within reasonable…
Scalable sparse LU factorization is critical for efficient numerical simulation of circuits and electrical power grids. In this work, we present a new scalable sparse direct solver called Basker. Basker introduces a new algorithm to…
There has been a growing interest in parallel strategies for solving trajectory optimization problems. One key step in many algorithmic approaches to trajectory optimization is the solution of moderately-large and sparse linear systems.…
Integer Linear Programming (ILP) serves as a versatile framework for modeling a wide range of combinatorial optimization problems, typically addressed by sophisticated exact solvers or heuristics. While learning-based approaches have…
Sparse linear iterative solvers are essential for many large-scale simulations. Much of the runtime of these solvers is often spent in the implicit evaluation of matrix polynomials via a sequence of sparse matrix-vector products. A variety…