Related papers: Supercomputer Environment for Recursive Matrix Alg…

Recursive matrix algorithms, distributed dynamic control, scaling, stability

The report is devoted to the concept of creating block-recursive matrix algorithms for computing on a supercomputer with distributed memory and dynamic decentralized control.

Symbolic Computation · Computer Science 2025-01-10 Gennadi Malaschonok

Recursive Matrix Algorithms in Commutative Domain for Cluster with Distributed Memory

We give an overview of the theoretical results for matrix block-recursive algorithms in commutative domains and present the results of experiments that we conducted with new parallel programs based on these algorithms on a supercomputer…

Symbolic Computation · Computer Science 2019-03-12 Gennadi Malaschonok, Evgeni Ilchenko

An Efficient Solver for Sparse Linear Systems Based on Rank-Structured Cholesky Factorization

Direct factorization methods for the solution of large, sparse linear systems that arise from PDE discretizations are robust, but typically show poor time and memory scalability for large systems. In this paper, we describe an efficient…

Numerical Analysis · Computer Science 2015-07-21 Jeffrey N. Chadwick, David S. Bindel

Memory-Usage Advantageous Block Recursive Matrix Inverse

The inversion of extremely high order matrices has been a challenging task because of the limited processing and memory capacity of conventional computers. In a scenario in which the data does not fit in memory, it is worth considering…

Numerical Analysis · Mathematics 2018-05-08 Iria C. S. Cosme, Isaac F. Fernandes, João L. de Carvalho, Samuel Xavier-de-Souza

Make the most of what you have: Resource-efficient randomized algorithms for matrix computations

In recent years, randomized algorithms have established themselves as fundamental tools in computational linear algebra, with applications in scientific computing, machine learning, and quantum information science. Many randomized matrix…

Numerical Analysis · Mathematics 2025-12-19 Ethan N. Epperly

Compiling Recurrences over Dense and Sparse Arrays

Recurrence equations lie at the heart of many computational paradigms including dynamic programming, graph analysis, and linear solvers. These equations are often expensive to compute and much work has gone into optimizing them for…

Programming Languages · Computer Science 2023-09-12 Shiv Sundram, Muhammad Usman Tariq, Fredrik Kjolstad

DuctTeip: An efficient programming model for distributed task based parallel computing

Current high-performance computer systems used for scientific computing typically combine shared memory computational nodes in a distributed memory environment. Extracting high performance from these complex systems requires tailored…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-01-14 Afshin Zafari, Elisabeth Larsson, Martin Tillenius

Fast Sparse Matrix Permutation for Mesh-Based Direct Solvers

We present a fast sparse matrix permutation algorithm tailored to linear systems arising from triangle meshes. Our approach produces nested-dissection-style permutations while significantly reducing permutation runtime overhead. Rather than…

Graphics · Computer Science 2026-02-03 Behrooz Zarebavami, Ahmed H. Mahmoud, Ana Dodik, Changcheng Yuan, Serban D. Porumbescu, John D. Owens, Maryam Mehri Dehnavi, Justin Solomon

Recursive blocked algorithms for linear systems with Kronecker product structure

Recursive blocked algorithms have proven to be highly efficient at the numerical solution of the Sylvester matrix equation and its generalizations. In this work, we show that these algorithms extend in a seamless fashion to…

Numerical Analysis · Mathematics 2019-05-24 Minhong Chen, Daniel Kressner

Fast and Robust Recursive Algorithms for Separable Nonnegative Matrix Factorization

In this paper, we study the nonnegative matrix factorization problem under the separability assumption (that is, there exists a cone spanned by a small subset of the columns of the input nonnegative data matrix containing all columns),…

Machine Learning · Statistics 2014-04-07 Nicolas Gillis, Stephen A. Vavasis

GPU Accelerated Sparse Cholesky Factorization

The solution of sparse symmetric positive definite linear systems is an important computational kernel in large-scale scientific and engineering modeling and simulation. We will solve the linear systems using a direct method, in which a…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-02-13 M. Ozan Karsavuran, Esmond G. Ng, Barry W. Peyton

Embrace rejection: Kernel matrix approximation by accelerated randomly pivoted Cholesky

Randomly pivoted Cholesky (RPCholesky) is an algorithm for constructing a low-rank approximation of a positive-semidefinite matrix using a small number of columns. This paper develops an accelerated version of RPCholesky that employs block…

Numerical Analysis · Mathematics 2025-04-08 Ethan N. Epperly, Joel A. Tropp, Robert J. Webber

Towards an Efficient Tile Matrix Inversion of Symmetric Positive Definite Matrices on Multicore Architectures

The algorithms in the current sequential numerical linear algebra libraries (e.g. LAPACK) do not parallelize well on multicore architectures. A new family of algorithms, the tile algorithms, has recently been introduced. Previous research…

Mathematical Software · Computer Science 2010-02-23 Emmanuel Agullo, Henricus Bouwmeester, Jack Dongarra, Jakub Kurzak, Julien Langou, Lee Rosenberg

Algorithm 979: Recursive Algorithms for Dense Linear Algebra -- The ReLAPACK Collection

To exploit both memory locality and the full performance potential of highly tuned kernels, dense linear algebra libraries such as LAPACK commonly implement operations as blocked algorithms. However, to achieve next-to-optimal performance…

Mathematical Software · Computer Science 2022-04-08 Elmar Peise, Paolo Bientinesi

Analysis of randomized CholeskyQR for sparse matrices

This work is about rounding error analysis of randomized CholeskyQR-type algorithms for sparse matrices. We often encounter QR factorization of the sparse matrices in many real problems. In this work, we focus on some typical…

Numerical Analysis · Mathematics 2025-11-10 Haoran Guan, Yuwei Fan

The Reverse Cuthill-McKee Algorithm in Distributed-Memory

Ordering vertices of a graph is key to minimize fill-in and data structure size in sparse direct solvers, maximize locality in iterative solvers, and improve performance in graph algorithms. Except for naturally parallelizable ordering…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-10-27 Ariful Azad, Mathias Jacquelin, Aydin Buluc, Esmond G. Ng

Scalable hierarchical parallel algorithm for the solution of super large-scale sparse linear equations

The parallel linear equations solver capable of effectively using 1000+ processors becomes the bottleneck of large-scale implicit engineering simulations. In this paper, we present a new hierarchical parallel master-slave-structural…

Computational Physics · Physics 2015-06-11 Ran Xu, Bin Liu, Yuan Dong

Blockwise inversion and algorithms for inverting large partitioned matrices

Block matrix structure commonly arises in various physics and engineering applications. There are various advantages in preserving the block structure while computing the inversion of such partitioned matrices. In this context, using…

Numerical Analysis · Mathematics 2023-11-22 R. Thiru Senthil

Iterative Refinement and Flexible Iteratively Reweighed Solvers for Linear Inverse Problems with Sparse Solutions

This paper presents a new algorithmic framework for computing sparse solutions to large-scale linear discrete ill-posed problems. The approach is motivated by recent perspectives on iteratively reweighted norm schemes, viewed through the…

Numerical Analysis · Mathematics 2025-02-05 Lucas Onisk, Malena Sabaté Landman

Highly Scalable Multiplication for Distributed Sparse Multivariate Polynomials on Many-core Systems

We present a highly scalable algorithm for multiplying sparse multivariate polynomials represented in a distributed format. This algorithm targets not only shared memory multicore computers, but also computer clusters or specialized…

Symbolic Computation · Computer Science 2013-04-01 Mickael Gastineau, Jacques Laskar