Related papers: Parallel QR Factorization of Block Low-Rank Matric…

200 papers

As multicore systems continue to gain ground in the High Performance Computing world, linear algebra algorithms have to be reformulated or new algorithms have to be developed in order to take advantage of the architectural features on these…

Numerical Analysis · Mathematics 2008-08-12 Alfredo Buttari, Julien Langou, Jakub Kurzak, Jack Dongarra

This paper describes a new QR factorization algorithm which is especially designed for massively parallel platforms combining parallel distributed multi-core nodes. These platforms make the present and the foreseeable future of…

Distributed, Parallel, and Cluster Computing · Computer Science 2012-08-27 Jack Dongarra, Mathieu Faverge, Thomas Herault, Julien Langou, Yves Robert

We present parallel and sequential dense QR factorization algorithms that are both optimal (up to polylogarithmic factors) in the amount of communication they perform, and just as stable as Householder QR. Our first algorithm, Tall Skinny…

Numerical Analysis · Computer Science 2008-08-29 James Demmel, Laura Grigori, Mark Hoemmen, Julien Langou

The unpivoted and pivoted Householder QR factorizations are ubiquitous in numerical linear algebra. A difficulty with pivoted Householder QR is the communication bottleneck introduced by pivoting. In this paper we propose using random…

Numerical Analysis · Mathematics 2017-03-08 Stephen Becker, James Folberth, Laura Grigori

In this work, we develop a new fast algorithm, spaQR -- sparsified QR, for solving large, sparse linear systems. The key to our approach is using low-rank approximations to sparsify the separators in a Nested Dissection based Householder QR…

Numerical Analysis · Mathematics 2020-10-15 Abeynaya Gnanasekaran, Eric Darve

Recent advances in transformer-based foundation models have made them the default choice for many tasks, but their rapidly growing size makes fitting a full model on a single GPU increasingly difficult and their computational cost…

Machine Learning · Computer Science 2026-01-21 Pierre Abillama, Changwoo Lee, Juechu Dong, David Blaauw, Dennis Sylvester, Hun-Seok Kim

A fundamental problem when adding column pivoting to the Householder QR factorization is that only about half of the computation can be cast in terms of high performing matrix-matrix multiplications, which greatly limits the benefits that…

Numerical Analysis · Mathematics 2016-12-08 Per-Gunnar Martinsson, Gregorio Quintana-Orti, Nathan Heavner, Robert van de Geijn

As multicore systems continue to gain ground in the High Performance Computing world, linear algebra algorithms have to be reformulated or new algorithms have to be developed in order to take advantage of the architectural features on these…

Mathematical Software · Computer Science 2008-06-12 Alfredo Buttari, Julien Langou, Jakub Kurzak, Jack Dongarra

Efficient task scheduling is paramount in parallel programming on multi-core architectures, where tasks are fundamental computational units. QR factorization is a critical sub-routine in Sequential Least Squares Quadratic Programming…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-06-12 Soumyajit Chatterjee, Rahul Utkoor, Uppu Eshwar, Sathya Peri, V. Krishna Nandivada

The dominant contribution to communication complexity in factorizing a matrix using QR with column pivoting is due to column-norm updates that are required to process pivot decisions. We use randomized sampling to approximate this process…

Numerical Analysis · Mathematics 2018-01-23 Jed A. Duersch, Ming Gu

We present parallel and sequential dense QR factorization algorithms for tall and skinny matrices and general rectangular matrices that both minimize communication, and are as stable as Householder QR. The sequential and parallel algorithms…

Numerical Analysis · Mathematics 2008-09-16 James Demmel, Laura Grigori, Mark Hoemmen, Julien Langou

The current computer architecture has moved towards the multi/many-core structure. However, the algorithms in the current sequential dense numerical linear algebra libraries (e.g. LAPACK) do not parallelize well on multi/many-core…

Numerical Analysis · Computer Science 2013-03-14 Henricus Bouwmeester

In this paper, a hierarchical Tucker low-rank (HTLR) matrix is proposed to approximate non-oscillatory kernel functions in linear complexity. The HTLR matrix is based on the hierarchical matrix, with the low-rank blocks replaced by Tucker…

Numerical Analysis · Mathematics 2025-08-11 Yingzhou Li, Jingyu Liu

In this work, we present randomized compression algorithms for flat rank-structured matrices with shared bases, termed uniform Block Low-Rank (BLR) matrices. Our main contribution is a technique called tagging, which improves upon the…

Numerical Analysis · Mathematics 2025-12-16 Katherine J. Pearce, Anna Yesypenko, James Levitt, Per-Gunnar Martinsson

We consider the problem of computing a QR (or QZ) decomposition of a real, dense, tall and very skinny matrix. That is, the number of columns is tiny compared to the number of rows, rendering most computations completely or partially…

Mathematical Software · Computer Science 2026-03-24 Jonas Thies, Melven Röhrig-Zöllner

The efficient and accurate QR decomposition for matrices with hierarchical low-rank structures, such as HODLR and hierarchical matrices, has been challenging. Existing structure-exploiting algorithms are prone to numerical instability as…

Numerical Analysis · Mathematics 2018-09-28 Daniel Kressner, Ana Susnjara

In this work, we develop a fast hierarchical solver for solving large, sparse least squares problems. We build upon the algorithm, spaQR (sparsified QR), that was developed by the authors to solve large sparse linear systems. Our algorithm…

Numerical Analysis · Mathematics 2021-03-05 Abeynaya Gnanasekaran, Eric Darve

Factorizing large matrices by QR with column pivoting (QRCP) is substantially more expensive than QR without pivoting, owing to communication costs required for pivoting decisions. In contrast, randomized QRCP (RQRCP) algorithms have proven…

Numerical Analysis · Mathematics 2018-04-17 Jianwei Xiao, Ming Gu, Julien Langou

Scalable QR factorization algorithms for solving least squares and eigenvalue problems are critical given the increasing parallelism within modern machines. We introduce a more general parallelization of the CholeskyQR2 algorithm and show…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-06-18 Edward Hutter, Edgar Solomonik

This paper proposes a scalable binary CUR low-rank approximation algorithm that leverages parallel selection of representative rows and columns within a deterministic framework. By employing a blockwise adaptive cross approximation…

Numerical Analysis · Mathematics 2025-03-05 Bowen Su