Related papers: A 3D Parallel Algorithm for QR Decomposition

200 papers

The QR Decomposition (QRD) of communication channel matrices is a fundamental prerequisite to several detection schemes in Multiple-Input Multiple-Output (MIMO) communication systems. Herein, the main feature of the QRD is to transform the…

Other Computer Science · Computer Science 2016-11-17 Sebastien Aubert, Manar Mohaisen, Fabienne Nouvel, KyungHi Chang

This paper presents a reexamination of the research paper titled "Communication-Avoiding Parallel Algorithms for TRSM" by Wicky et al. We focus on the communication bandwidth cost analysis presented in the original work and identify…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-07-02 Yuan Tang

In this article, we focus on the parallel communication cost of multiplying the same vector along two modes of a $3$-dimensional symmetric tensor. This is a key computation in the higher-order power method for determining eigenpairs of a…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-06-19 Hussam Al Daas, Grey Ballard, Laura Grigori, Suraj Kumar, Kathryn Rouse, Mathieu Vérité

Numerical algorithms have two kinds of costs: arithmetic and communication, by which we mean either moving data between levels of a memory hierarchy (in the sequential case) or over a network connecting processors (in the parallel case).…

Numerical Analysis · Computer Science 2011-02-02 Grey Ballard, James Demmel, Olga Holtz, Oded Schwartz

Many large-scale scientific computations require eigenvalue solvers in a scaling regime where efficiency is limited by data movement. We introduce a parallel algorithm for computing the eigenvalues of a dense symmetric matrix, which…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-04-19 Edgar Solomonik, Grey Ballard, James Demmel, Torsten Hoefler

An efficient decoding algorithm named 'divided decoder' is proposed in this paper. Divided decoding can be combined with any decoder using QR-decomposition and offers different pairs of performance and complexity. Divided decoding provides…

Information Theory · Computer Science 2009-01-23 In Sook Park

We investigate iterative low-resolution message-passing algorithms for quasi-cyclic LDPC codes with horizontal and vertical layered schedules. Coarse quantization and layered scheduling are highly relevant for hardware implementations to…

Information Theory · Computer Science 2022-12-19 Philipp Mohr, Gerhard Bauch

This paper introduces fast R updating algorithms specifically designed for statistical applications, including regression, filtering, and model selection, where data structures change frequently. Although traditional QR decomposition is…

Methodology · Statistics 2026-03-09 Mauro Bernardi, Claudio Busatto, Manuela Cattelan

Algorithms have two costs: arithmetic and communication. The latter represents the cost of moving data, either between levels of a memory hierarchy, or between processors over a network. Communication often dominates arithmetic and…

Numerical Analysis · Mathematics 2010-11-16 Grey Ballard, James Demmel, Ioana Dumitriu

We address the problem of performing message-passing-based decoding of quantum LDPC codes under hardware latency limitations. We propose a novel way to do layered decoding that suits quantum constraints and outperforms flooded scheduling,…

Quantum Physics · Physics 2023-08-28 Julien Du Crest, Francisco Garcia-Herrero, Mehdi Mhalla, Valentin Savin, Javier Valls

Scalable QR factorization algorithms for solving least squares and eigenvalue problems are critical given the increasing parallelism within modern machines. We introduce a more general parallelization of the CholeskyQR2 algorithm and show…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-06-18 Edward Hutter, Edgar Solomonik

In this paper we study the tradeoff between parallelism and communication cost in a map-reduce computation. For any problem that is not "embarrassingly parallel," the finer we partition the work of the reducers so that more parallelism can…

Distributed, Parallel, and Cluster Computing · Computer Science 2012-06-21 Foto N. Afrati, Anish Das Sarma, Semih Salihoglu, Jeffrey D. Ullman

Communication-avoiding algorithms allow redundant computations to minimize the number of inter-process communications. In this paper, we propose to exploit this redundancy for fault-tolerance purpose. We illustrate this idea with QR…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-11-03 Camille Coti

Detection algorithms for multiple-input multiple-output (MIMO) wireless systems based on orthogonal frequency-division multiplexing (OFDM) typically require the computation of a QR decomposition for each of the data-carrying OFDM tones. The…

Information Theory · Computer Science 2009-10-30 Davide Cescato, Helmut Bölcskei

We present parallel and sequential dense QR factorization algorithms that are both optimal (up to polylogarithmic factors) in the amount of communication they perform, and just as stable as Householder QR. We prove optimality by extending…

Numerical Analysis · Mathematics 2008-08-21 James Demmel, Laura Grigori, Mark Hoemmen, Julien Langou

On modern parallel architectures, the cost of synchronization among processors can often dominate the cost of floating-point computation. Several modifications of the existing methods have been proposed in order to keep the communication…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-12-10 Qinmeng Zou, Frederic Magoules

We propose a novel algorithm for using Hopfield networks to denoise QR codes. Hopfield networks have mostly been used as a noise tolerant memory or to solve difficult combinatorial problems. One of the major drawbacks in their use in noise…

Computer Vision and Pattern Recognition · Computer Science 2018-12-14 Ishan Bhatnagar, Shubhang Bhatnagar

While the proper orthogonal decomposition (POD) is optimal under certain norms, it is also expensive to compute. For large matrix sizes, it is well known that the QR decomposition provides a tractable alternative. Under the assumption that it…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-12-18 Harbir Antil, Dangxing Chen, Scott E. Field

We present parallel and sequential dense QR factorization algorithms for tall and skinny matrices and general rectangular matrices that both minimize communication, and are as stable as Householder QR. The sequential and parallel algorithms…

Numerical Analysis · Mathematics 2008-09-16 James Demmel, Laura Grigori, Mark Hoemmen, Julien Langou

As multicore systems continue to gain ground in the High Performance Computing world, linear algebra algorithms have to be reformulated or new algorithms have to be developed in order to take advantage of the architectural features on these…

Numerical Analysis · Mathematics 2008-08-12 Alfredo Buttari, Julien Langou, Jakub Kurzak, Jack Dongarra