Related papers: Parallel computation of echelon forms

A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures

As multicore systems continue to gain ground in the High Performance Computing world, linear algebra algorithms have to be reformulated or new algorithms have to be developed in order to take advantage of the architectural features on these…

Mathematical Software · Computer Science 2008-06-12 Alfredo Buttari , Julien Langou , Jakub Kurzak , Jack Dongarra

Tiled Algorithms for Matrix Computations on Multicore Architectures

The current computer architecture has moved towards the multi/many-core structure. However, the algorithms in the current sequential dense numerical linear algebra libraries (e.g. LAPACK) do not parallelize well on multi/many-core…

Numerical Analysis · Computer Science 2013-03-14 Henricus Bouwmeester

A Framework for Practical Parallel Fast Matrix Multiplication

Matrix multiplication is a fundamental computation in many scientific disciplines. In this paper, we show that novel fast matrix multiplication algorithms can significantly outperform vendor implementations of the classical algorithm and…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-01-08 Austin R. Benson , Grey Ballard

Parallel Tiled QR Factorization for Multicore Architectures

As multicore systems continue to gain ground in the High Performance Computing world, linear algebra algorithms have to be reformulated or new algorithms have to be developed in order to take advantage of the architectural features on these…

Numerical Analysis · Mathematics 2008-08-12 Alfredo Buttari , Julien Langou , Jakub Kurzak , Jack Dongarra

Towards an Efficient Tile Matrix Inversion of Symmetric Positive Definite Matrices on Multicore Architectures

The algorithms in the current sequential numerical linear algebra libraries (e.g. LAPACK) do not parallelize well on multicore architectures. A new family of algorithms, the tile algorithms, has recently been introduced. Previous research…

Mathematical Software · Computer Science 2010-02-23 Emmanuel Agullo , Henricus Bouwmeester , Jack Dongarra , Jakub Kurzak , Julien Langou , Lee Rosenberg

A parallel algorithm for Gaussian elimination over finite fields

In this paper we describe a parallel Gaussian elimination algorithm for matrices with entries in a finite field. Unlike previous approaches, our algorithm subdivides a very large input matrix into smaller submatrices by subdividing both…

Rings and Algebras · Mathematics 2018-06-13 Stephen Linton , Gabriele Nebe , Alice Niemeyer , Richard Parker , Jon Thackray

An inherently parallel H2-ULV factorization for solving dense linear systems on GPUs

Hierarchical low-rank approximation of dense matrices can reduce the complexity of their factorization from O(N^3) to O(N). However, the complex structure of such hierarchical matrices makes them difficult to parallelize. The block size and…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-02-05 Qianxiang Ma , Rio Yokota

Many Sequential Iterative Algorithms Can Be Parallel and (Nearly) Work-efficient

To design efficient parallel algorithms, some recent papers showed that many sequential iterative algorithms can be directly parallelized but there are still challenges in achieving work-efficiency and high-parallelism. Work-efficiency can…

Data Structures and Algorithms · Computer Science 2022-05-27 Zheqi Shen , Zijin Wan , Yan Gu , Yihan Sun

Parallel algorithms in linear algebra

This report provides an introduction to algorithms for fundamental linear algebra problems on various parallel computer architectures, with the emphasis on distributed-memory MIMD machines. To illustrate the basic concepts and key issues,…

Data Structures and Algorithms · Computer Science 2015-03-17 Richard P. Brent

A Parallel and Efficient Algorithm for Learning to Match

Many tasks in data mining and related fields can be formalized as matching between objects in two heterogeneous domains, including collaborative filtering, link prediction, image tagging, and web search. Machine learning techniques,…

Machine Learning · Computer Science 2014-10-24 Jingbo Shang , Tianqi Chen , Hang Li , Zhengdong Lu , Yong Yu

A Scalable Shared-Memory Parallel Simplex for Large-Scale Linear Programming

The Simplex tableau has been broadly used and investigated in the industry and academia. With the advent of the big data era, ever larger problems are posed to be solved in ever larger machines whose architecture type did not exist in the…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-05-29 Demetrios Coutinho , Felipe O. Lins e Silva , Daniel Aloise , Samuel , Xavier-de-Souza

Parallel Sparse and Data-Sparse Factorization-based Linear Solvers

Efficient solutions of large-scale, ill-conditioned and indefinite algebraic equations are ubiquitously needed in numerous computational fields, including multiphysics simulations, machine learning, and data science. Because of their…

Mathematical Software · Computer Science 2026-05-25 Xiaoye Sherry Li , Yang Liu

Parallel inversion of huge covariance matrices

An extremely common bottleneck encountered in statistical learning algorithms is inversion of huge covariance matrices, examples being in evaluating Gaussian likelihoods for a large number of data points. We propose general parallel…

Methodology · Statistics 2013-12-09 Anjishnu Banerjee , Joshua Vogelstein , David Dunson

A Parallel Task-based Approach to Linear Algebra

Processors with large numbers of cores are becoming commonplace. In order to take advantage of the available resources in these systems, the programming paradigm has to move towards increased parallelism. However, increasing the level of…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-10-07 Ashkan Tousimojarad , Wim Vanderbauwhede

U-match factorization: sparse homological algebra, lazy cycle representatives, and dualities in persistent (co)homology

Persistent homology is a leading tool in topological data analysis (TDA). Many problems in TDA can be solved via homological -- and indeed, linear -- algebra. However, matrices in this domain are typically large, with rows and columns…

Algebraic Topology · Mathematics 2021-08-23 Haibin Hang , Chad Giusti , Lori Ziegelmeier , Gregory Henselman-Petrusek

Efficient Pareto Manifold Learning with Low-Rank Structure

Multi-task learning, which optimizes performance across multiple tasks, is inherently a multi-objective optimization problem. Various algorithms are developed to provide discrete trade-off solutions on the Pareto front. Recently, continuous…

Machine Learning · Computer Science 2024-07-31 Weiyu Chen , James T. Kwok

Emulating a large memory with a collection of small ones

Sequential computation is well understood but does not scale well with current technology. Within the next decade, systems will contain large numbers of processors with potentially thousands of processors per chip. Despite this, many…

Hardware Architecture · Computer Science 2015-11-17 James Hanlon

Parallel Enumeration of Triangulations

We report on the implementation of an algorithm for computing the set of all regular triangulations of finitely many points in Euclidean space. This algorithm, which we call down-flip reverse search, can be restricted, e.g., to computing…

Combinatorics · Mathematics 2018-10-30 Charles Jordan , Michael Joswig , Lars Kastner

LU factorization with errors *

We present new algorithms to detect and correct errors in the lower-upper factorization of a matrix, or the triangular linear system solution, over an arbitrary field. Our main algorithms do not require any additional information or…

Symbolic Computation · Computer Science 2019-01-31 Jean-Guillaume Dumas , Joris Van Der Hoeven , Clément Pernet , Daniel Roche

Partitioning Unstructured Sparse Tensor Algebra for Load-Balanced Parallel Execution

Sparse tensor algebra is challenging to efficiently parallelize due to the irregular, data-dependent, and potentially skewed structure of sparse computation. We propose the first partitioning algorithm that provably load balances the…

Programming Languages · Computer Science 2026-04-23 Atharva Chougule , Alexander J Root , Rubens Lacouture , Bobby Yan , Rohan Yadav , Fredrik Kjolstad