Related papers: Rectangular Full Packed Format for Cholesky's Algo…

Parallelization and scalability analysis of inverse factorization using the Chunks and Tasks programming model

We present three methods for distributed memory parallel inverse factorization of block-sparse Hermitian positive definite matrices. The three methods are a recursive variant of the AINV inverse Cholesky algorithm, iterative refinement, and…

Numerical Analysis · Mathematics 2024-12-20 Anton G. Artemov, Elias Rudberg, Emanuel H. Rubensson

Rank and run-time aware compression of NLP Applications

Sequence model based NLP applications can be large. Yet, many applications that benefit from them run on small devices with very limited compute and storage capabilities, while still having run-time constraints. As a result, there is a need…

Computation and Language · Computer Science 2020-10-08 Urmish Thakker, Jesse Beu, Dibakar Gope, Ganesh Dasika, Matthew Mattina

Exact Matrix Factorization Updates for Nonlinear Programming

LU and Cholesky matrix factorization algorithms are core subroutines used to solve systems of linear equations (SLEs) encountered while solving an optimization problem. Standard factorization algorithms are highly efficient but remain…

Numerical Analysis · Mathematics 2022-07-25 Adolfo R. Escobedo

Parallel Cholesky Factorization for Banded Matrices using OpenMP Tasks

Cholesky factorization is a widely used method for solving linear systems involving symmetric, positive-definite matrices, and can be an attractive choice in applications where a high degree of numerical stability is needed. One such…

Numerical Analysis · Mathematics 2023-05-09 Felix Liu, Albin Fredriksson, Stefano Markidis

LAW: A Tool for Improved Productivity with High-Performance Linear Algebra Codes. Design and Applications

LAPACK and ScaLAPACK are arguably the de facto standard libraries among the scientific community for solving linear algebra problems on sequential, shared-memory and distributed-memory architectures. While ease of use was a major design goal…

Computational Physics · Physics 2007-10-26 Timothy Stitt, Graham Kells, Jiri Vala

Some new techniques to use in serial sparse Cholesky factorization algorithms

We present a new variant of serial right-looking supernodal sparse Cholesky factorization (RL). Our comparison of RL with the multifrontal method confirms that RL is simpler, slightly faster, and requires slightly less storage. The key to…

Mathematical Software · Computer Science 2024-09-23 M. Ozan Karsavuran, Esmond G. Ng, Barry W. Peyton, Jonathan L. Peyton

P3DFFT: a framework for parallel computations of Fourier transforms in three dimensions

Fourier and related transforms is a family of algorithms widely employed in diverse areas of computational science, notoriously difficult to scale on high-performance parallel computers with large number of processing elements (cores). This…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-05-09 Dmitry Pekurovsky

Compressed Matrix Computations

Frugal computing is becoming an important topic for environmental reasons. In this context, several techniques have been proposed to reduce the storage of scientific data by dedicated compression methods specially tailored for arrays of…

Data Structures and Algorithms · Computer Science 2022-03-01 Matthieu Martel

Compressed Vertical Partitioning for Full-In-Memory RDF Management

The Web of Data has been gaining momentum and this leads to increasingly publish more semi-structured datasets following the RDF model, based on atomic triple units of subject, predicate, and object. Although it is a simple model,…

Databases · Computer Science 2013-10-22 Sandra Álvarez-García, Nieves R. Brisaboa, Javier D. Fernández, Miguel A. Martínez-Prieto, Gonzalo Navarro

Programming Parallel Dense Matrix Factorizations with Look-Ahead and OpenMP

We investigate a parallelization strategy for dense matrix factorization (DMF) algorithms, using OpenMP, that departs from the legacy (or conventional) solution, which simply extracts concurrency from a multithreaded version of BLAS. This…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-04-20 Sandra Catalán, Adrián Castelló, Francisco D. Igual, Rafael Rodríguez-Sánchez, Enrique S. Quintana-Ortí

An Asynchronous Task-based Fan-Both Sparse Cholesky Solver

Systems of linear equations arise at the heart of many scientific and engineering applications. Many of these linear systems are sparse; i.e., most of the elements in the coefficient matrix are zero. Direct methods based on matrix…

Mathematical Software · Computer Science 2016-08-24 Mathias Jacquelin, Yili Zheng, Esmond Ng, Katherine Yelick

Performant Tridiagonal Factorization of Skew-Symmetric Matrices

The factorization of skew-symmetric matrices is a critically understudied area of dense linear algebra, particularly in comparison to that of general and symmetric matrices. While some algorithms can be adapted from the symmetric case, the…

Mathematical Software · Computer Science 2026-05-06 Ishna Satyarth, Chao Yin, Devin A. Matthews, Maggie Myers, Robert van de Geijn, RuQing G. Xu

Asymmetric Multiresolution Matrix Factorization

Multiresolution Matrix Factorization (MMF) was recently introduced as an alternative to the dominant low-rank paradigm in order to capture structure in matrices at multiple different scales. Using ideas from multiresolution analysis (MRA),…

Numerical Analysis · Mathematics 2019-10-14 Pramod Kaushik Mudrakarta, Shubhendu Trivedi, Risi Kondor

A Transprecision Floating-Point Platform for Ultra-Low Power Computing

In modern low-power embedded platforms, floating-point (FP) operations emerge as a major contributor to the energy consumption of compute-intensive applications with large dynamic range. Experimental evidence shows that 50% of the energy…

Hardware Architecture · Computer Science 2017-11-29 Giuseppe Tagliavini, Stefan Mach, Davide Rossi, Andrea Marongiu, Luca Benini

Task Parallel Incomplete Cholesky Factorization using 2D Partitioned-Block Layout

We introduce a task-parallel algorithm for sparse incomplete Cholesky factorization that utilizes a 2D sparse partitioned-block layout of a matrix. Our factorization algorithm follows the idea of algorithms-by-blocks by using the block…

Mathematical Software · Computer Science 2016-01-26 Kyungjoo Kim, Sivasankaran Rajamanickam, George Stelle, H. Carter Edwards, Stephen L. Olivier

Federated Matrix Factorization: Algorithm Design and Application to Data Clustering

Recent demands on data privacy have called for federated learning (FL) as a new distributed learning paradigm in massive and heterogeneous networks. Although many FL algorithms have been proposed, few of them have considered the matrix…

Machine Learning · Computer Science 2020-11-02 Shuai Wang, Tsung-Hui Chang

Differentiation of the Cholesky decomposition

We review strategies for differentiating matrix-based computations, and derive symbolic and algorithmic update rules for differentiating expressions containing the Cholesky decomposition. We recommend new 'blocked' algorithms, based on…

Computation · Statistics 2016-02-25 Iain Murray

Accelerated Parallel and Distributed Algorithm using Limited Internal Memory for Nonnegative Matrix Factorization

Nonnegative matrix factorization (NMF) is a powerful technique for dimension reduction, extracting latent factors and learning part-based representation. For large datasets, NMF performance depends on some major issues: fast algorithms,…

Optimization and Control · Mathematics 2015-07-01 Duy-Khuong Nguyen, Tu-Bao Ho

Recovery of damped exponentials using structured low rank matrix completion

We introduce a structured low rank matrix completion algorithm to recover a series of images from their under-sampled measurements, where the signal along the parameter dimension at every pixel is described by a linear combination of…

Computer Vision and Pattern Recognition · Computer Science 2017-07-13 Arvind Balachandrasekaran, Vincent Magnotta, Mathews Jacob

Advancing Matrix Completion by Modeling Extra Structures beyond Low-Rankness

A well-known method for completing low-rank matrices based on convex optimization has been established by Candès and Recht. Although theoretically complete, the method may not entirely solve the low-rank matrix completion problem. This…

Methodology · Statistics 2014-07-17 Guangcan Liu, Ping Li