Related papers: Performance Modeling for Dense Linear Algebra

Cache-aware Performance Modeling and Prediction for Dense Linear Algebra

Countless applications cast their computational core in terms of dense linear algebra operations. These operations can usually be implemented by combining the routines offered by standard linear algebra libraries such as BLAS and LAPACK,…

Performance · Computer Science 2014-10-01 Elmar Peise, Paolo Bientinesi

Performance Modeling and Prediction for Dense Linear Algebra

This dissertation introduces measurement-based performance modeling and prediction techniques for dense linear algebra algorithms. As a core principle, these techniques avoid executions of such algorithms entirely, and instead predict their…

Performance · Computer Science 2017-06-06 Elmar Peise

Hierarchical Performance Modeling for Ranking Dense Linear Algebra Algorithms

A large class of dense linear algebra operations, such as LU decomposition or inversion of a triangular matrix, are usually performed by blocked algorithms. For one such operation, typically, not only one but many algorithmic variants…

Performance · Computer Science 2012-08-28 Elmar Peise

Algorithm 979: Recursive Algorithms for Dense Linear Algebra -- The ReLAPACK Collection

To exploit both memory locality and the full performance potential of highly tuned kernels, dense linear algebra libraries such as LAPACK commonly implement operations as blocked algorithms. However, to achieve next-to-optimal performance…

Mathematical Software · Computer Science 2022-04-08 Elmar Peise, Paolo Bientinesi

Deriving Correct High-Performance Algorithms

Dijkstra observed that verifying correctness of a program is difficult and conjectured that derivation of a program hand-in-hand with its proof of correctness was the answer. We illustrate this goal-oriented approach by applying it to the…

Mathematical Software · Computer Science 2017-10-13 Devangi N. Parikh, Maggie E. Myers, Robert A. van de Geijn

The ELAPS Framework: Experimental Linear Algebra Performance Studies

Optimal use of computing resources requires extensive coding, tuning and benchmarking. To boost developer productivity in these time consuming tasks, we introduce the Experimental Linear Algebra Performance Studies framework (ELAPS), a…

Performance · Computer Science 2015-05-01 Elmar Peise, Paolo Bientinesi

LAGraph: Linear Algebra, Network Analysis Libraries, and the Study of Graph Algorithms

Graph algorithms can be expressed in terms of linear algebra. GraphBLAS is a library of low-level building blocks for such algorithms that targets algorithm developers. LAGraph builds on top of the GraphBLAS to target users of graph…

Mathematical Software · Computer Science 2021-04-06 Gábor Szárnyas, David A. Bader, Timothy A. Davis, James Kitchen, Timothy G. Mattson, Scott McMillan, Erik Welch

Dense Linear Algebra over Finite Fields: the FFLAS and FFPACK packages

In the past two decades, some major efforts have been made to reduce exact (e.g. integer, rational, polynomial) linear algebra problems to matrix multiplication in order to provide algorithms with optimal asymptotic complexity. To provide…

Symbolic Computation · Computer Science 2009-01-14 Jean-Guillaume Dumas, Pascal Giorgi, Clément Pernet

A model-driven approach for a new generation of adaptive libraries

Efficient high-performance libraries often expose multiple tunable parameters to provide highly optimized routines. These can range from simple loop unroll factors or vector sizes all the way to algorithmic changes, given that some…

Performance · Computer Science 2022-02-22 Marco Cianfriglia, Flavio Vella, Cedric Nugteren, Anton Lokhmotov, Grigori Fursin

Program Generation for Linear Algebra Using Multiple Layers of DSLs

Numerical software in computational science and engineering often relies on highly-optimized building blocks from libraries such as BLAS and LAPACK, and while such libraries provide portable performance for a wide range of computing…

Mathematical Software · Computer Science 2019-06-21 Daniele G. Spampinato, Diego Fabregat-Traver, Markus Püschel, Paolo Bientinesi

A Study on the Influence of Caching: Sequences of Dense Linear Algebra Kernels

It is universally known that caching is critical to attain high-performance implementations: In many situations, data locality (in space and time) plays a bigger role than optimizing the (number of) arithmetic floating point operations. In…

Mathematical Software · Computer Science 2014-02-25 Elmar Peise, Paolo Bientinesi

Towards scalable pattern-based optimization for dense linear algebra

Linear algebraic expressions are the essence of many computationally intensive problems, including scientific simulations and machine learning applications. However, translating high-level formulations of these expressions to efficient…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-03-22 Dániel Berényi, András Leitereg, Gábor Lehel

Multi-Threaded Dense Linear Algebra Libraries for Low-Power Asymmetric Multicore Processors

Dense linear algebra libraries, such as BLAS and LAPACK, provide a relevant collection of numerical tools for many scientific and engineering applications. While there exist high performance implementations of the BLAS (and LAPACK)…

Mathematical Software · Computer Science 2015-11-09 Sandra Catalán, José R. Herrero, Francisco D. Igual, Rafael Rodríguez-Sánchez, Enrique S. Quintana-Ortí

A New Efficient Algorithm for Construction of LLS Models

We present a new efficient algorithm for construction of linear latent structure (LLS) models. This algorithm reduces a problem of estimation of model parameters to a sequence of problems of linear algebra, which assures a low…

Probability · Mathematics 2007-06-13 Mikhail Kovtun, Igor Akushevich, Kenneth G. Manton, H. Dennis Tolley

Deinsum: Practically I/O Optimal Multilinear Algebra

Multilinear algebra kernel performance on modern massively-parallel systems is determined mainly by data movement. However, deriving data movement-optimal distributed schedules for programs with many high-dimensional inputs is a notoriously…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-06-17 Alexandros Nikolaos Ziogas, Grzegorz Kwasniewski, Tal Ben-Nun, Timo Schneider, Torsten Hoefler

Ranking Algorithms by Performance

A common way of doing algorithm selection is to train a machine learning model and predict the best algorithm from a portfolio to solve a particular problem. While this method has been highly successful, choosing only a single algorithm has…

Artificial Intelligence · Computer Science 2013-11-19 Lars Kotthoff

Towards black-box parameter estimation

Deep learning algorithms have recently been shown to be a successful tool in estimating parameters of statistical models for which simulation is easy, but likelihood computation is challenging. But the success of these approaches depends on…

Machine Learning · Statistics 2024-02-20 Amanda Lenzi, Haavard Rue

Personalizing Performance Regression Models to Black-Box Optimization Problems

Accurately predicting the performance of different optimization algorithms for previously unseen problem instances is crucial for high-performing algorithm selection and configuration techniques. In the context of numerical optimization,…

Neural and Evolutionary Computing · Computer Science 2021-04-23 Tome Eftimov, Anja Jankovic, Gorjan Popovski, Carola Doerr, Peter Korošec

Co-Design of the Dense Linear Algebra Software Stack for Multicore Processors

This paper advocates for an intertwined design of the dense linear algebra software stack that breaks down the strict barriers between the high-level, blocked algorithms in LAPACK (Linear Algebra PACKage) and the low-level,…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-05-01 Héctor Martínez, Sandra Catalán, Francisco D. Igual, José R. Herrero, Rafael Rodríguez-Sánchez, Enrique S. Quintana-Ortí

Automating the Last-Mile for High Performance Dense Linear Algebra

High performance dense linear algebra (DLA) libraries often rely on a general matrix multiply (Gemm) kernel that is implemented using assembly or with vector intrinsics. In particular, the real-valued Gemm kernels provide the overwhelming…

Mathematical Software · Computer Science 2017-05-01 Richard Michael Veras, Tze Meng Low, Tyler Michael Smith, Robert van de Geijn, Franz Franchetti