Related papers: Automatically Harnessing Sparse Acceleration
The translation of linear algebra computations into efficient sequences of library calls is a non-trivial task that requires expertise in both linear algebra and high-performance computing. Almost all high-level languages and libraries for…
In this paper, we tackle the problem of automatically generating algorithms for linear algebra operations by taking advantage of problem-specific knowledge. In most situations, users possess much more information about the problem at hand…
The level of abstraction at which application experts reason about linear algebra computations and the level of abstraction used by developers of high-performance numerical linear algebra libraries do not match. The former is conveniently…
The standardization of an interface for dense linear algebra operations in the BLAS standard has enabled interoperability between different linear algebra libraries, thereby boosting the success of scientific computing, in particular in…
A major challenge in the deployment of scientific software solutions is the adaptation of research prototypes to production-grade code. While high-level languages like MATLAB are useful for rapid prototyping, they lack the resource…
In the past two decades, some major efforts have been made to reduce exact (e.g. integer, rational, polynomial) linear algebra problems to matrix multiplication in order to provide algorithms with optimal asymptotic complexity. To provide…
This dissertation focuses on the design and the implementation of domain-specific compilers for linear algebra matrix equations. The development of efficient libraries for such equations, which lie at the heart of most software for…
Fast numerical libraries have been a cornerstone of scientific computing for decades, but this comes at a price. Programs may be tied to vendor specific software ecosystems resulting in polluted, non-portable code. As we enter an era of…
Many areas of machine learning and science involve large linear algebra problems, such as eigendecompositions, solving linear systems, computing matrix exponentials, and trace estimation. The matrices involved often have Kronecker,…
Algebraic characterization of logic programs has received increasing attention in recent years. Researchers attempt to exploit connections between linear algebraic computation and symbolic computation in order to perform logical inference…
When implementing functionality which requires sparse matrices, there are numerous storage formats to choose from, each with advantages and disadvantages. To achieve good performance, several formats may need to be used in one program,…
Dedicated tensor accelerators demonstrate the importance of linear algebra in modern applications. Such accelerators have the potential for impressive performance gains, but require programmers to rewrite code using vendor APIs - a barrier…
Sparse matrices are an integral part of scientific simulations. As hardware evolves new sparse matrix storage formats are proposed aiming to exploit optimizations specific to the new hardware. In the era of heterogeneous computing, users…
Optimal use of computing resources requires extensive coding, tuning and benchmarking. To boost developer productivity in these time consuming tasks, we introduce the Experimental Linear Algebra Performance Studies framework (ELAPS), a…
LAPACK and ScaLAPACK are arguably the defacto standard libraries among the scientific community for solving linear algebra problems on sequential, shared-memory and distributed-memory architectures. While ease of use was a major design goal…
Most numerical solvers and libraries nowadays are implemented to use mathematical models created with language-specific built-in data types (e.g. real in Fortran or double in C) and their respective elementary algebra implementations.…
Software packages like TensorFlow and PyTorch are designed to support linear algebra operations, and their speed and usability determine their success. However, by prioritising speed, they often neglect memory requirements. As a…
Fine-tuning LLMs is both computationally and memory-intensive. While parameter-efficient fine-tuning methods, such as QLoRA and DoRA, reduce the number of trainable parameters and lower memory usage, they do not decrease computational cost.…
We present SLinGen, a program generation system for linear algebra. The input to SLinGen is an application expressed mathematically in a linear-algebra-inspired language (LA) that we define. LA provides basic scalar/vector/matrix…
Randomized numerical linear algebra - RandNLA, for short - concerns the use of randomization as a resource to develop improved algorithms for large-scale linear algebra computations. The origins of contemporary RandNLA lay in theoretical…