English
Related papers

Related papers: Implementing Strassen's Algorithm with BLIS

200 papers

Matrix multiplication is a cornerstone operation in a wide array of scientific fields, including machine learning and computer graphics. The standard algorithm for matrix multiplication has a complexity of $\mathcal{O}(n^3)$ for $n\times n$…

Hardware Architecture · Computer Science 2024-06-05 Afzal Ahmad , Linfeng Du , Wei Zhang

Matrix multiplication (GEMM) is a core operation to numerous scientific applications. Traditional implementations of Strassen-like fast matrix multiplication (FMM) algorithms often do not perform well except for very large matrix sizes, due…

Mathematical Software · Computer Science 2016-11-04 Jianyu Huang , Leslie Rice , Devin A. Matthews , Robert A. van de Geijn

Matrix multiplication is a fundamental computation in many scientific disciplines. In this paper, we show that novel fast matrix multiplication algorithms can significantly outperform vendor implementations of the classical algorithm and…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-01-08 Austin R. Benson , Grey Ballard

Matrix libraries often focus on achieving high performance for problems considered to be either "small" or "large", as these two scenarios tend to respond best to different optimization strategies. We propose a unified technique for…

Mathematical Software · Computer Science 2023-02-20 RuQing G. Xu , Field G. Van Zee , Robert A. van de Geijn

Conventional GPU implementations of Strassen's algorithm (Strassen) typically rely on the existing high-performance matrix multiplication (GEMM), trading space for time. As a result, such approaches can only achieve practical speedup for…

Mathematical Software · Computer Science 2018-08-27 Jianyu Huang , Chenhan D. Yu , Robert A. van de Geijn

While Strassen's matrix multiplication algorithm reduces the complexity of naive matrix multiplication, general-purpose hardware is not suitable for achieving the algorithm's promised theoretical speedups. This leaves the question of if it…

Hardware Architecture · Computer Science 2025-02-17 Trevor E. Pogue , Nicola Nicolici

Matrix-matrix multiplication is a fundamental operation of great importance to scientific computing and, increasingly, machine learning. It is a simple enough concept to be introduced in a typical high school algebra course yet in practice…

Mathematical Software · Computer Science 2016-09-02 Jianyu Huang , Robert A. van de Geijn

Recently, reinforcement algorithms discovered new algorithms that really jump-started a wave of excitements and a flourishing of publications. However, there is little on implementations, applications, and, especially, no absolute…

Mathematical Software · Computer Science 2023-12-21 Paolo D'Alberto

We approach the problem of implementing mixed-datatype support within the general matrix multiplication (GEMM) operation of the BLIS framework, whereby each matrix operand A, B, and C may be stored as single- or double-precision real or…

Mathematical Software · Computer Science 2019-05-03 Field G. Van Zee , Devangi N. Parikh , Robert A. van de Geijn

Sparse Matrix-Matrix multiplication is a key kernel that has applications in several domains such as scientific computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this paper, we…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-01-10 Mehmet Deveci , Christian Trott , Sivasankaran Rajamanickam

Generalized sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many high performance graph algorithms as well as for some linear solvers, such as algebraic multigrid. Here we show that SpGEMM also yields efficient…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-03-19 Aydin Buluc , John Gilbert

Sparse generalized matrix-matrix multiplication (SpGEMM) is a fundamental operation for real-world network analysis. With the increasing size of real-world networks, the single-machine-based SpGEMM approach cannot perform SpGEMM on…

Data Structures and Algorithms · Computer Science 2023-08-29 Myung-Hwan Jang , Yunyong Ko , Hyuck-Moo Gwon , Ikhyeon Jo , Yongjun Park , Sang-Wook Kim

The devices designed for the Internet-of-Things encompass a large variety of distinct processor architectures, forming a highly heterogeneous zoo. In order to tackle this, we employ a simulator to estimate the performance of the…

Hardware Architecture · Computer Science 2024-03-13 Cristian Ramírez , Adrián Castelló , Héctor Martínez , Enrique S. Quintana-Ortí

Sparse matrix multiplication is traditionally performed in memory and scales to large matrices using the distributed memory of multiple nodes. In contrast, we scale sparse matrix multiplication beyond memory capacity by implementing sparse…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-11-15 Da Zheng , Disa Mhembere , Vince Lyzinski , Joshua Vogelstein , Carey E. Priebe , Randal Burns

In computational science and data analytics, many workloads involve irregular and sparse computations that are inherently difficult to optimize for modern hardware. A key kernel is Sparse General Matrix-Matrix Multiplication (SpGEMM), which…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-22 Yifan Li , Giulia Guidi

Multiplying two sparse matrices (SpGEMM) is a common computational primitive used in many areas including graph algorithms, bioinformatics, algebraic multigrid solvers, and randomized sketching. Distributed-memory parallel algorithms for…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-08-28 Yuxi Hong , Aydin Buluc

In this study, we propose a simple method for fault-tolerant Strassen-like matrix multiplications. The proposed method is based on using two distinct Strassen-like algorithms instead of replicating a given one. We have realized that using…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-10-11 Osman B. Guney , Suayb S. Arslan

General matrix-matrix multiplications with double-precision real and complex entries (DGEMM and ZGEMM) in vendor-supplied BLAS libraries are best optimized for square matrices but often show bad performance for tall & skinny matrices, which…

Mathematical Software · Computer Science 2020-06-25 Dominik Ernst , Georg Hager , Jonas Thies , Gerhard Wellein

The Strassen algorithm and Winograd's variant accelerate matrix multiplication by using fewer arithmetic operations than standard matrix multiplication. Although many papers have been published to accelerate single- as well as…

Numerical Analysis · Mathematics 2015-10-27 Tomonori Kouya

A large fraction of the arithmetic operations required to evaluate deep neural networks (DNNs) consists of matrix multiplications, in both convolution and fully connected layers. We perform end-to-end learning of low-cost approximations of…

Machine Learning · Computer Science 2018-06-11 Michael Tschannen , Aran Khanna , Anima Anandkumar
‹ Prev 1 2 3 10 Next ›