Related papers: Engineering Boolean Matrix Multiplication for Mult…
This paper presents a quantum algorithm that computes the product of two $n\times n$ Boolean matrices in $\tilde O(n\sqrt{\ell}+\ell\sqrt{n})$ time, where $\ell$ is the number of non-zero entries in the product. This improves the previous…
Low-bit quantized neural networks are of great interest in practical applications because they significantly reduce the consumption of both memory and computational resources. Binary neural networks are memory and computationally efficient…
Karppa & Kaski (2019) proposed a novel "broken" or "opportunistic" matrix multiplication algorithm, based on a variant of Strassen's algorithm, and used this to develop new algorithms for Boolean matrix multiplication, among other tasks.…
We describe two algorithms for multiplying n x n matrices using time and energy n^2 polylog(n) under basic models of classical physics. The first algorithm is for multiplying integer-valued matrices, and the second, quite different…
We study the problem of computing matrix chain multiplications in a distributed computing cluster. In such systems, performance is often limited by the straggler problem, where the slowest worker dominates the overall computation latency.…
Matrix multiplication is a fundamental computation in many scientific disciplines. In this paper, we show that novel fast matrix multiplication algorithms can significantly outperform vendor implementations of the classical algorithm and…
The Strassen algorithm and Winograd's variant accelerate matrix multiplication by using fewer arithmetic operations than standard matrix multiplication. Although many papers have been published to accelerate single- as well as…
Multilevel/multigrid methods are among the most popular approaches for solving large sparse linear systems of equations, typically arising from the discretization of partial differential equations. One critical step in the…
Computationally efficient matrix multiplication is a fundamental requirement in various fields, particularly data analytics. To this end, the computation task of a large-scale matrix multiplication is typically outsourced to…
While Strassen's matrix multiplication algorithm reduces the complexity of naive matrix multiplication, general-purpose hardware is not suitable for achieving the algorithm's promised theoretical speedups. This leaves the question of whether it…
Sparse data structures are commonly used in neural networks to reduce the memory footprint. These data structures are compact but cause irregularities such as random memory accesses, which prevent efficient use of the memory hierarchy. GPUs…
We propose a non-commutative algorithm for multiplying 2x2 matrices using 7 coefficient products. This algorithm simultaneously achieves better accuracy in practice than previously known fast algorithms of this kind, and a time complexity…
Sparse Matrix-Matrix multiplication is a key kernel that has applications in several domains such as scientific computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this paper, we…
Specialized function gradient computing hardware could greatly improve the performance of state-of-the-art optimization algorithms, e.g., based on gradient descent or conjugate gradient methods that are at the core of control, machine…
We present a non-commutative algorithm for the multiplication of a 2 x 2 block-matrix by its adjoint, defined by a matrix ring anti-homomorphism. This algorithm uses 5 block products (3 recursive calls and 2 general products) over C or in…
Boolean matrix factorisation aims to decompose a binary data matrix into an approximate Boolean product of two low rank, binary matrices: one containing meaningful patterns, the other quantifying how the observations can be expressed as a…
After Strassen presented the first sub-cubic matrix multiplication algorithm, many Strassen-like algorithms have been presented. Most of those with low asymptotic cost have a large hidden leading coefficient and are thus impractical. To reduce the…
Generalized Sparse Matrix-Matrix Multiplication (SpGEMM) is a ubiquitous task in various engineering and scientific applications. However, inner-product-based SpGEMM introduces redundant input fetches for mismatched nonzero operands, while…
The Boolean product $R = P \cdot Q$ of two $m \times m$ $\{0, 1\}$ matrices is $$R(j,k) = \begin{cases} 1 & \text{if } P(j,t) = Q(t,k) = 1 \text{ for some } t, \\ 0 & \text{otherwise.} \end{cases}$$ The near-optimal design reduces the…
It is known that the multiplication of an $N \times M$ matrix with an $M \times P$ matrix can be performed using fewer multiplications than what the naive $NMP$ approach suggests. The most famous instance of this is Strassen's algorithm for…
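
As a small illustration of the Boolean product that several of the entries above build on, where $R(j,k) = 1$ exactly when some $t$ has $P(j,t) = Q(t,k) = 1$, the following Python sketch computes it directly from the definition. The function name `boolean_product` and the list-of-lists representation are assumptions made for this example; it is the naive cubic-time method, not any of the faster algorithms the listed papers describe.

```python
def boolean_product(P, Q):
    """Naive Boolean product of two m x m 0/1 matrices (lists of lists).

    Hypothetical helper for illustration only: it applies the definition
    directly and runs in O(m^3) time.
    """
    m = len(P)
    R = [[0] * m for _ in range(m)]
    for j in range(m):
        for k in range(m):
            # R[j][k] is 1 if there is some t with P[j][t] = Q[t][k] = 1
            R[j][k] = int(any(P[j][t] and Q[t][k] for t in range(m)))
    return R


P = [[1, 0],
     [0, 0]]
Q = [[0, 1],
     [0, 0]]
print(boolean_product(P, Q))  # [[0, 1], [0, 0]]
```

Only entry $(0, 1)$ of the result is set here, because $t = 0$ is the only index with $P(0, t) = Q(t, 1) = 1$.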