Related papers: Matrix-by-matrix multiplication algorithm with $O(…
Matrix multiplication is a fundamental classical computing operation whose efficiency becomes a major challenge at scale, especially for machine learning applications. Quantum computing, with its inherent parallelism and exponential storage…
It is widely known that any algorithm for square matrix multiplication requires at least $n^2$ arithmetic operations. The justification builds upon the following reasoning: given that there are $2 n^2$…
We show, for the input vectors $(a_0, a_1, ..., a_{n-1})$ and $(b_0, b_1, ..., b_{n-1})$, where $a_i$'s and $b_j$'s are real numbers, after $O(n\log^4 n)$ time preprocessing for each of them, the vector multiplication $(a_0, a_1, ...,…
Hierarchical matrices approximate a given matrix by a decomposition into low-rank submatrices that can be handled efficiently in factorized form. $\mathcal{H}^2$-matrices refine this representation following the ideas of fast multipole…
Algebraic matrix multiplication algorithms are designed by bounding the rank of matrix multiplication tensors, and then using a recursive method. However, designing algorithms in this way quickly leads to large constant factors: if one…
We give an $O(N\cdot \log N\cdot 2^{O(\log^*N)})$ algorithm for multiplying two $N$-bit integers that improves the $O(N\cdot \log N\cdot \log\log N)$ algorithm by Schönhage-Strassen. Both these algorithms use modular arithmetic.…
Let $\alpha$ be the maximal value such that the product of an $n \times n^{\alpha}$ matrix by an $n^{\alpha} \times n$ matrix can be computed with $n^{2+o(1)}$ arithmetic operations. In this paper we show that $\alpha>0.30298$, which improves the previous…
In this paper, we present algorithms to solve matrix multiplication problems in the MPC model. In particular, we consider the problem under various processor/memory constraints in the MPC model and prove the following results. 1.…
Matrix multiplication is a fundamental kernel in high performance computing. Many algorithms for fast matrix multiplication can only be applied to enormous matrices ($n>10^{100}$) and thus cannot be used in practice. Of all algorithms…
Several algorithms have been designed to optimise matrix multiplication. From the schoolbook method with complexity $O(n^3)$ to advanced tensor-based tools with time complexity $O(n^{2.3728639})$ (the lowest bound achieved at the time of writing), a lot of…
It is known that the multiplication of an $N \times M$ matrix with an $M \times P$ matrix can be performed using fewer multiplications than what the naive $NMP$ approach suggests. The most famous instance of this is Strassen's algorithm for…
After Strassen presented the first sub-cubic matrix multiplication algorithm, many Strassen-like algorithms have been presented. Most of those with low asymptotic cost have large hidden leading coefficients and are thus impractical. To reduce the…
We describe two algorithms for multiplying $n \times n$ matrices using time and energy $n^2\,\mathrm{polylog}(n)$ under basic models of classical physics. The first algorithm is for multiplying integer-valued matrices, and the second, quite different…
The Strassen algorithm and Winograd's variant accelerate matrix multiplication by using fewer arithmetic operations than standard matrix multiplication. Although many papers have been published to accelerate single- as well as…
Linear-scaling electronic-structure techniques, also called $O(N)$ techniques, rely heavily on the multiplication of sparse matrices, where the sparsity arises from spatial cut-offs. In order to treat very large systems, the calculations must…
The quest for non-commutative matrix multiplication algorithms in small dimensions has seen many improvements recently. In particular, the number of scalar multiplications required to multiply two $4\times4$ matrices was first…
Karppa & Kaski (2019) proposed a novel "broken" or "opportunistic" matrix multiplication algorithm, based on a variant of Strassen's algorithm, and used this to develop new algorithms for Boolean matrix multiplication, among other tasks.…
In calculating integral or discrete transforms, use has been made of fast algorithms for multiplying vectors by matrices whose elements are specified as values of special (Chebyshev, Legendre, Laguerre, etc.) functions. The currently…
We show that the product of an $n \times 3$ matrix and a $3 \times 3$ matrix over a commutative ring can be computed using $6n+3$ multiplications. For two $3 \times 3$ matrices this gives us an algorithm using 21 multiplications. This is an improvement with respect to…
This paper deals with circulant matrices. It is shown that a circulant matrix can be multiplied by a vector in time $O(n \log n)$ in a ring with roots of unity without making use of an FFT algorithm. With our algorithm we achieve a speedup of…
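
Several of the abstracts above refer to Strassen's algorithm without stating it. For orientation, the following is a minimal Python sketch of the classical $2\times 2$ base case, which uses 7 scalar multiplications instead of the naive 8. The function names `strassen_2x2` and `naive_2x2` are illustrative choices and do not come from any of the listed papers, and the sketch covers only the textbook scheme, not the variants those papers propose.

```python
def naive_2x2(a, b):
    """Reference product of two 2x2 matrices (nested lists), 8 multiplications."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def strassen_2x2(a, b):
    """Product of two 2x2 matrices using Strassen's 7 multiplications."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b

    # The seven products of the classical Strassen scheme.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    # Recombine the products into the four entries of the result.
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(strassen_2x2(a, b))  # [[19, 22], [43, 50]]
    print(naive_2x2(a, b))     # [[19, 22], [43, 50]]
```

Because the scheme never relies on commutativity of the entries, the same formulas apply when the entries are themselves matrix blocks. Applied recursively in that way, they yield the sub-cubic $O(n^{\log_2 7})$ operation count.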