Related papers: A note on the O(n)-storage implementation of the G…
We describe a fast solver for linear systems with reconstructable Cauchy-like structure, which requires O(rn^2) floating point operations and O(rn) memory locations, where n is the size of the matrix and r its displacement rank. The solver…
Given a (possibly approximate) Cauchy matrix, how can we efficiently compute its generators? Expanding on previous work by Liesen and Luce [Linear Algebra Appl. 493 (2016) 261--280], we present a general family of algorithms for Cauchy…
In this paper we describe a parallel Gaussian elimination algorithm for matrices with entries in a finite field. Unlike previous approaches, our algorithm subdivides a very large input matrix into smaller submatrices by subdividing both…
Many matrices that arise in the solution of signal processing problems have a special displacement structure. For example, adaptive filtering and direction-of-arrival estimation yield matrices of Toeplitz type. A recent method of Gohberg,…
We propose an $O(N\cdot M)$ sorting algorithm by Machine Learning method, which shows a huge potential sorting big data. This sorting algorithm can be applied to parallel sorting and is suitable for GPU or TPU acceleration. Furthermore, we…
We introduce a quantum linear system solving algorithm based on the Kaczmarz method, a widely used workhorse for large linear systems and least-squares problems that updates the solution by enforcing one equation at a time. Its simplicity…
Classic cache-oblivious parallel matrix multiplication algorithms achieve optimality either in time or space, but not both, which promotes lots of research on the best possible balance or tradeoff of such algorithms. We study modern…
We present a novel algorithm attaining excessively fast, the sought solution of linear systems of equations. The algorithm is short in its basic formulation and, by definition, vectorized, while the memory allocation demands are trivial,…
Linear-scaling electronic-structure techniques, also called O(N) techniques, rely heavily on the multiplication of sparse matrices, where the sparsity arises from spatial cut-offs. In order to treat very large systems, the calculations must…
We introduce a fast algorithm for entry-wise evaluation of the Gauss-Newton Hessian (GNH) matrix for the fully-connected feed-forward neural network. The algorithm has a precomputation step and a sampling step. While it generally requires…
Perturbing a deterministic $n$-dimensional matrix with small Gaussian noise is a cornerstone of smoothed analysis of algorithms [Spielman and Teng, JACM 2004], as it reduces the condition number of the input to $O(n)$, and with it the…
We propose a novel quantum algorithm for solving linear optimization problems by quantum-mechanical simulation of the central path. While interior point methods follow the central path with an iterative algorithm that works with successive…
We study matrix-matrix multiplication of two matrices, $A$ and $B$, each of size $n \times n$. This operation results in a matrix $C$ of size $n\times n$. Our goal is to produce $C$ as efficiently as possible given a cache: a 1-D limited…
Linear scaling methods, or O(N) methods, have computational and memory requirements which scale linearly with the number of atoms in the system, N, in contrast to standard approaches which scale with the cube of the number of atoms. These…
The fastest known algorithms for dealing with structured matrices, in the sense of the displacement rank measure, are randomized. For handling classical displacement structures, they achieve the complexity bounds…
We develop several efficient algorithms for the classical \emph{Matrix Scaling} problem, which is used in many diverse areas, from preconditioning linear systems to approximation of the permanent. On an input $n\times n$ matrix $A$, this…
Frigo et al. proposed an ideal cache model and a recursive technique to design sequential cache-efficient algorithms in a cache-oblivious fashion. Ballard et al. pointed out that it is a fundamental open problem to extend the technique to…
We propose a new (theoretical) computational model for the study of massive data processing with limited computational resources. Our model measures the complexity of reading the very large data sets in terms of the data size N and analyzes…
The large sparse linear systems arising from the finite element or finite difference discretization of elliptic PDEs can be solved directly via, e.g., nested dissection or multifrontal methods. Such techniques reorder the nodes in the grid…
Deep neural networks (DNNs) require very large amounts of computation both for training and for inference when deployed in the field. A common approach to implementing DNNs is to recast the most computationally expensive operations as…