Related papers: A method for accelerating low precision operations…
We present an algorithm to reduce the computational effort for the multiplication of a given matrix with an unknown column vector. The algorithm decomposes the given matrix into a product of matrices whose entries are either zero or integer…
Machine learning (ML) models are widely used in many important domains. For efficiently processing these computational- and memory-intensive applications, tensors of these over-parameterized models are compressed by leveraging sparsity,…
We propose a penalized likelihood framework for estimating multiple precision matrices from different classes. Most existing methods either incorporate no information on relationships between the precision matrices, or require this…
This article presents a new method to compute matrices from numerical simulations based on the ideas of sparse sampling and compressed sensing. The method is useful for problems where the determination of the entries of a matrix constitutes…
The exponentially growing model size drives the continued success of deep learning, but it brings prohibitive computation and memory cost. From the algorithm perspective, model sparsification and quantization have been studied to alleviate…
This work focuses on accelerating the multiplication of a dense random matrix with a (fixed) sparse matrix, which is frequently used in sketching algorithms. We develop a novel scheme that takes advantage of blocking and recomputation…
We propose a method for strict error control in sparse approximate matrix-matrix multiplication. The method combines an error bound and a parameter sweep to select an appropriate threshold value. The scheme for error control and the sparse…
Sparse matrix multiplication is an important component of linear algebra computations. Implementing sparse matrix multiplication on an associative processor (AP) enables high level of parallelism, where a row of one matrix is multiplied in…
Low-precision computation is often used to lower the time and energy cost of machine learning, and recently hardware accelerators have been developed to support it. Still, it has been used primarily for inference - not training. Previous…
We apply a method recently introduced to the statistical literature to directly estimate the precision matrix from an ensemble of samples drawn from a corresponding Gaussian distribution. Motivated by the observation that cosmological…
Driven by the insatiable needs to process ever larger amount of data with more complex models, modern computer processors and accelerators are beginning to offer half precision floating point arithmetic support, and extremely optimized…
We present new algorithms to detect and correct errors in the product of two matrices, or the inverse of a matrix, over an arbitrary field. Our algorithms do not require any additional information or encoding other than the original inputs…
Tensor accelerators have gained popularity because they provide a cheap and efficient solution for speeding up computational-expensive tasks in Deep Learning and, more recently, in other Scientific Computing applications. However, since…
In recent years, a new kind of accelerated hardware has gained popularity in the Artificial Intelligence (AI) and Machine Learning (ML) communities which enables extremely high-performance tensor contractions in reduced precision for deep…
Matrix completion constantly receives tremendous attention from many research fields. It is commonly applied for recommender systems such as movie ratings, computer vision such as image reconstruction or completion, multi-task learning such…
Recent hardware acceleration advances have enabled powerful specialized accelerators for finite element computations, spiking neural network inference, and sparse tensor operations. However, existing approaches face fundamental limitations:…
Sparse matrix multiplication is an important component of linear algebra computations. In this paper, an architecture based on Content Addressable Memory (CAM) and Resistive Content Addressable Memory (ReCAM) is proposed for accelerating…
As neural network model sizes have dramatically increased, so has the interest in various techniques to reduce their parameter counts and accelerate their execution. An active area of research in this field is sparsity - encouraging zero…
In this letter, we propose an algorithm for recovery of sparse and low rank components of matrices using an iterative method with adaptive thresholding. In each iteration, the low rank and sparse components are obtained using a thresholding…
Network pruning can reduce the high computation cost of deep neural network (DNN) models. However, to maintain their accuracies, sparse models often carry randomly-distributed weights, leading to irregular computations. Consequently, sparse…