Related papers: Compressed Modular Matrix Multiplication

Simultaneous Modular Reduction and Kronecker Substitution for Small Finite Fields

We present algorithms to perform modular polynomial multiplication or modular dot product efficiently in a single machine word. We pack polynomials into integers and perform several modular operations with machine integer or floating point…

Symbolic Computation · Computer Science 2013-06-19 Jean-Guillaume Dumas, Laurent Fousse, Bruno Salvy

Data Compression with Prime Numbers

A compression algorithm is presented that uses the set of prime numbers. Sequences of numbers are correlated with the prime numbers, and labeled with the integers. The algorithm can be iterated on data sets, generating factors of doubles on…

General Physics · Physics 2007-05-23 Gordon Chalmers

Compressed Matrix Computations

Frugal computing is becoming an important topic for environmental reasons. In this context, several techniques have been proposed to reduce the storage of scientific data by dedicated compression methods specially tailored for arrays of…

Data Structures and Algorithms · Computer Science 2022-03-01 Matthieu Martel

Improving Matrix-vector Multiplication via Lossless Grammar-Compressed Matrices

As nowadays Machine Learning (ML) techniques are generating huge data collections, the problem of how to efficiently engineer their storage and operations is becoming of paramount importance. In this article we propose a new lossless…

Data Structures and Algorithms · Computer Science 2022-03-31 Paolo Ferragina, Travis Gagie, Dominik Köppl, Giovanni Manzini, Gonzalo Navarro, Manuel Striani, Francesco Tosoni

Multiword matrix multiplication over large finite fields in floating-point arithmetic

This article is concerned with the efficient computation of modular matrix multiplication C=AB mod p, a key kernel in computer algebra. We focus on floating-point arithmetic, which allows for using efficient matrix multiplication libraries.…

Numerical Analysis · Mathematics 2026-02-05 Jérémy Berthomieu, Stef Graillat, Dimitri Lesnoff, Theo Mary

Near Quadratic Matrix Multiplication Modulo Composites

We show how one can use non-prime-power, composite moduli for computing representations of the product of two $n\times n$ matrices using only $n^{2+o(1)}$ multiplications.

Computational Complexity · Computer Science 2007-05-23 Vince Grolmusz

Distributed Matrix Multiplication with a Smaller Recovery Threshold through Modulo-based Approaches

This paper considers the problem of calculating the matrix multiplication of two massive matrices $\mathbf{A}$ and $\mathbf{B}$ distributedly. We provide a modulo technique that can be applied to coded distributed matrix multiplication…

Information Theory · Computer Science 2023-09-20 Zhiquan Tan, Dingli Yuan, Zihao Wang, Zhongyi Huang

Common Subexpression-based Compression and Multiplication of Sparse Constant Matrices

In deep learning inference, model parameters are pruned and quantized to reduce the model size. Compression methods and common subexpression (CSE) elimination algorithms are applied on sparse constant matrices to deploy the models on…

Machine Learning · Computer Science 2023-03-29 Emre Bilgili, Arda Yurdakul

Efficient approximations of matrix multiplication using truncated decompositions

We exploit the truncated singular value decomposition and the recently proposed circulant decomposition for an efficient first-order approximation of the multiplication of large dense matrices. A decomposition of each matrix into a sum of a…

Numerical Analysis · Mathematics 2026-04-27 Suvendu Kar, Hariprasad M., Sai Gowri J. N., Murugesan Venkatapathi

Computable Compressed Matrices

The biggest cost of computing with large matrices in any modern computer is related to memory latency and bandwidth. The average latency of modern RAM reads is 150 times greater than a clock step of the processor. Throughput is a little…

Data Structures and Algorithms · Computer Science 2013-03-04 Crysttian Arantes Paixão, Flávio Codeço Coelho

Floating Point Compression of Hierarchical Matrix Formats and its Impact on Matrix-Vector Multiplication

Matrix-vector multiplication forms the basis of many iterative solution algorithms and as such is an important algorithm also for hierarchical matrices which are used to represent dense data in an optimized form by applying low-rank…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-01-30 Ronald Kriemann

On the Inversion Modulo a Power of an Integer

Recently, Koç proposed a neat and efficient algorithm for computing $x = a^{-1} \pmod{p^k}$ for a prime $p$ based on the exact solution of linear equations using $p$-adic expansions. The algorithm requires only addition and right…

Data Structures and Algorithms · Computer Science 2026-03-13 Guangwu Xu, Yunxiao Tian, Bingxin Yang

Superposition of many models into one

We present a method for storing multiple models within a single set of parameters. Models can coexist in superposition and still be retrieved individually. In experiments with neural networks, we show that a surprisingly large number of…

Machine Learning · Computer Science 2019-06-18 Brian Cheung, Alex Terekhov, Yubei Chen, Pulkit Agrawal, Bruno Olshausen

Modular SIMD arithmetic in Mathemagix

Modular integer arithmetic occurs in many algorithms for computer algebra, cryptography, and error correcting codes. Although recent microprocessors typically offer a wide range of highly optimized arithmetic functions, modular integer…

Mathematical Software · Computer Science 2014-07-15 Joris van der Hoeven, Grégoire Lecerf, Guillaume Quintin

Modular Multiplication without Carry Propagation (Algorithm Description)

This paper describes a sufficiently simple modular multiplication algorithm, which uses only carry-save addition with bit inspection Boolean logic and without number comparison or carry propagation.

Data Structures and Algorithms · Computer Science 2022-08-01 Oleg Mazonka

Memory-efficient compression of $\mathcal{DH}^2$-matrices for high-frequency problems

Directional interpolation is a fast and efficient compression technique for high-frequency Helmholtz boundary integral equations, but it requires a very large amount of storage in its original form. Algebraic recompression can significantly…

Numerical Analysis · Mathematics 2023-10-23 Steffen Börm, Janne Henningsen

On Newton-Raphson iteration for multiplicative inverses modulo prime powers

We study algorithms for the fast computation of modular inverses. Newton-Raphson iteration over $p$-adic numbers gives a recurrence relation computing modular inverse modulo $p^m$, that is logarithmic in $m$. We solve the recurrence to…

Symbolic Computation · Computer Science 2019-04-22 Jean-Guillaume Dumas

Compressing Word Embeddings

Recent methods for learning vector space representations of words have succeeded in capturing fine-grained semantic and syntactic regularities using vector arithmetic. However, these vector space representations (created through large-scale…

Computation and Language · Computer Science 2016-05-17 Martin Andrews

Improving Word Embedding Factorization for Compression Using Distilled Nonlinear Neural Decomposition

Word embeddings are vital components of Natural Language Processing (NLP) models and have been extensively explored. However, they consume a lot of memory, which poses a challenge for edge deployment. Embedding matrices, typically, contain…

Computation and Language · Computer Science 2020-11-12 Vasileios Lioutas, Ahmad Rashid, Krtin Kumar, Md Akmal Haidar, Mehdi Rezagholizadeh

Operand Folding Hardware Multipliers

This paper describes a new accumulate-and-add multiplication algorithm. The method partitions one of the operands and re-combines the results of computations done with each of the partitions. The resulting design turns out to be both…

Mathematical Software · Computer Science 2011-04-11 Byungchun Chung, Sandra Marcello, Amir-Pasha Mirbaha, David Naccache, Karim Sabeg