Related papers: Recursive double-size fixed precision arithmetic

Combined Integer and Variable Precision (CIVP) Floating Point Multiplication Architecture for FPGAs

In this paper, we propose an architecture/methodology for making FPGAs suitable for integer as well as variable precision floating point multiplication. The proposed work will be of great importance in applications which require variable…

Hardware Architecture · Computer Science 2007-11-19 Himanshu Thapliyal , Hamid R. Arabnia , Rajnish Bajpai , Kamal K. Sharma

DPUV3INT8: A Compiler View to programmable FPGA Inference Engines

We have an FPGA design, we make it fast, efficient, and tested for a few important examples. Now we must infer a general solution to deploy in the data center. Here, we describe the FPGA DPUV3INT8 design and our compiler effort. The…

Computation and Language · Computer Science 2021-10-12 Paolo D'Alberto , Jiangsha Ma , Jintao Li , Yiming Hu , Manasa Bollavaram , Shaoxia Fang

Recursive Self-Aggregation Unlocks Deep Thinking in Large Language Models

Test-time scaling methods improve the capabilities of large language models (LLMs) by increasing the amount of compute used during inference to make a prediction. Inference-time compute can be scaled in parallel by choosing among multiple…

Machine Learning · Computer Science 2026-02-25 Siddarth Venkatraman , Vineet Jain , Sarthak Mittal , Vedant Shah , Johan Obando-Ceron , Yoshua Bengio , Brian R. Bartoldson , Bhavya Kailkhura , Guillaume Lajoie , Glen Berseth , Nikolay Malkin , Moksh Jain

Reversible circuit compilation with space constraints

We develop a framework for resource efficient compilation of higher-level programs into lower-level reversible circuits. Our main focus is on optimizing the memory footprint of the resulting reversible networks. This is motivated by the…

Quantum Physics · Physics 2015-10-02 Alex Parent , Martin Roetteler , Krysta M. Svore

Multiprecision Arithmetic for Cryptology in C++ - Compile-Time Computations and Beating the Performance of Hand-Optimized Assembly at Run-Time

We describe a new C++ library for multiprecision arithmetic for numbers on the order of 100–500 bits, i.e., representable with just a few limbs. The library is written in "optimizing-compiler-friendly" C++, with an emphasis on the use of…

Cryptography and Security · Computer Science 2018-04-20 Niek J. Bouman

SIRNN: A Math Library for Secure RNN Inference

Complex machine learning (ML) inference algorithms like recurrent neural networks (RNNs) use standard functions from math libraries like exponentiation, sigmoid, tanh, and reciprocal of square root. Although prior work on secure 2-party…

Cryptography and Security · Computer Science 2021-05-11 Deevashwer Rathee , Mayank Rathee , Rahul Kranti Kiran Goli , Divya Gupta , Rahul Sharma , Nishanth Chandran , Aseem Rastogi

Recursive function templates as a solution of linear algebra expressions in C++

The article deals with a kind of recursive function template in C++, where the recursion is realized through corresponding template parameters to achieve better computational performance. Some specialization of these template functions ends the…

Mathematical Software · Computer Science 2007-05-23 Volodymyr Myrnyy

Memory-Efficient Recursive Evaluation of 3-Center Gaussian Integrals

To improve the efficiency of Gaussian integral evaluation on modern accelerated architectures, FLOP-efficient Obara-Saika-based recursive evaluation schemes are optimized for the memory footprint. For the 3-center 2-particle integrals that…

Computational Physics · Physics 2024-07-30 Andrey Asadchev , Edward F. Valeev

Integral Images: Efficient Algorithms for Their Computation and Storage in Resource-Constrained Embedded Vision Systems

The integral image, an intermediate image representation, has found extensive use in multi-scale local feature detection algorithms, such as Speeded-Up Robust Features (SURF), allowing fast computation of rectangular features at constant…

Computer Vision and Pattern Recognition · Computer Science 2015-10-20 Shoaib Ehsan , Adrian F. Clark , Naveed ur Rehman , Klaus D. McDonald-Maier

Highly Versatile FPGA-Implemented Cyber Coherent Ising Machine

In recent years, quantum Ising machines have drawn a lot of attention, but due to physical implementation constraints, it has been difficult to achieve dense coupling, such as full coupling with sufficient spins to handle practical…

Hardware Architecture · Computer Science 2024-12-10 Toru Aonishi , Tatsuya Nagasawa , Toshiyuki Koizumi , Mastiyage Don Sudeera Hasaranga Gunathilaka , Kazushi Mimura , Masato Okada , Satoshi Kako , Yoshihisa Yamamoto

PSCNN: A 885.86 TOPS/W Programmable SRAM-based Computing-In-Memory Processor for Keyword Spotting

Computing-in-memory (CIM) has attracted significant attention in recent years due to its massive parallelism and low power consumption. However, current CIM designs suffer from large area overhead of small CIM macros and poor programmability…

Hardware Architecture · Computer Science 2022-05-04 Shu-Hung Kuo , Tian-Sheuan Chang

Pushing the Limit: A Hybrid Parallel Implementation of the Multi-resolution Approximation for Massive Data

The multi-resolution approximation (MRA) of Gaussian processes was recently proposed to conduct likelihood-based inference for massive spatial data sets. An advantage of the methodology is that it can be parallelized. We implemented the MRA…

Computation · Statistics 2019-05-07 Huang Huang , Lewis R. Blake , Dorit M. Hammerling

Efficient Floating-Point Arithmetic on Fault-Tolerant Quantum Computers

We propose a novel floating-point encoding scheme that builds on prior work involving fixed-point encodings. We encode floating-point numbers using Two's Complement fixed-point mantissas and Two's Complement integral exponents. We used our…

Quantum Physics · Physics 2025-10-24 José E. Cruz Serrallés , Oluwadara Ogunkoya , Doğa Murat Kürkçüoğlu , Nicholas Bornman , Norm M. Tubman , Anna Grassellino , Silvia Zorzetti , Riccardo Lattanzi

Fast Recursive Coding Based on Grouping of Symbols

A novel fast recursive coding technique is proposed. It operates only on integer values no longer than 8 bits and is multiplication free. The recursion on which the algorithm is based indirectly provides rather effective coding of symbols for very…

Information Theory · Computer Science 2007-08-22 Nikolay Ponomarenko , Vladimir Lukin , Karen Egiazarian , Jaakko Astola , Boris Y Ryabko

LSHR-Net: a hardware-friendly solution for high-resolution computational imaging using a mixed-weights neural network

Recent work showed neural-network-based approaches to reconstructing images from compressively sensed measurements offer significant improvements in accuracy and signal compression. Such methods can dramatically boost the capability of…

Image and Video Processing · Electrical Eng. & Systems 2020-04-29 Fangliang Bai , Jinchao Liu , Xiaojuan Liu , Margarita Osadchy , Chao Wang , Stuart J. Gibson

Converting an Integer to a Decimal String in Under Two Nanoseconds

Converting binary integers to variable-length decimal strings is a fundamental operation in computing. Conventional fast approaches rely on recursive division and small lookup tables. We propose a SIMD-based algorithm that leverages integer…

Data Structures and Algorithms · Computer Science 2026-05-07 Jaël Champagne Gareau , Daniel Lemire

Consistent Distributed Reactive Programming with Retroactive Computation

Context: Many systems require receiving data from multiple information sources, which act as distributed network devices that asynchronously send the latest data at their own pace to generalize various kinds of devices and connections,…

Programming Languages · Computer Science 2025-03-03 Tetsuo Kamina , Tomoyuki Aotani , Hidehiko Masuhara

Recursive Robust PCA or Recursive Sparse Recovery in Large but Structured Noise

This work studies the recursive robust principal components analysis (PCA) problem. Here, "robust" refers to robustness to both independent and correlated sparse outliers. If the outlier is the signal-of-interest, this problem can be…

Information Theory · Computer Science 2014-08-20 Chenlu Qiu , Namrata Vaswani , Brian Lois , Leslie Hogben

Precision-Aware Iterative Algorithms Based on Group-Shared Exponents of Floating-Point Numbers

Iterative solvers are frequently used in scientific applications and engineering computations. However, the memory-bound Sparse Matrix-Vector (SpMV) kernel computation hinders the efficiency of iterative algorithms. As modern hardware…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-11-08 Jianhua Gao , Jiayuan Shen , Yuxiang Zhang , Weixing Ji , Hua Huang

Recursive Inference Scaling: A Winning Path to Scalable Inference in Language and Multimodal Systems

Inspired by recent findings on the fractal geometry of language, we introduce Recursive INference Scaling (RINS) as a complementary, plug-in recipe for scaling inference time in language and multimodal systems. RINS is a particular form of…

Artificial Intelligence · Computer Science 2025-05-09 Ibrahim Alabdulmohsin , Xiaohua Zhai