Related papers: An Improved GEF Fast Addition Algorithm
We consider the problem of constructing fast and small parallel prefix adders for non-uniform input arrival times. This problem arises whenever the adder is embedded into a more complex circuit, e. g. a multiplier. Most previous results are…
We present a novel fast bipartitioned hybrid adder (FBHA) that utilizes carry-select and carry-lookahead logic. The proposed FBHA is an accurate adder with a significant part and a less significant part joined together by a carry signal. In…
Many concurrent algorithms require processes to perform fetch-and-add operations on a single memory location, which can be a hot spot of contention. We present a novel algorithm called Aggregating Funnels that reduces this contention by…
Approximate computing has in recent times found significant applications towards lowering power, area, and time requirements for arithmetic operations. Several works done in recent years have furthered approximate computing along these…
This paper describes a new accumulate-and-add multiplication algorithm. The method partitions one of the operands and re-combines the results of computations done with each of the partitions. The resulting design turns-out to be both…
An integer adder for integers in the binary representation is one of the basic operations of any digital processor. For adding two integers of N bits each, the serial adder takes as many clock ticks. For achieving higher speeds, parallel…
A new Combined Sieve algorithm is presented with cost proportional to the number of enumerated factors over a series of intervals. This algorithm achieves a significant speedup, over a traditional sieve, when handling many ([10^4, 10^7])…
The increasing complexity of transformer models in artificial intelligence expands their computational costs, memory usage, and energy consumption. Hardware acceleration tackles the ensuing challenges by designing processors and…
The paper presents a systematic study and implementation of a reconfigurable combinatorial multi-operand adder for use in Deep Learning systems. The size of carry changes with the number of operands and hence a reliable algorithm to…
Sorting algorithms are the deciding factor for the performance of common operations such as removal of duplicates or database sort-merge joins. This work focuses on 32-bit integer keys, optionally paired with a 32-bit value. We present a…
This paper presents efficient algorithms, designed to leverage SIMD for performing Montgomery reductions and additions on integers larger than 512 bits. The existing algorithms encounter inefficiencies when parallelized using SIMD due to…
This paper provides an introduction to the design of augmented data structures that offer an efficient representation of a mathematical sequence and fast sequential summation algorithms, which guarantee both logarithmic running time and…
This paper discusses about a sorting algorithm which uses the concept of buckets where each bucket represents a certain number of digits. A two dimensional data structure is used where one dimension represents buckets i. e; number of digits…
Binary addition is one of the most primitive and most commonly used applications in computer arithmetic. A large variety of algorithms and implementations have been proposed for binary addition. Huey Ling proposed a simpler form of CLA…
We show a new algorithm and its implementation for multiplying bit-polynomials of large degrees. The algorithm is based on evaluating polynomials at a specific set comprising a natural set for evaluation with additive FFT and a high order…
We present an algorithmic contribution to improve the efficiency of robust trim-fitting in outlier affected geometric regression problems. The method heavily relies on the quick sort algorithm, and we present two important insights. First,…
It may be possible to extend the Grover search algorithm by taking a divide and conquer approach using auxiliary solutions to achieve an exponential speed-up.
Fast binary compressors are the main components of many basic digital calculation units. In this paper, a high-speed (7,2) compressor with a fast carry-generation logic is proposed. The carry-generation logic is based on the sorting…
Enhancing the efficiency of iterative computation on graphs has garnered considerable attention in both industry and academia. Nonetheless, the majority of efforts focus on expediting iterative computation by minimizing the running time per…
In this paper we give a fast algorithm to generate all partitions of a positive integer $n$. Integer partitions may be encoded as either ascending or descending compositions for the purposes of systematic generation. It is known that the…