Related papers: Fast Parallel Integer Adder in Binary Representati…
Today's PCs can directly manipulate numbers not longer than 64 bits because the size of the CPU registers and the data-path are limited. Consequently, arithmetic operations such as addition, can only be performed on numbers of that length.…
We consider the problem of constructing fast and small parallel prefix adders for non-uniform input arrival times. This problem arises whenever the adder is embedded into a more complex circuit, e. g. a multiplier. Most previous results are…
We present efficient circuits for the addition of binary numbers. We assume that we are given arrival times for all input bits and optimize the delay of the circuits, i.e.\ the time when the last output bit is computed. This contains the…
Optimization techniques for decreasing the time and area of adder circuits have been extensively studied for years mostly in binary logic system. In this paper, we provide the necessary equations required to design a full adder in…
The paper presents a systematic study and implementation of a reconfigurable combinatorial multi-operand adder for use in Deep Learning systems. The size of carry changes with the number of operands and hence a reliable algorithm to…
Binary addition is one of the most primitive and most commonly used applications in computer arithmetic. A large variety of algorithms and implementations have been proposed for binary addition. Huey Ling proposed a simpler form of CLA…
We study parallel algorithms for addition of numbers having finite representation in a positional numeration system defined by a base $\beta$ in $\mathbb{C}$ and a finite digit set $\mathcal{A}$ of contiguous integers containing $0$. For a…
Many emerging computer applications require the processing of large numbers, larger than what a CPU can handle. In fact, the top of the line PCs can only manipulate numbers not longer than 32 bits or 64 bits. This is due to the size of the…
This paper presents efficient algorithms, designed to leverage SIMD for performing Montgomery reductions and additions on integers larger than 512 bits. The existing algorithms encounter inefficiencies when parallelized using SIMD due to…
The implementation of a quaternary 1-digit adder composed of a 2-bit binary adder, quaternary to binary decoders and binary to quaternary encoders is compared with several recent implementations of quaternary adders. This simple…
We design and implement parallel prefix sum (scan) algorithms using Ascend AI accelerators. Ascend accelerators feature specialized computing units: the cube units for efficient matrix multiplication and the vector units for optimized…
We find a succinct expression for computing the sequence $x_t = a_t x_{t-1} + b_t$ in parallel with two prefix sums, given $t = (1, 2, \dots, n)$, $a_t \in \mathbb{R}^n$, $b_t \in \mathbb{R}^n$, and initial value $x_0 \in \mathbb{R}$. On…
Let $\{x_1, x_2, ..., x_n\}$ be a vector of real numbers. An integer relation algorithm is a computational scheme to find the $n$ integers $a_k$, if they exist, such that $a_1 x_1 + a_2 x_2 + ... + a_n x_n= 0$. In the past few years,…
The quantum adder is an essential attribute of a quantum computer, just as classical adder is needed for operation of a digital computer. We model the quantum full adder as a realistic complex algorithm on a large number of qubits in an…
Many parallel algorithms which solve basic problems in computer science use auxiliary space linear in the input to facilitate conflict-free computation. There has been significant work on improving these parallel algorithms to be in-place,…
Fast binary compressors are the main components of many basic digital calculation units. In this paper, a high-speed (7,2) compressor with a fast carry-generation logic is proposed. The carry-generation logic is based on the sorting…
Quantum computers require quantum processors. An important part of the processor of any computer is the arithmetic unit, which performs binary addition, subtraction, division and multiplication, however multiplication can be performed using…
Converting binary integers to variable-length decimal strings is a fundamental operation in computing. Conventional fast approaches rely on recursive division and small lookup tables. We propose a SIMD-based algorithm that leverages integer…
In this work faster unsigned multiplication has been achieved by using a combination of High Performance Multiplication [HPM] column reduction technique and implementing a N-bit multiplier using 4 N/2-bit multipliers (recursive…
There are three main types of numerical computations for the Bessel function of the second kind: series expansion, continued fraction, and asymptotic expansion. In addition, they are combined in the appropriate domain for each. However,…