Related papers: Efficient Multi-Cycle Folded Integer Multipliers

200 papers

Today every circuit has to face the power consumption issue, both for portable devices aiming at long battery life and for high-end circuits avoiding overly complex cooling packages and reliability issues. It is generally accepted that…

Hardware Architecture · Computer Science 2010-07-15 C. N. Marimuthu , P. Thangaraj , Aswathy Ramesan

The data transfer between a processor and memory has become a design bottleneck in data-intensive applications. Processing-In-Memory (PIM) is a practical approach to overcome the memory wall bottleneck. The 4:2 compressor is suitable for…

Emerging Technologies · Computer Science 2024-07-16 Bahareh Bagheralmoosavi , Seyed Erfan Fatemieh , Mohammad Reza Reshadinezhad , Antonio Rubio

Multiple Constant Multiplication (MCM) over integers is a frequent operation arising in embedded systems that require highly optimized hardware. An efficient way is to replace costly generic multiplication by bit-shifts and additions, i.e.…

Hardware Architecture · Computer Science 2022-10-11 Rémi Garcia , Anastasia Volkova

Matrix multiplication is the dominant computation during Machine Learning (ML) inference. To efficiently perform such multiplication operations, Compute-in-memory (CiM) paradigms have emerged as a highly energy efficient solution. However,…

Hardware Architecture · Computer Science 2025-03-03 Tanvi Sharma , Mustafa Ali , Indranil Chakraborty , Kaushik Roy

Memristive Processing In-Memory (PIM) is one of the promising techniques for overcoming the von Neumann bottleneck. Reduction of data transfer between processor and memory and data processing by memristors in data-intensive applications…

Emerging Technologies · Computer Science 2024-10-15 Seyed Erfan Fatemieh , Bahareh Bagheralmoosavi , Mohammad Reza Reshadinezhad

Elliptic curve cryptography (ECC) has emerged as the dominant public-key protocol, with NIST standardizing parameters for binary field GF(2^m) ECC systems. This work presents a hardware implementation of a Hybrid Multiplication technique…

Cryptography and Security · Computer Science 2025-06-25 Ruby Kumari , Gaurav Purohit , Abhijit Karmakar

In recent years, various computing-in-memory (CIM) processors have been presented, showing superior performance over traditional architectures. To unleash the potential of various CIM architectures, such as device precision, crossbar size,…

Hardware Architecture · Computer Science 2024-05-09 Songyun Qu , Shixin Zhao , Bing Li , Yintao He , Xuyi Cai , Lei Zhang , Ying Wang

Given the stringent requirements of energy efficiency for Internet-of-Things edge devices, approximate multipliers, as a basic component of many processors and accelerators, have been constantly proposed and studied for decades, especially…

Hardware Architecture · Computer Science 2023-06-30 Ying Wu , Chuangtao Chen , Weihua Xiao , Xuan Wang , Chenyi Wen , Jie Han , Xunzhao Yin , Weikang Qian , Cheng Zhuo

Computing-in-Memory (CIM) accelerators are a promising solution for accelerating Machine Learning (ML) workloads, as they perform Matrix-Vector Multiplications (MVMs) on crossbar arrays directly in memory. Although the bit widths of the…

Machine Learning · Computer Science 2026-03-20 Rebecca Pelke , Joel Klein , Jose Cubero-Cascante , Nils Bosbach , Jan Moritz Joseph , Rainer Leupers

Matrix multiplications between asymmetric bit-width operands, especially between 8- and 4-bit operands, are likely to become a fundamental kernel of many important workloads including neural networks and machine learning. While existing SIMD…

Machine Learning · Computer Science 2020-08-04 Dibakar Gope , Jesse Beu , Matthew Mattina

The ever-increasing quest for data-level parallelism and variable precision in ubiquitous multimedia and Deep Neural Network (DNN) applications has motivated the use of Single Instruction, Multiple Data (SIMD) architectures. To alleviate…

Hardware Architecture · Computer Science 2020-11-03 Zahra Ebrahimi , Salim Ullah , Akash Kumar

Multiplication is an indispensable operation in most digital signal processing systems. Recently, many systems need to execute different types of algorithms on a multiplier. Therefore, it needs complicated computation and large area…

Hardware Architecture · Computer Science 2019-07-23 Seungbum Baek

This paper describes several new improvements of modular arithmetic and how to exploit them in order to gain more efficient implementations of commonly used algorithms, especially in cryptographic applications. We further present a new…

Cryptography and Security · Computer Science 2013-10-15 Wilke Trei

The ever-increasing size and computational complexity of today's machine-learning algorithms pose an increasing strain on the underlying hardware. In this light, novel and dedicated architectural solutions are required to optimize energy…

Hardware Architecture · Computer Science 2022-12-20 Pengbo Yu , Alexandre Levisse , Mohit Gupta , Evenblij Timon , Giovanni Ansaloni , Francky Catthoor , David Atienza

Single instruction, multiple data (SIMD) is a popular design style of in-memory computing (IMC) architectures, which enables memory arrays to perform logic operations to achieve low energy consumption and high parallelism. To implement a…

Emerging Technologies · Computer Science 2024-12-04 Xingyue Qian , Chen Nie , Zhezhi He , Weikang Qian

In this work, faster unsigned multiplication has been achieved by combining the High Performance Multiplication [HPM] column reduction technique with an N-bit multiplier implemented using 4 N/2-bit multipliers (recursive…

Hardware Architecture · Computer Science 2011-10-20 V. Sreedeep , B. Ramkumar , Harish M Kittur

Approximate multipliers are being widely advocated for energy-efficient computing in applications that exhibit an inherent tolerance to inaccuracy. However, the inclusion of accuracy as a key design parameter, besides the performance, area…

Emerging Technologies · Computer Science 2018-03-20 Mahmoud Masadeh , Osman Hasan , Sofiene Tahar

This work presents a method to maximize power-efficiency of fixed point multiplier units by decomposing them into sub-components. First, an encoder block converts the operands from a two's complement to a sign magnitude representation,…

Neural and Evolutionary Computing · Computer Science 2025-07-25 Felix Arnold , Maxence Bouvier , Ryan Amaudruz , Renzo Andri , Lukas Cavigelli

Vector multiplication is a fundamental operation for AI acceleration, responsible for over 85% of computational load in convolution tasks. While essential, these operations are primary drivers of area, power, and delay in modern datapath…

Hardware Architecture · Computer Science 2026-02-24 Md Rownak Hossain Chowdhury , Mostafizur Rahman

While reduction in feature size makes computation cheaper in terms of latency, area, and power consumption, performance of emerging data-intensive applications is determined by data movement. These trends have introduced the concept of…

Hardware Architecture · Computer Science 2018-03-19 Bahar Asgari , Saibal Mukhopadhyay , Sudhakar Yalamanchili