Related papers: From Circuits to SoC Processors: Arithmetic Approx…

Approximate Computing Survey, Part II: Application-Specific & Architectural Approximation Techniques and Applications

The challenging deployment of compute-intensive applications from domains such as Artificial Intelligence (AI) and Digital Signal Processing (DSP) forces the community of computing systems to explore new design approaches. Approximate…

Hardware Architecture · Computer Science 2025-03-21 Vasileios Leon, Muhammad Abdullah Hanif, Giorgos Armeniakos, Xun Jiao, Muhammad Shafique, Kiamal Pekmestzi, Dimitrios Soudris

Approximate Computing Survey, Part I: Terminology and Software & Hardware Approximation Techniques

The rapid growth of demanding applications in domains applying multimedia processing and machine learning has marked a new era for edge and cloud computing. These applications involve massive data and compute-intensive tasks, and thus,…

Hardware Architecture · Computer Science 2025-03-21 Vasileios Leon, Muhammad Abdullah Hanif, Giorgos Armeniakos, Xun Jiao, Muhammad Shafique, Kiamal Pekmestzi, Dimitrios Soudris

ApproxGNN: A Pretrained GNN for Parameter Prediction in Design Space Exploration for Approximate Computing

Approximate computing offers promising energy efficiency benefits for error-tolerant applications, but discovering optimal approximations requires extensive design space exploration (DSE). Predicting the accuracy of circuits composed of…

Hardware Architecture · Computer Science 2026-03-20 Ondrej Vlcek, Vojtech Mrazek

A Survey on Approximate Multiplier Designs for Energy Efficiency: From Algorithms to Circuits

Given the stringent requirements of energy efficiency for Internet-of-Things edge devices, approximate multipliers, as a basic component of many processors and accelerators, have been constantly proposed and studied for decades, especially…

Hardware Architecture · Computer Science 2023-06-30 Ying Wu, Chuangtao Chen, Weihua Xiao, Xuan Wang, Chenyi Wen, Jie Han, Xunzhao Yin, Weikang Qian, Cheng Zhuo

Heterogeneous FPGA+GPU Embedded Systems: Challenges and Opportunities

The edge computing paradigm has emerged to handle cloud computing issues such as scalability, security and low response time among others. This new computing trend heavily relies on ubiquitous embedded systems on the edge. Performance and…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-01-28 Mohammad Hosseinabady, Mohd Amiruddin Bin Zainol, Jose Nunez-Yanez

ApproxFPGAs: Embracing ASIC-Based Approximate Arithmetic Components for FPGA-Based Systems

There has been abundant research on the development of Approximate Circuits (ACs) for ASICs. However, previous studies have illustrated that ASIC-based ACs offer asymmetrical gains in FPGA-based accelerators. Therefore, an AC that might be…

Hardware Architecture · Computer Science 2020-12-29 Bharath Srinivas Prabakaran, Vojtech Mrazek, Zdenek Vasicek, Lukas Sekanina, Muhammad Shafique

A Survey on Design Methodologies for Accelerating Deep Learning on Heterogeneous Architectures

Given their increasing size and complexity, the need for efficient execution of deep neural networks has become increasingly pressing in the design of heterogeneous High-Performance Computing (HPC) and edge platforms, leading to a wide…

Hardware Architecture · Computer Science 2025-05-23 Serena Curzel, Fabrizio Ferrandi, Leandro Fiorin, Daniele Ielmini, Cristina Silvano, Francesco Conti, Luca Bompani, Luca Benini, Enrico Calore, Sebastiano Fabio Schifano, Cristian Zambelli, Maurizio Palesi, Giuseppe Ascia, Enrico Russo, Valeria Cardellini, Salvatore Filippone, Francesco Lo Presti, Stefania Perri

TFApprox: Towards a Fast Emulation of DNN Approximate Hardware Accelerators on GPU

Energy efficiency of hardware accelerators of deep neural networks (DNN) can be improved by introducing approximate arithmetic circuits. In order to quantify the error introduced by using these circuits and avoid the expensive hardware…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-07-03 Filip Vaverka, Vojtech Mrazek, Zdenek Vasicek, Lukas Sekanina

Principled Approximation Methods for Efficient and Scalable Deep Learning

Recent progress in deep learning has been driven by increasingly larger models. However, their computational and energy demands have grown proportionally, creating significant barriers to their deployment and to a wider adoption of deep…

Machine Learning · Computer Science 2025-09-16 Pedro Savarese

HADES: Hardware/Algorithm Co-design in DNN accelerators using Energy-efficient Approximate Alphabet Set Multipliers

Edge computing must be capable of executing computationally intensive algorithms, such as Deep Neural Networks (DNNs), while operating within a constrained computational resource budget. Such computations involve Matrix Vector…

Hardware Architecture · Computer Science 2023-10-24 Arani Roy, Kaushik Roy

Deep Neural Network Approximation for Custom Hardware: Where We've Been, Where We're Going

Deep neural networks have proven to be particularly effective in visual and audio recognition tasks. Existing models tend to be computationally expensive and memory intensive, however, and so methods for hardware-oriented approximation have…

Computer Vision and Pattern Recognition · Computer Science 2019-07-09 Erwei Wang, James J. Davis, Ruizhe Zhao, Ho-Cheung Ng, Xinyu Niu, Wayne Luk, Peter Y. K. Cheung, George A. Constantinides

Hardware Approximate Techniques for Deep Neural Network Accelerators: A Survey

Deep Neural Networks (DNNs) are very popular because of their high performance in various cognitive tasks in Machine Learning (ML). Recent advancements in DNNs have brought beyond human accuracy in many tasks, but at the cost of high…

Hardware Architecture · Computer Science 2022-03-18 Giorgos Armeniakos, Georgios Zervakis, Dimitrios Soudris, Jörg Henkel

A Study of Performance Programming of CPU, GPU accelerated Computers and SIMD Architecture

Parallel computing is a standard approach to achieving high-performance computing (HPC). Three commonly used methods to implement parallel computing include: 1) applying multithreading technology on single-core or multi-core CPUs; 2)…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-09-18 Xinyao Yi

autoAx: An Automatic Design Space Exploration and Circuit Building Methodology utilizing Libraries of Approximate Components

Approximate computing is an emerging paradigm for developing highly energy-efficient computing systems such as various accelerators. In the literature, many libraries of elementary approximate circuits have already been proposed to simplify…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-06-13 Vojtech Mrazek, Muhammad Abdullah Hanif, Zdenek Vasicek, Lukas Sekanina, Muhammad Shafique

Near-Precise Parameter Approximation for Multiple Multiplications on A Single DSP Block

A multiply-accumulate (MAC) operation is the main computation unit for DSP applications. DSP blocks are one of the efficient solutions to implement MACs in FPGAs. However, since the DSP blocks have wide multiplier and adder blocks, MAC…

Hardware Architecture · Computer Science 2021-10-26 Ercan Kalali, Rene van Leuken

On Hardware-efficient Inference in Probabilistic Circuits

Probabilistic circuits (PCs) offer a promising avenue to perform embedded reasoning under uncertainty. They support efficient and exact computation of various probabilistic inference tasks by design. Hence, hardware-efficient computation of…

Machine Learning · Computer Science 2024-05-24 Lingyun Yao, Martin Trapp, Jelin Leslin, Gaurav Singh, Peng Zhang, Karthekeyan Periasamy, Martin Andraud

IMPLY-based Approximate Full Adders for Efficient Arithmetic Operations in Image Processing and Machine Learning

To overcome the performance limitations in modern computing, such as the power wall, emerging computing paradigms are gaining increasing importance. Approximate computing offers a promising solution by substantially enhancing energy…

Emerging Technologies · Computer Science 2024-12-23 Melanie Qiu, Caoyueshan Fan, Gulafshan, Salar Shakibhamedan, Fabian Seiler, Nima TaheriNejad

Approximate Early Output Asynchronous Adders Based on Dual-Rail Data Encoding and 4-Phase Return-to-Zero and Return-to-One Handshaking

Approximate computing is emerging as an alternative to accurate computing due to its potential for realizing digital circuits and systems with low power dissipation, less critical path delay, and less area occupancy for an acceptable…

Hardware Architecture · Computer Science 2018-01-19 P Balasubramanian

ASAP: Accelerated Short-Read Alignment on Programmable Hardware

The proliferation of high-throughput sequencing machines ensures rapid generation of up to billions of short nucleotide fragments in a short period of time. This massive amount of sequence data can quickly overwhelm today's storage and…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-11-13 Subho S. Banerjee, Mohamed El-Hadedy, Jong Bin Lim, Zbigniew T. Kalbarczyk, Deming Chen, Steve Lumetta, Ravishankar K. Iyer

Accelerating Exact and Approximate Inference for (Distributed) Discrete Optimization with GPUs

Discrete optimization is a central problem in artificial intelligence. The optimization of the aggregated cost of a network of cost functions arises in a variety of problems including (W)CSP, DCOP, as well as optimization in stochastic…

Artificial Intelligence · Computer Science 2018-01-12 Ferdinando Fioretto, Enrico Pontelli, William Yeoh, Rina Dechter