Related papers: A Parameterizable Convolution Accelerator for Embe…

FPGA-based Acceleration for Convolutional Neural Networks: A Comprehensive Review

Convolutional Neural Networks (CNNs) are fundamental to deep learning, driving applications across various domains. However, their growing complexity has significantly increased computational demands, necessitating efficient hardware…

Machine Learning · Computer Science 2025-05-21 Junye Jiang , Yaan Zhou , Yuanhao Gong , Haoxuan Yuan , Shuanglong Liu

Software-Defined FPGA Accelerator Design for Mobile Deep Learning Applications

Recently, the field of deep learning has received great attention by the scientific community and it is used to provide improved solutions to many computer vision problems. Convolutional neural networks (CNNs) have been successfully used to…

Computer Vision and Pattern Recognition · Computer Science 2019-03-26 Panagiotis G. Mousouliotis , Loukas P. Petrou

Optimization of FPGA-based CNN Accelerators Using Metaheuristics

In recent years, convolutional neural networks (CNNs) have demonstrated their ability to solve problems in many fields and with accuracy that was not possible before. However, this comes with extensive computational requirements, which made…

Neural and Evolutionary Computing · Computer Science 2022-09-26 Sadiq M. Sait , Aiman El-Maleh , Mohammad Altakrouri , Ahmad Shawahna

A Design Methodology for Efficient Implementation of Deconvolutional Neural Networks on an FPGA

In recent years deep learning algorithms have shown extremely high performance on machine learning tasks such as image classification and speech recognition. In support of such applications, various FPGA accelerator architectures have been…

Machine Learning · Computer Science 2017-05-09 Xinyu Zhang , Srinjoy Das , Ojash Neopane , Ken Kreutz-Delgado

FPGA deep learning acceleration based on convolutional neural network

In view of the large amount of calculation and long calculation time of convolutional neural network (CNN), this paper proposes a convolutional neural network hardware accelerator based on field programmable logic gate array (FPGA). First,…

Hardware Architecture · Computer Science 2020-12-08 Xiong Jun

Hardware-Efficient Template-Based Deep CNNs Accelerator Design

Acceleration of Convolutional Neural Network (CNN) on edge devices has recently achieved a remarkable performance in image classification and object detection applications. This paper proposes an efficient and scalable CNN-based SoC-FPGA…

Hardware Architecture · Computer Science 2022-07-29 Azzam Alhussain , Mingjie Lin

A CNN Accelerator on FPGA Using Depthwise Separable Convolution

Convolutional neural networks (CNNs) have been widely deployed in the fields of computer vision and pattern recognition because of their high accuracy. However, large convolution operations are computing-intensive that often requires a…

Signal Processing · Electrical Eng. & Systems 2018-09-10 Lin Bai , Yiming Zhao , Xinming Huang

FPGA-based Accelerators of Deep Learning Networks for Learning and Classification: A Review

Due to recent advances in digital technologies, and availability of credible data, an area of artificial intelligence, deep learning, has emerged, and has demonstrated its ability and effectiveness in solving complex learning problems not…

Neural and Evolutionary Computing · Computer Science 2019-01-03 Ahmad Shawahna , Sadiq M. Sait , Aiman El-Maleh

Reconfigurable co-processor architecture with limited numerical precision to accelerate deep convolutional neural networks

Convolutional Neural Networks (CNNs) are widely used in deep learning applications, e.g. visual systems, robotics etc. However, existing software solutions are not efficient. Therefore, many hardware accelerators have been proposed…

Machine Learning · Computer Science 2021-09-08 Sasindu Wijeratne , Sandaruwan Jayaweera , Mahesh Dananjaya , Ajith Pasqual

TinyCNN: A Tiny Modular CNN Accelerator for Embedded FPGA

In recent years, Convolutional Neural Network (CNN) based methods have achieved great success in a large number of applications and have been among the most powerful and widely used techniques in computer vision. However, CNN-based methods…

Machine Learning · Computer Science 2019-11-18 Ali Jahanshahi

PipeCNN: An OpenCL-Based FPGA Accelerator for Large-Scale Convolution Neuron Networks

Convolutional neural networks (CNNs) have been widely employed in many applications such as image classification, video analysis and speech recognition. Being compute-intensive, CNN computations are mainly accelerated by GPUs with high…

Hardware Architecture · Computer Science 2016-11-09 Dong Wang , Jianjing An , Ke Xu

A flexible FPGA accelerator for convolutional neural networks

Though CNNs are highly parallel workloads, in the absence of efficient on-chip memory reuse techniques, an accelerator for them quickly becomes memory bound. In this paper, we propose a CNN accelerator design for inference that is able to…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-08-26 Kingshuk Majumder , Shubham Nema , Uday Bondhugula

A scalable and efficient convolutional neural network accelerator using HLS for a System on Chip design

This paper presents a configurable Convolutional Neural Network Accelerator (CNNA) for a System on Chip design (SoC). The goal was to accelerate inference of different deep learning networks on an embedded SoC platform. The presented CNNA…

Computer Vision and Pattern Recognition · Computer Science 2020-10-08 Kim Bjerge , Jonathan Horsted Schougaard , Daniel Ejnar Larsen

Design of High-Throughput Mixed-Precision CNN Accelerators on FPGA

Convolutional Neural Networks (CNNs) reach high accuracies in various application domains, but require large amounts of computation and incur costly data movements. One method to decrease these costs while trading accuracy is weight and/or…

Hardware Architecture · Computer Science 2022-08-10 Cecilia Latotzke , Tim Ciesielski , Tobias Gemmeke

Accelerating CNN inference on FPGAs: A Survey

Convolutional Neural Networks (CNNs) are currently adopted to solve an ever greater number of problems, ranging from speech recognition to image classification and segmentation. The large amount of processing required by CNNs calls for…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-06-06 Kamel Abdelouahab , Maxime Pelcat , Jocelyn Serot , François Berry

Impl\'ementation Efficiente de Fonctions de Convolution sur FPGA \`a l'Aide de Blocs Param\'etrables et d'Approximations Polynomiales

Implementing convolutional neural networks (CNNs) on field-programmable gate arrays (FPGAs) has emerged as a promising alternative to GPUs, offering lower latency, greater power efficiency and greater flexibility. However, this development…

Hardware Architecture · Computer Science 2025-10-21 Philippe Magalhães , Virginie Fresse , Benoît Suffran , Olivier Alata

Automatic Compiler Based FPGA Accelerator for CNN Training

Training of convolutional neural networks (CNNs)on embedded platforms to support on-device learning is earning vital importance in recent days. Designing flexible training hard-ware is much more challenging than inference hardware, due to…

Machine Learning · Computer Science 2019-08-20 Shreyas Kolala Venkataramanaiah , Yufei Ma , Shihui Yin , Eriko Nurvithadhi , Aravind Dasu , Yu Cao , Jae-sun Seo

Continuous-Flow Data-Rate-Aware CNN Inference on FPGA

Among hardware accelerators for deep-learning inference, data flow implementations offer low latency and high throughput capabilities. In these architectures, each neuron is mapped to a dedicated hardware unit, making them well-suited for…

Machine Learning · Computer Science 2026-03-10 Tobias Habermann , Michael Mecik , Zhenyu Wang , César David Vera , Martin Kumm , Mario Garrido

Improving Performance Estimation for FPGA-based Accelerators for Convolutional Neural Networks

Field-programmable gate array (FPGA) based accelerators are being widely used for acceleration of convolutional neural networks (CNNs) due to their potential in improving the performance and reconfigurability for specific application…

Image and Video Processing · Electrical Eng. & Systems 2020-02-04 Martin Ferianc , Hongxiang Fan , Ringo S. W. Chu , Jakub Stano , Wayne Luk

A Data-Center FPGA Acceleration Platform for Convolutional Neural Networks

Intensive computation is entering data centers with multiple workloads of deep learning. To balance the compute efficiency, performance, and total cost of ownership (TCO), the use of a field-programmable gate array (FPGA) with…

Computer Vision and Pattern Recognition · Computer Science 2019-09-19 Xiaoyu Yu , Yuwei Wang , Jie Miao , Ephrem Wu , Heng Zhang , Yu Meng , Bo Zhang , Biao Min , Dewei Chen , Jianlin Gao