Related papers: Algorithm-hardware Co-design for Deformable Convol…

CoDeNet: Efficient Deployment of Input-Adaptive Object Detection on Embedded FPGAs

Deploying deep learning models on embedded systems has been challenging due to limited computing resources. The majority of existing work focuses on accelerating image classification, while other fundamental vision problems, such as object…

Computer Vision and Pattern Recognition · Computer Science 2021-01-27 Zhen Dong, Dequan Wang, Qijing Huang, Yizhao Gao, Yaohui Cai, Tian Li, Bichen Wu, Kurt Keutzer, John Wawrzynek

An Efficient Accelerator Design Methodology for Deformable Convolutional Networks

Deformable convolutional networks have demonstrated outstanding performance in object recognition tasks with an effective feature extraction. Unlike standard convolution, the deformable convolution decides the receptive field size using…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-06-16 Saehyun Ahn, Jung-Woo Chang, Suk-Ju Kang

Overview of FPGA deep learning acceleration based on convolutional neural network

In recent years, deep learning has become more and more mature, and as a commonly used algorithm in deep learning, convolutional neural networks have been widely used in various visual tasks. In the past, research based on deep learning…

Artificial Intelligence · Computer Science 2020-12-24 Simin Liu

Design space exploration for image processing architectures on FPGA targets

Due to the emergence of embedded applications in image and video processing, communication and cryptography, improvement of pictorial information for better human perception like deblurring, denoising in several fields such as satellite…

Hardware Architecture · Computer Science 2014-04-16 Chandrajit Pal, Avik Kotal, Asit Samanta, Amlan Chakrabarti, Ranjan Ghosh

Real Time FPGA Based CNNs for Detection, Classification, and Tracking in Autonomous Systems: State of the Art Designs and Optimizations

This paper presents a comprehensive review of recent advances in deploying convolutional neural networks (CNNs) for object detection, classification, and tracking on Field Programmable Gate Arrays (FPGAs). With the increasing demand for…

Hardware Architecture · Computer Science 2025-09-05 Safa Mohammed Sali, Mahmoud Meribout, Ashiyana Abdul Majeed

Fixed-Point Convolutional Neural Network for Real-Time Video Processing in FPGA

Modern mobile neural networks with a reduced number of weights and parameters do a good job with image classification tasks, but even they may be too complex to be implemented in an FPGA for video processing tasks. The article proposes…

Computer Vision and Pattern Recognition · Computer Science 2020-12-04 Roman Solovyev, Alexander Kustov, Dmitry Telpukhov, Vladimir Rukhlov, Alexandr Kalinin

An FPGA-based Solution for Convolution Operation Acceleration

Hardware-based acceleration is an extensive attempt to facilitate many computationally-intensive mathematics operations. This paper proposes an FPGA-based architecture to accelerate the convolution operation - a complex and expensive…

Hardware Architecture · Computer Science 2023-02-28 Trung Dinh Pham, Bao Gia Bach, Lam Trinh Luu, Minh Dinh Nguyen, Hai Duc Pham, Khoa Bui Anh, Xuan Quang Nguyen, Cuong Pham Quoc

Real-Time Image Distortion Correction: Analysis and Evaluation of FPGA-Compatible Algorithms

Image distortion correction is a critical pre-processing step for a variety of computer vision and image processing algorithms. Standard real-time software implementations are generally not suited for direct hardware porting, so…

Computer Vision and Pattern Recognition · Computer Science 2016-11-01 Paolo Di Febbo, Stefano Mattoccia, Carlo Dal Mutto

fpgaConvNet: A Toolflow for Mapping Diverse Convolutional Neural Networks on Embedded FPGAs

In recent years, Convolutional Neural Networks (ConvNets) have become an enabling technology for a wide range of novel embedded Artificial Intelligence systems. Across the range of applications, the performance needs vary significantly,…

Computer Vision and Pattern Recognition · Computer Science 2017-11-27 Stylianos I. Venieris, Christos-Savvas Bouganis

FPGA-based Acceleration for Convolutional Neural Networks: A Comprehensive Review

Convolutional Neural Networks (CNNs) are fundamental to deep learning, driving applications across various domains. However, their growing complexity has significantly increased computational demands, necessitating efficient hardware…

Machine Learning · Computer Science 2025-05-21 Junye Jiang, Yaan Zhou, Yuanhao Gong, Haoxuan Yuan, Shuanglong Liu

Deformable ConvNets v2: More Deformable, Better Results

The superior performance of Deformable Convolutional Networks arises from its ability to adapt to the geometric variations of objects. Through an examination of its adaptive behavior, we observe that while the spatial support for its neural…

Computer Vision and Pattern Recognition · Computer Science 2018-11-29 Xizhou Zhu, Han Hu, Stephen Lin, Jifeng Dai

Real-Time Image Processing Algorithms for Embedded Systems

Embedded vision systems need efficient and robust image processing algorithms to perform real-time, with resource-constrained hardware. This research investigates image processing algorithms, specifically edge detection, corner detection,…

Image and Video Processing · Electrical Eng. & Systems 2026-01-13 Soundes Oumaima Boufaida, Abdemadjid Benmachiche, Majda Maatallah

A Competitive Edge: Can FPGAs Beat GPUs at DCNN Inference Acceleration in Resource-Limited Edge Computing Applications?

When trained as generative models, Deep Learning algorithms have shown exceptional performance on tasks involving high dimensional data such as image denoising and super-resolution. In an increasingly connected world dominated by mobile and…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-03-10 Ian Colbert, Jake Daly, Ken Kreutz-Delgado, Srinjoy Das

Towards On-Chip Optical FFTs for Convolutional Neural Networks

Convolutional neural networks have become an essential element of spatial deep learning systems. In the prevailing architecture, the convolution operation is performed with Fast Fourier Transforms (FFT) electronically in GPUs. The…

Emerging Technologies · Computer Science 2017-09-01 Jonathan George, Hani Nejadriahi, Volker Sorger

Event-based vision on FPGAs -- a survey

In recent years there has been a growing interest in event cameras, i.e. vision sensors that record changes in illumination independently for each pixel. This type of operation ensures that acquisition is possible in very adverse lighting…

Computer Vision and Pattern Recognition · Computer Science 2025-03-11 Tomasz Kryjak

Real Time FPGA Based Transformers & VLMs for Vision Tasks: SOTA Designs and Optimizations

Transformers and vision-language models (VLMs) have emerged as dominant architectures in computer vision and multimodal AI, offering state-of-the-art performance in tasks such as image classification, object detection, visual question…

Hardware Architecture · Computer Science 2025-09-05 Safa Mohammed Sali, Mahmoud Meribout, Ashiyana Abdul Majeed

Accelerating CNN inference on FPGAs: A Survey

Convolutional Neural Networks (CNNs) are currently adopted to solve an ever greater number of problems, ranging from speech recognition to image classification and segmentation. The large amount of processing required by CNNs calls for…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-06-06 Kamel Abdelouahab, Maxime Pelcat, Jocelyn Serot, François Berry

GPU Acceleration of Image Convolution using Spatially-varying Kernel

Image subtraction in astronomy is a tool for transient object discovery such as asteroids, extra-solar planets and supernovae. To match point spread functions (PSFs) between images of the same field taken at different times a convolution…

Instrumentation and Methods for Astrophysics · Physics 2013-05-30 Steven Hartung, Hemant Shukla, J. Patrick Miller, Carlton Pennypacker

FPGA-based Accelerators of Deep Learning Networks for Learning and Classification: A Review

Due to recent advances in digital technologies, and availability of credible data, an area of artificial intelligence, deep learning, has emerged, and has demonstrated its ability and effectiveness in solving complex learning problems not…

Neural and Evolutionary Computing · Computer Science 2019-01-03 Ahmad Shawahna, Sadiq M. Sait, Aiman El-Maleh

Beyond the GPU: The Strategic Role of FPGAs in the Next Wave of AI

AI acceleration has been dominated by GPUs, but the growing need for lower latency, energy efficiency, and fine-grained hardware control exposes the limits of fixed architectures. In this context, Field-Programmable Gate Arrays (FPGAs)…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-18 Arturo Urías Jiménez