Related papers: An FPGA-based Solution for Convolution Operation A…

FPGA deep learning acceleration based on convolutional neural network

In view of the large amount of computation and long computation time of convolutional neural networks (CNNs), this paper proposes a CNN hardware accelerator based on a field-programmable gate array (FPGA). First,…

Hardware Architecture · Computer Science 2020-12-08 Xiong Jun

A Data-Center FPGA Acceleration Platform for Convolutional Neural Networks

Intensive computation is entering data centers with multiple workloads of deep learning. To balance the compute efficiency, performance, and total cost of ownership (TCO), the use of a field-programmable gate array (FPGA) with…

Computer Vision and Pattern Recognition · Computer Science 2019-09-19 Xiaoyu Yu, Yuwei Wang, Jie Miao, Ephrem Wu, Heng Zhang, Yu Meng, Bo Zhang, Biao Min, Dewei Chen, Jianlin Gao

Algorithm-hardware Co-design for Deformable Convolution

FPGAs provide a flexible and efficient platform to accelerate rapidly-changing algorithms for computer vision. The majority of existing work focuses on accelerating image classification, while other fundamental vision problems, including…

Image and Video Processing · Electrical Eng. & Systems 2020-03-25 Qijing Huang, Dequan Wang, Yizhao Gao, Yaohui Cai, Zhen Dong, Bichen Wu, Kurt Keutzer, John Wawrzynek

A Reconfigurable Vector Instruction Processor for Accelerating a Convection Parametrization Model on FPGAs

High Performance Computing (HPC) platforms allow scientists to model computationally intensive algorithms. HPC clusters increasingly use General-Purpose Graphics Processing Units (GPGPUs) as accelerators; FPGAs provide an attractive…

Hardware Architecture · Computer Science 2015-04-20 Syed Waqar Nabi, Saji N. Hameed, Wim Vanderbauwhede

FPGA-based Acceleration for Convolutional Neural Networks: A Comprehensive Review

Convolutional Neural Networks (CNNs) are fundamental to deep learning, driving applications across various domains. However, their growing complexity has significantly increased computational demands, necessitating efficient hardware…

Machine Learning · Computer Science 2025-05-21 Junye Jiang, Yaan Zhou, Yuanhao Gong, Haoxuan Yuan, Shuanglong Liu

An FPGA-Based Accelerator Enabling Efficient Support for CNNs with Arbitrary Kernel Sizes

Convolutional neural networks (CNNs) with large kernels, drawing inspiration from the key operations of vision transformers (ViTs), have demonstrated impressive performance in various vision-based applications. To address the issue of…

Hardware Architecture · Computer Science 2024-02-23 Miaoxin Wang, Xiao Wu, Jun Lin, Zhongfeng Wang

Overview of FPGA deep learning acceleration based on convolutional neural network

In recent years, deep learning has become more and more mature, and as a commonly used algorithm in deep learning, convolutional neural networks have been widely used in various visual tasks. In the past, research based on deep learning…

Artificial Intelligence · Computer Science 2020-12-24 Simin Liu

FPGA-Optimized Hardware Accelerator for Fast Fourier Transform and Singular Value Decomposition in AI

This research introduces an FPGA-based hardware accelerator to optimize the Singular Value Decomposition (SVD) and Fast Fourier transform (FFT) operations in AI models. The proposed design aims to improve processing speed and reduce…

Hardware Architecture · Computer Science 2025-04-15 Hong Ding, Chia Chao Kang, SuYang Xi, Zehang Liu, Xuan Zhang, Yi Ding

FPGA Based Accelerator for Neural Networks Computation with Flexible Pipelining

FPGAs are well suited to fixed-point neural network computation owing to their high power efficiency and configurability. However, a design must be intensively refined to achieve high performance using limited hardware resources. We present an…

Hardware Architecture · Computer Science 2022-01-03 Qingyang Yi, Heming Sun, Masahiro Fujita

A Resource-Driven Approach for Implementing CNNs on FPGAs Using Adaptive IPs

The increasing demand for real-time, low-latency artificial intelligence applications has propelled the use of Field-Programmable Gate Arrays (FPGAs) for Convolutional Neural Network (CNN) implementations. FPGAs offer reconfigurability,…

Hardware Architecture · Computer Science 2025-10-06 Philippe Magalhães, Virginie Fresse, Benoît Suffran, Olivier Alata

A GPU-Outperforming FPGA Accelerator Architecture for Binary Convolutional Neural Networks

FPGA-based hardware accelerators for convolutional neural networks (CNNs) have attracted great attention due to their higher energy efficiency than GPUs. However, it is challenging for FPGA-based solutions to achieve a higher throughput…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-06-09 Yixing Li, Zichuan Liu, Kai Xu, Hao Yu, Fengbo Ren

FPGA/DNN Co-Design: An Efficient Design Methodology for IoT Intelligence on the Edge

While embedded FPGAs are attractive platforms for DNN acceleration on edge-devices due to their low latency and high energy efficiency, the scarcity of resources of edge-scale FPGA devices also makes it challenging for DNN deployment. In…

Computer Vision and Pattern Recognition · Computer Science 2019-04-10 Cong Hao, Xiaofan Zhang, Yuhong Li, Sitao Huang, Jinjun Xiong, Kyle Rupnow, Wen-mei Hwu, Deming Chen

A Competitive Edge: Can FPGAs Beat GPUs at DCNN Inference Acceleration in Resource-Limited Edge Computing Applications?

When trained as generative models, Deep Learning algorithms have shown exceptional performance on tasks involving high dimensional data such as image denoising and super-resolution. In an increasingly connected world dominated by mobile and…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-03-10 Ian Colbert, Jake Daly, Ken Kreutz-Delgado, Srinjoy Das

Are FPGAs Suitable for Edge Computing?

The rapid growth of Internet-of-Things (IoT) and artificial intelligence applications has called forth a new computing paradigm: edge computing. In this paper, we study the suitability of deploying FPGAs for edge computing from the…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-04-19 Saman Biookaghazadeh, Fengbo Ren, Ming Zhao

A Soft Processor Overlay with Tightly-coupled FPGA Accelerator

FPGA overlays are commonly implemented as coarse-grained reconfigurable architectures with a goal to improve designers' productivity through balancing flexibility and ease of configuration of the underlying fabric. To truly facilitate full…

Hardware Architecture · Computer Science 2016-06-22 Ho-Cheung Ng, Cheng Liu, Hayden Kwok-Hay So

Enabling OpenMP Task Parallelism on Multi-FPGAs

FPGA-based hardware accelerators have received increasing attention mainly due to their ability to accelerate deep pipelined applications, thus resulting in higher computational performance and energy efficiency. Nevertheless, the amount of…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-03-23 R. Nepomuceno, R. Sterle, G. Valarini, M. Pereira, H. Yviquel, G. Araujo

It's all about data movement: Optimising FPGA data access to boost performance

The use of reconfigurable computing, and FPGAs in particular, to accelerate computational kernels has the potential to be of great benefit to scientific codes and the HPC community in general. However, whilst recent advances in FPGA tooling…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-10-06 Nick Brown, David Dolman

FFCNN: Fast FPGA based Acceleration for Convolution neural network inference

We present a new, efficient OpenCL-based accelerator for large-scale Convolutional Neural Networks called Fast Inference on FPGAs for Convolution Neural Network (FFCNN). FFCNN is based on a deeply pipelined OpenCL kernel architecture. As…

Machine Learning · Computer Science 2022-08-30 F. Keddous, H-N. Nguyen, A. Nakib

PipeCNN: An OpenCL-Based FPGA Accelerator for Large-Scale Convolution Neuron Networks

Convolutional neural networks (CNNs) have been widely employed in many applications such as image classification, video analysis and speech recognition. Being compute-intensive, CNN computations are mainly accelerated by GPUs with high…

Hardware Architecture · Computer Science 2016-11-09 Dong Wang, Jianjing An, Ke Xu

Improving Performance Estimation for FPGA-based Accelerators for Convolutional Neural Networks

Field-programmable gate array (FPGA) based accelerators are being widely used for acceleration of convolutional neural networks (CNNs) due to their potential in improving the performance and reconfigurability for specific application…

Image and Video Processing · Electrical Eng. & Systems 2020-02-04 Martin Ferianc, Hongxiang Fan, Ringo S. W. Chu, Jakub Stano, Wayne Luk