Related papers: Fast Convolutional Nets With fbfft: A GPU Performa…

200 papers

Convolutional networks are one of the most widely employed architectures in computer vision and machine learning. In order to leverage their ability to learn complex functions, large amounts of data are required for training. Training a…

Computer Vision and Pattern Recognition · Computer Science 2015-06-09 Michael Mathieu, Mikael Henaff, Yann LeCun

Fast Fourier Transform (FFT) is an essential tool in scientific and engineering computation. The increasing demand for mixed-precision FFT has made it possible to utilize half-precision floating-point (FP16) arithmetic for faster speed and…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-04-26 Binrui Li, Shenggan Cheng, James Lin

Convolutional neural networks have become an essential element of spatial deep learning systems. In the prevailing architecture, the convolution operation is performed with Fast Fourier Transforms (FFT) electronically in GPUs. The…

Emerging Technologies · Computer Science 2017-09-01 Jonathan George, Hani Nejadriahi, Volker Sorger

This paper proposes to use Fast Fourier Transformation-based U-Net (a refined fully convolutional network) and perform image convolution in neural networks. Leveraging the Fast Fourier Transformation, it reduces the image convolution costs…

Computer Vision and Pattern Recognition · Computer Science 2020-10-12 Varsha Nair, Moitrayee Chatterjee, Neda Tavakoli, Akbar Siami Namin, Craig Snoeyink

GPU-based fast Fourier transform (FFT) is extremely important for scientific computing and signal processing. However, we find the inefficiency of existing FFT libraries and the absence of fault tolerance against soft error. To address…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-12-10 Shixun Wu, Yujia Zhai, Jinyang Liu, Jiajun Huang, Zizhe Jian, Huangliang Dai, Sheng Di, Franck Cappello, Zizhong Chen

Convolutions are the core operation of deep learning applications based on Convolutional Neural Networks (CNNs). Current GPU architectures are highly efficient for training and deploying deep CNNs, and hence, these are largely used in…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-10-28 Marc Jordà, Pedro Valero-Lara, Antonio J. Peña

The technologically-relevant task of feature extraction from data performed in deep-learning systems is routinely accomplished as repeated fast Fourier transforms (FFT) electronically in prevalent domain-specific architectures such as in…

Convolutional neural networks (CNNs) have a large number of variables and hence suffer from a complexity problem for their implementation. Different methods and techniques have been developed to alleviate the problem of CNN's complexity, such as…

Computer Vision and Pattern Recognition · Computer Science 2020-04-07 Kamran Chitsaz, Mohsen Hajabdollahi, Nader Karimi, Shadrokh Samavi, Shahram Shirani

Deep Convolutional Neural Networks have become a Swiss knife in solving critical artificial intelligence tasks. However, deploying deep CNN models for latency-critical tasks remains challenging because of the complex nature of CNNs.…

Computer Vision and Pattern Recognition · Computer Science 2018-03-28 Chuanhao Zhuge, Xinheng Liu, Xiaofan Zhang, Sudeep Gummadi, Jinjun Xiong, Deming Chen

FPGA-based hardware accelerators for convolutional neural networks (CNNs) have obtained great attention due to their higher energy efficiency than GPUs. However, it is challenging for FPGA-based solutions to achieve a higher throughput…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-06-09 Yixing Li, Zichuan Liu, Kai Xu, Hao Yu, Fengbo Ren

Winograd-based convolution has quickly gained traction as a preferred approach to implement convolutional neural networks (ConvNet) on various hardware platforms because it requires fewer floating point operations than FFT-based or direct…

Performance · Computer Science 2018-09-24 Aleksandar Zlateski, Zhen Jia, Kai Li, Fredo Durand

The convolution computation is widely used in many fields, especially in CNNs. Because of the rapid growth of the training data in CNNs, GPUs have been used for the acceleration, and memory-efficient algorithms are focused because of their…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-12-02 Qiong Chang, Masaki Onishi, Tsutomu Maruyama

Convolutional Neural Networks (CNNs) are fundamental to deep learning, driving applications across various domains. However, their growing complexity has significantly increased computational demands, necessitating efficient hardware…

Machine Learning · Computer Science 2025-05-21 Junye Jiang, Yaan Zhou, Yuanhao Gong, Haoxuan Yuan, Shuanglong Liu

The neural network and quantum computing are both significant and appealing fields, with their interactive disciplines promising for large-scale computing tasks that are untackled by conventional computers. However, both developments are…

Quantum Physics · Physics 2021-06-22 Feihong Shen, Jun Liu

Recent research in deep learning (DL) has investigated the use of the Fast Fourier Transform (FFT) to accelerate the computations involved in Convolutional Neural Networks (CNNs) by replacing spatial convolution with element-wise…

Computer Vision and Pattern Recognition · Computer Science 2024-06-05 Eduardo Reis, Thangarajah Akilan, Mohammed Khalid

Convolutional neural network (CNN) is an important deep learning method. The convolution operation takes a large proportion of the total execution time for CNN. Feature maps for convolution operation are usually sparse. Multiplications and…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-08-01 Weizhi Xu, Yintai Sun, fhengyu Fan, Hui Yu, Xin Fu

In recent years, Convolutional Neural Networks (ConvNets) have become an enabling technology for a wide range of novel embedded Artificial Intelligence systems. Across the range of applications, the performance needs vary significantly,…

Computer Vision and Pattern Recognition · Computer Science 2017-11-27 Stylianos I. Venieris, Christos-Savvas Bouganis

We present a new library for parallel distributed Fast Fourier Transforms (FFT). The importance of FFT in science and engineering and the advances in high performance computing necessitate further improvements. AccFFT extends existing FFT…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-05-27 Amir Gholami, Judith Hill, Dhairya Malhotra, George Biros

Conventional 3D convolutional neural networks (CNNs) are computationally expensive, memory intensive, prone to overfitting, and most importantly, there is a need to improve their feature learning capabilities. To address these issues, we…

Computer Vision and Pattern Recognition · Computer Science 2021-05-05 Sudhakar Kumawat, Manisha Verma, Yuta Nakashima, Shanmuganathan Raman

Deep learning has delivered its powerfulness in many application domains, especially in image and speech recognition. As the backbone of deep learning, deep neural networks (DNNs) consist of multiple layers of various types with hundreds to…

Machine Learning · Computer Science 2017-12-14 Sheng Lin, Ning Liu, Mahdi Nazemi, Hongjia Li, Caiwen Ding, Yanzhi Wang, Massoud Pedram