Related papers: Efficient and Generic 1D Dilated Convolution Layer…

Optimizing Memory Efficiency for Deep Convolutional Neural Networks on GPUs

Leveraging large data sets, deep Convolutional Neural Networks (CNNs) achieve state-of-the-art recognition accuracy. Due to the substantial compute and memory operations, however, they require significant execution time. The massive…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-10-13 Chao Li, Yi Yang, Min Feng, Srimat Chakradhar, Huiyang Zhou

Design of Efficient Convolutional Layers using Single Intra-channel Convolution, Topological Subdivisioning and Spatial "Bottleneck" Structure

Deep convolutional neural networks achieve remarkable visual recognition performance, at the cost of high computational complexity. In this paper, we propose a new design of efficient convolutional layers based on three schemes. The 3D…

Computer Vision and Pattern Recognition · Computer Science 2017-01-25 Min Wang, Baoyuan Liu, Hassan Foroosh

Chain-NN: An Energy-Efficient 1D Chain Architecture for Accelerating Deep Convolutional Neural Networks

Deep convolutional neural networks (CNN) have shown their good performances in many computer vision tasks. However, the high computational complexity of CNN involves a huge amount of data movements between the computational processor core…

Hardware Architecture · Computer Science 2017-03-07 Shihao Wang, Dajiang Zhou, Xushen Han, Takeshi Yoshimura

Unified Kernel-Segregated Transpose Convolution Operation

The optimization of the transpose convolution layer for deep learning applications is achieved with the kernel segregation mechanism. However, kernel segregation has disadvantages, such as computing extra elements to obtain the output…

Machine Learning · Computer Science 2025-03-03 Vijay Srinivas Tida, Md Imran Hossen, Liqun Shan, Sai Venkatesh Chilukoti, Sonya Hsu, Xiali Hei

1D Convolutional Neural Networks and Applications: A Survey

During the last decade, Convolutional Neural Networks (CNNs) have become the de facto standard for various Computer Vision and Machine Learning operations. CNNs are feed-forward Artificial Neural Networks (ANNs) with alternating…

Signal Processing · Electrical Eng. & Systems 2019-05-10 Serkan Kiranyaz, Onur Avci, Osama Abdeljaber, Turker Ince, Moncef Gabbouj, Daniel J. Inman

High Utilization Energy-Aware Real-Time Inference Deep Convolutional Neural Network Accelerator

Deep convolution Neural Network (DCNN) has been widely used in computer vision tasks. However, for edge devices even inference has too large computational complexity and data access amount. The inference latency of state-of-the-art models…

Hardware Architecture · Computer Science 2025-09-09 Kuan-Ting Lin, Ching-Te Chiu, Jheng-Yi Chang, Shi-Zong Huang, Yu-Ting Li

Deep Anchored Convolutional Neural Networks

Convolutional Neural Networks (CNNs) have been proven to be extremely successful at solving computer vision tasks. State-of-the-art methods favor such deep network architectures for its accuracy performance, with the cost of having massive…

Computer Vision and Pattern Recognition · Computer Science 2019-04-23 Jiahui Huang, Kshitij Dwivedi, Gemma Roig

Accelerating convolutional neural network by exploiting sparsity on GPUs

Convolutional neural network (CNN) is an important deep learning method. The convolution operation takes a large proportion of the total execution time for CNN. Feature maps for convolution operation are usually sparse. Multiplications and…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-08-01 Weizhi Xu, Yintai Sun, fhengyu Fan, Hui Yu, Xin Fu

Towards a General Purpose CNN for Long Range Dependencies in $N$D

The use of Convolutional Neural Networks (CNNs) is widespread in Deep Learning due to a range of desirable model properties which result in an efficient and effective machine learning framework. However, performant CNN architectures must be…

Machine Learning · Computer Science 2022-07-06 David W. Romero, David M. Knigge, Albert Gu, Erik J. Bekkers, Efstratios Gavves, Jakub M. Tomczak, Mark Hoogendoorn

RBCN: Rectified Binary Convolutional Networks for Enhancing the Performance of 1-bit DCNNs

Binarized convolutional neural networks (BCNNs) are widely used to improve memory and computation efficiency of deep convolutional neural networks (DCNNs) for mobile and AI chips based applications. However, current BCNNs are not able to…

Computer Vision and Pattern Recognition · Computer Science 2019-09-09 Chunlei Liu, Wenrui Ding, Xin Xia, Yuan Hu, Baochang Zhang, Jianzhuang Liu, Bohan Zhuang, Guodong Guo

Fast convolution kernels on pascal GPU with high memory efficiency

The convolution computation is widely used in many fields, especially in CNNs. Because of the rapid growth of the training data in CNNs, GPUs have been used for the acceleration, and memory-efficient algorithms are focused because of their…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-12-02 Qiong Chang, Masaki Onishi, Tsutomu Maruyama

Accelerating Very Deep Convolutional Networks for Classification and Detection

This paper aims to accelerate the test-time computation of convolutional neural networks (CNNs), especially very deep CNNs that have substantially impacted the computer vision community. Unlike previous methods that are designed for…

Computer Vision and Pattern Recognition · Computer Science 2015-11-19 Xiangyu Zhang, Jianhua Zou, Kaiming He, Jian Sun

Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation

We present techniques for speeding up the test-time evaluation of large convolutional networks, designed for object recognition tasks. These models deliver impressive accuracy but each image evaluation requires millions of floating point…

Computer Vision and Pattern Recognition · Computer Science 2024-03-15 Remi Denton, Wojciech Zaremba, Joan Bruna, Yann LeCun, Rob Fergus

Convolutional Dictionary Pair Learning Network for Image Representation Learning

Both the Dictionary Learning (DL) and Convolutional Neural Networks (CNN) are powerful image representation learning systems based on different mechanisms and principles, however whether we can seamlessly integrate them to improve the…

Computer Vision and Pattern Recognition · Computer Science 2020-01-16 Zhao Zhang, Yulin Sun, Yang Wang, Zhengjun Zha, Shuicheng Yan, Meng Wang

Anatomy Of High-Performance Deep Learning Convolutions On SIMD Architectures

Convolution layers are prevalent in many classes of deep neural networks, including Convolutional Neural Networks (CNNs) which provide state-of-the-art results for tasks like image recognition, neural machine translation and speech…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-08-22 Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj Kalamkar, Greg Henry, Hans Pabst, Alexander Heinecke

IC Networks: Remodeling the Basic Unit for Convolutional Neural Networks

Convolutional neural network (CNN) is a class of artificial neural networks widely used in computer vision tasks. Most CNNs achieve excellent performance by stacking certain types of basic units. In addition to increasing the depth and…

Computer Vision and Pattern Recognition · Computer Science 2021-02-09 Junyi An, Fengshan Liu, Jian Zhao, Furao Shen

CARLA: A Convolution Accelerator with a Reconfigurable and Low-Energy Architecture

Convolutional Neural Networks (CNNs) have proven to be extremely accurate for image recognition, even outperforming human recognition capability. When deployed on battery-powered mobile devices, efficient computer architectures are required…

Hardware Architecture · Computer Science 2020-10-05 Mehdi Ahmadi, Shervin Vakili, J. M. Pierre Langlois

Deep Tensor Convolution on Multicores

Deep convolutional neural networks (ConvNets) of 3-dimensional kernels allow joint modeling of spatiotemporal features. These networks have improved performance of video and volumetric image analysis, but have been limited in size due to…

Computer Vision and Pattern Recognition · Computer Science 2017-06-13 David Budden, Alexander Matveev, Shibani Santurkar, Shraman Ray Chaudhuri, Nir Shavit

When deep learning models on GPU can be accelerated by taking advantage of unstructured sparsity

This paper is focused on improving the efficiency of sparse convolutional neural network (CNN) layers on graphics processing units (GPUs). The Nvidia deep neural network (cuDNN) library provides the most effective implementation…

Machine Learning · Computer Science 2022-01-03 Marcin Pietroń, Dominik Żurek

Building Efficient CNNs Using Depthwise Convolutional Eigen-Filters (DeCEF)

Deep Convolutional Neural Networks (CNNs) have been widely used in various domains due to their impressive capabilities. These models are typically composed of a large number of 2D convolutional (Conv2D) layers with numerous trainable…

Machine Learning · Computer Science 2022-02-01 Yinan Yu, Samuel Scheidegger, Tomas McKelvey