English
Related papers

Related papers: maxDNN: An Efficient Convolution Kernel for Deep L…

200 papers

The convolution computation is widely used in many fields, especially in CNNs. Because of the rapid growth of the training data in CNNs, GPUs have been used for the acceleration, and memory-efficient algorithms are focused because of thier…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-12-02 Qiong Chang , Masaki Onishi , Tsutomu Maruyama

Convolutions are the core operation of deep learning applications based on Convolutional Neural Networks (CNNs). Current GPU architectures are highly efficient for training and deploying deep CNNs, and hence, these are largely used in…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-10-28 Marc Jordà , Pedro Valero-Lara , Antonio J. Peña

In recent years deep learning algorithms have shown extremely high performance on machine learning tasks such as image classification and speech recognition. In support of such applications, various FPGA accelerator architectures have been…

Machine Learning · Computer Science 2017-05-09 Xinyu Zhang , Srinjoy Das , Ojash Neopane , Ken Kreutz-Delgado

Deep Learning (DL) applications are gaining momentum in the realm of Artificial Intelligence, particularly after GPUs have demonstrated remarkable skills for accelerating their challenging computational requirements. Within this context,…

Computer Vision and Pattern Recognition · Computer Science 2018-08-02 Francisco M. Castro , Nicolás Guil , Manuel J. Marín-Jiménez , Jesús Pérez-Serrano , Manuel Ujaldón

Leveraging large data sets, deep Convolutional Neural Networks (CNNs) achieve state-of-the-art recognition accuracy. Due to the substantial compute and memory operations, however, they require significant execution time. The massive…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-10-13 Chao Li , Yi Yang , Min Feng , Srimat Chakradhar , Huiyang Zhou

Convolution is a fundamental operation in many applications, such as computer vision, natural language processing, image processing, etc. Recent successes of convolutional neural networks in various deep learning applications put even…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-05-31 Xiaoming Chen , Jianxu Chen , Danny Z. Chen , Xiaobo Sharon Hu

This paper is focused on the improvement the efficiency of the sparse convolutional neural networks (CNNs) layers on graphic processing units (GPU). The Nvidia deep neural network (cuDnn) library provides the most effective implementation…

Machine Learning · Computer Science 2022-01-03 Marcin Pietroń , Dominik Żurek

NVIDIA cuDNN is a low-level library that provides GPU kernels frequently used in deep learning. Specifically, cuDNN implements several equivalent convolution algorithms, whose performance and memory footprint may vary considerably,…

Machine Learning · Computer Science 2018-04-16 Yosuke Oyama , Tal Ben-Nun , Torsten Hoefler , Satoshi Matsuoka

Deep Convolutional Neural Networks (CNNs) are widely employed in modern computer vision algorithms, where the input image is convolved iteratively by many kernels to extract the knowledge behind it. However, with the depth of convolutional…

Computer Vision and Pattern Recognition · Computer Science 2018-04-11 Chih-Ting Liu , Yi-Heng Wu , Yu-Sheng Lin , Shao-Yi Chien

Deep convolutional neural networks (ConvNets) of 3-dimensional kernels allow joint modeling of spatiotemporal features. These networks have improved performance of video and volumetric image analysis, but have been limited in size due to…

Computer Vision and Pattern Recognition · Computer Science 2017-06-13 David Budden , Alexander Matveev , Shibani Santurkar , Shraman Ray Chaudhuri , Nir Shavit

In the acceleration of deep neural network training, the GPU has become the mainstream platform. GPUs face substantial challenges on GNNs, such as workload imbalance and memory access irregularities, leading to underutilized hardware.…

Machine Learning · Computer Science 2024-03-20 Hongwu Peng , Xi Xie , Kaustubh Shivdikar , MD Amit Hasan , Jiahui Zhao , Shaoyi Huang , Omer Khan , David Kaeli , Caiwen Ding

Sliding window convolutional networks (ConvNets) have become a popular approach to computer vision problems such as image segmentation, and object detection and localization. Here we consider the problem of inference, the application of a…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-06-21 Aleksandar Zlateski , Kisuk Lee , H. Sebastian Seung

Convolutional neural networks (CNNs) have been widely employed in many applications such as image classification, video analysis and speech recognition. Being compute-intensive, CNN computations are mainly accelerated by GPUs with high…

Hardware Architecture · Computer Science 2016-11-09 Dong Wang , Jianjing An , Ke Xu

Convolution layers are prevalent in many classes of deep neural networks, including Convolutional Neural Networks (CNNs) which provide state-of-the-art results for tasks like image recognition, neural machine translation and speech…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-08-22 Evangelos Georganas , Sasikanth Avancha , Kunal Banerjee , Dhiraj Kalamkar , Greg Henry , Hans Pabst , Alexander Heinecke

Convolutional neural network (CNN) is an important deep learning method. The convolution operation takes a large proportion of the total execution time for CNN. Feature maps for convolution operation are usually sparse. Multiplications and…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-08-01 Weizhi Xu , Yintai Sun , fhengyu Fan , Hui Yu , Xin Fu

Convolutional Neural Networks are extensively used in a wide range of applications, commonly including computer vision tasks like image and video classification, recognition, and segmentation. Recent research results demonstrate that…

Signal Processing · Electrical Eng. & Systems 2020-05-11 Marco Carreras , Gianfranco Deriu , Luigi Raffo , Luca Benini , Paolo Meloni

Hardware accelerations of deep learning systems have been extensively investigated in industry and academia. The aim of this paper is to achieve ultra-high energy efficiency and performance for hardware implementations of deep neural…

Machine Learning · Computer Science 2018-02-20 Yanzhi Wang , Caiwen Ding , Zhe Li , Geng Yuan , Siyu Liao , Xiaolong Ma , Bo Yuan , Xuehai Qian , Jian Tang , Qinru Qiu , Xue Lin

Depth is a vital piece of information for autonomous vehicles to perceive obstacles. Due to the relatively low price and small size of monocular cameras, depth estimation from a single RGB image has attracted great interest in the research…

Robotics · Computer Science 2021-11-25 Xingshuai Dong , Matthew A. Garratt , Sreenatha G. Anavatti , Hussein A. Abbass

Time, cost, and energy efficiency are critical considerations in Deep-Learning (DL), particularly when processing long texts. Transformers, which represent the current state of the art, exhibit quadratic computational complexity relative to…

Computation and Language · Computer Science 2025-07-11 Fardin Rastakhiz

Three-dimensional deconvolution is widely used in many computer vision applications. However, most previous works have only focused on accelerating 2D deconvolutional neural networks (DCNNs) on FPGAs, while the acceleration of 3D DCNNs has…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-03-08 Deguang Wang , Junzhong Shen , Mei Wen , Chunyuan Zhang
‹ Prev 1 2 3 10 Next ›