English
Related papers

Related papers: GrateTile: Efficient Sparse Tensor Tiling for CNN …

200 papers

Spectral-domain CNNs have been shown to be more efficient than traditional spatial CNNs in terms of reducing computation complexity. However they come with a `kernel explosion' problem that, even after compression (pruning), imposes a high…

Hardware Architecture · Computer Science 2023-10-18 Yue Niu , Rajgopal Kannan , Ajitesh Srivastava , Viktor Prasanna

Network pruning can reduce the high computation cost of deep neural network (DNN) models. However, to maintain their accuracies, sparse models often carry randomly-distributed weights, leading to irregular computations. Consequently, sparse…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-09-01 Cong Guo , Bo Yang Hsueh , Jingwen Leng , Yuxian Qiu , Yue Guan , Zehuan Wang , Xiaoying Jia , Xipeng Li , Minyi Guo , Yuhao Zhu

Network pruning can reduce the computation cost of deep neural network (DNN) models. However, sparse models often produce randomly-distributed weights to maintain accuracy, leading to irregular computations. Consequently, unstructured…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-02-19 Cong Guo , Fengchen Xue , Jingwen Leng , Yuxian Qiu , Yue Guan , Weihao Cui , Quan Chen , Minyi Guo

To accelerate deep CNN models, this paper proposes a novel spatially adaptive framework that can dynamically generate pixel-wise sparsity according to the input image. The sparse scheme is pixel-wise refined, regional adaptive under a…

Computer Vision and Pattern Recognition · Computer Science 2021-03-23 Chen Tang , Wenyu Sun , Zhuqing Yuan , Yongpan Liu

Conventional deep convolutional neural networks (CNNs) apply convolution operators uniformly in space across all feature maps for hundreds of layers - this incurs a high computational cost for real-time applications. For many problems such…

Computer Vision and Pattern Recognition · Computer Science 2018-06-08 Mengye Ren , Andrei Pokrovsky , Bin Yang , Raquel Urtasun

Graph convolutional networks (GCNs) are becoming increasingly popular as they can process a wide variety of data formats that prior deep neural networks cannot easily support. One key challenge in designing hardware accelerators for GCNs is…

Machine Learning · Computer Science 2023-01-25 Mingi Yoo , Jaeyong Song , Hyeyoon Lee , Jounghoo Lee , Namhyung Kim , Youngsok Kim , Jinho Lee

To achieve higher accuracy in machine learning tasks, very deep convolutional neural networks (CNNs) are designed recently. However, the large memory access of deep CNNs will lead to high power consumption. A variety of hardware-friendly…

Image and Video Processing · Electrical Eng. & Systems 2021-06-25 Yubo Shi , Meiqi Wang , Siyi Chen , Jinghe Wei , Zhongfeng Wang

Sparsity is an intrinsic property of convolutional neural network(CNN) and worth exploiting for CNN accelerators, but extra processing comes with hardware overhead, causing many architectures suffering from only minor profit. Meanwhile,…

Hardware Architecture · Computer Science 2022-09-26 Wenhao Sun , Deng Liu , Zhiwei Zou , Wendi Sun , Yi Kang , Song Chen

We present both a novel Convolutional Neural Network (CNN) accelerator architecture and a network compiler for FPGAs that outperforms all prior work. Instead of having generic processing elements that together process one layer at a time,…

Hardware Architecture · Computer Science 2020-07-22 Mathew Hall , Vaughn Betz

Training Convolutional Neural Networks (CNNs) usually requires a large number of computational resources. In this paper, \textit{SparseTrain} is proposed to accelerate CNN training by fully exploiting the sparsity. It mainly involves three…

Computer Vision and Pattern Recognition · Computer Science 2020-07-28 Pengcheng Dai , Jianlei Yang , Xucheng Ye , Xingzhou Cheng , Junyu Luo , Linghao Song , Yiran Chen , Weisheng Zhao

Compressing convolutional neural networks (CNNs) has received ever-increasing research focus. However, most existing CNN compression methods do not interpret their inherent structures to distinguish the implicit redundancy. In this paper,…

Computer Vision and Pattern Recognition · Computer Science 2019-04-02 Yuchao Li , Shaohui Lin , Baochang Zhang , Jianzhuang Liu , David Doermann , Yongjian Wu , Feiyue Huang , Rongrong Ji

Deep Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance in a wide range of applications. However, deeper CNN models, which are usually computation consuming, are widely required for complex Artificial…

Systems and Control · Electrical Eng. & Systems 2020-01-08 Chaoyang Zhu , Kejie Huang , Shuyuan Yang , Ziqi Zhu , Hejia Zhang , Haibin Shen

Existing deep convolutional neural networks (CNNs) generate massive interlayer feature data during network inference. To maintain real-time processing in embedded systems, large on-chip memory is required to buffer the interlayer feature…

Hardware Architecture · Computer Science 2021-10-13 Zhuang Shao , Xiaoliang Chen , Li Du , Lei Chen , Yuan Du , Wei Zhuang , Huadong Wei , Chenjia Xie , Zhongfeng Wang

CNNs outperform traditional machine learning algorithms across a wide range of applications. However, their computational complexity makes it necessary to design efficient hardware accelerators. Most CNN accelerators focus on exploring…

Hardware Architecture · Computer Science 2020-06-25 Ye Yu , Niraj K. Jha

Inference of standard convolutional neural networks (CNNs) on FPGAs often incurs high latency and a long initiation interval due to the deep nested loops required to densely convolve every input pixel regardless of its feature value.…

Hardware Architecture · Computer Science 2025-12-16 Ho Fung Tsoi , Dylan Rankin , Vladimir Loncar , Philip Harris

Convolutional neural network (CNN) inference on mobile devices demands efficient hardware acceleration of low-precision (INT8) general matrix multiplication (GEMM). Exploiting data sparsity is a common approach to further accelerate GEMM…

Hardware Architecture · Computer Science 2020-10-14 Zhi-Gang Liu , Paul N. Whatmough , Matthew Mattina

Phenomenally successful in practical inference problems, convolutional neural networks (CNN) are widely deployed in mobile devices, data centers, and even supercomputers. The number of parameters needed in CNNs, however, are often large and…

Computer Vision and Pattern Recognition · Computer Science 2017-08-01 Jongsoo Park , Sheng Li , Wei Wen , Ping Tak Peter Tang , Hai Li , Yiran Chen , Pradeep Dubey

Convolutional Neural Networks (CNNs) have emerged as a fundamental technology for machine learning. High performance and extreme energy efficiency are critical for deployments of CNNs in a wide range of situations, especially mobile…

Streaming coarse-grained reconfgurable array (CGRA) is a promising architecture for data/computing-intensive applications because of its fexibility, high throughput and efcient memory system. However,when accelerating sparse CNNs, the…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-12-17 Xiaobing Ni , Mengke Ge , Jiaheng Ruan , Song Chen , Yi Kang

The design complexity of CNNs has been steadily increasing to improve accuracy. To cope with the massive amount of computation needed for such complex CNNs, the latest solutions utilize blocking of an image over the available dimensions and…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-06-19 Daejin Jung , Sunjung Lee , Wonjong Rhee , Jung Ho Ahn
‹ Prev 1 2 3 10 Next ›