
Related papers: At-Scale Sparse Deep Neural Network Inference with…

200 papers

Parallel training of neural networks at scale is challenging due to significant overheads arising from communication. Recently, deep learning researchers have developed a variety of pruning algorithms that are capable of pruning (i.e.…

Machine Learning · Computer Science 2023-05-16 Siddharth Singh, Abhinav Bhatele

Scientific workloads have traditionally exploited high levels of sparsity to accelerate computation and reduce memory requirements. While deep neural networks can be made sparse, achieving practical speedups on GPUs is difficult because…

Machine Learning · Computer Science 2020-09-02 Trevor Gale, Matei Zaharia, Cliff Young, Erich Elsen

Graph neural networks (GNNs), an emerging deep learning model class, can extract meaningful representations from highly expressive graph-structured data and are therefore gaining popularity for wider ranges of applications. However, current…

Machine Learning · Computer Science 2021-04-27 Chien-Yu Lin, Liang Luo, Luis Ceze

Recurrent Neural Networks (RNNs) are powerful tools for solving sequence-based problems, but their efficacy and execution time are dependent on the size of the network. Following recent work in simplifying these networks with model pruning…

Neural and Evolutionary Computing · Computer Science 2018-04-30 Feiwen Zhu, Jeff Pool, Michael Andersch, Jeremy Appleyard, Fung Xie

The computational demands of modern Deep Neural Networks (DNNs) are immense and constantly growing. While training costs usually capture public attention, inference demands are also contributing in significant computational, energy and…

In trained deep neural networks, unstructured pruning can reduce redundant weights to lower storage cost. However, it requires customized hardware to speed up practical inference. Another trend accelerates sparse model inference…

Computer Vision and Pattern Recognition · Computer Science 2020-10-30 Zhuliang Yao, Shijie Cao, Wencong Xiao, Chen Zhang, Lanshun Nie

A contemporary Deep Neural Network (DNN) contains millions of synaptic connections with tens to hundreds of layers. The large computation and memory requirements pose a challenge to the hardware design. In this work, we leverage the intrinsic…

Machine Learning · Computer Science 2017-11-07 Jingyang Zhu, Jingbo Jiang, Xizi Chen, Chi-Ying Tsui

Sparse matrix-vector and matrix-matrix multiplication (SpMV and SpMM) are fundamental in both conventional (graph analytics, scientific computing) and emerging (sparse DNN, GNN) domains. Workload-balancing and parallel-reduction are…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-10-15 Guyue Huang, Guohao Dai, Yu Wang, Yufei Ding, Yuan Xie

Graph Convolutional Networks (GCNs) are recently getting much attention in bioinformatics and chemoinformatics as a state-of-the-art machine learning approach with high accuracy. GCNs process convolutional operations along with graph…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-03-28 Yusuke Nagasaka, Akira Nukada, Ryosuke Kojima, Satoshi Matsuoka

State-of-the-art deep neural networks (DNNs) have significant computational and data management requirements. The size of both training data and models continues to increase. Sparsification and pruning methods are shown to be effective…

Machine Learning · Computer Science 2021-04-27 Gunduz Vehbi Demirci, Hakan Ferhatosmanoglu

We implement two novel algorithms for sparse-matrix dense-matrix multiplication (SpMM) on the GPU. Our algorithms expect the sparse input in the popular compressed-sparse-row (CSR) format and thus do not require expensive format conversion.…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-06-13 Carl Yang, Aydin Buluc, John D. Owens

The MIT/IEEE/Amazon GraphChallenge.org encourages community approaches to developing new solutions for analyzing graphs and sparse data. Sparse AI analytics present unique scalability difficulties. The proposed Sparse Deep Neural Network…

Computer Vision and Pattern Recognition · Computer Science 2019-12-03 Jeremy Kepner, Simon Alford, Vijay Gadepally, Michael Jones, Lauren Milechin, Ryan Robinett, Sid Samsi

The last few years have seen gigantic leaps in algorithms and systems to support efficient deep learning inference. Pruning and quantization algorithms can now consistently compress neural networks by an order of magnitude. For a compressed…

Machine Learning · Computer Science 2021-07-22 Ziheng Wang

Graph Neural Networks (GNNs) have achieved significant improvements in various domains. Sparse Matrix-Matrix multiplication (SpMM) is a fundamental operator in GNNs, which performs a multiplication between a sparse matrix and a dense…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-07-08 Guyue Huang, Guohao Dai, Yu Wang, Huazhong Yang

We propose a reconfigurable hardware architecture for deep neural networks (DNNs) capable of online training and inference, which uses algorithmically pre-determined, structured sparsity to significantly lower memory and computational…

Neural and Evolutionary Computing · Computer Science 2017-11-07 Sourya Dey, Yinan Shao, Keith M. Chugg, Peter A. Beerel

The MIT/IEEE/Amazon GraphChallenge.org encourages community approaches to developing new solutions for analyzing graphs and sparse data. Sparse AI analytics present unique scalability difficulties. The Sparse Deep Neural Network (DNN)…

Machine Learning · Computer Science 2020-12-24 Jeremy Kepner, Simon Alford, Vijay Gadepally, Michael Jones, Lauren Milechin, Albert Reuther, Ryan Robinett, Sid Samsi

Sparse deep neural networks (DNNs) are efficient in both memory and compute when compared to dense DNNs. But due to irregularity in computation of sparse DNNs, their efficiencies are much lower than that of dense DNNs on regular parallel…

Machine Learning · Computer Science 2018-12-31 Dharma Teja Vooturi, Dheevatsa Mudigere, Sasikanth Avancha

Personalized recommendation is a ubiquitous application on the internet, with many industries and hyperscalers extensively leveraging Deep Learning Recommendation Models (DLRMs) for their personalization needs (like ad serving or movie…

Hardware Architecture · Computer Science 2024-10-30 Rishabh Jain, Vivek M. Bhasi, Adwait Jog, Anand Sivasubramaniam, Mahmut T. Kandemir, Chita R. Das

Fueled by the ability to mine real-world graph data, GNN applications have experienced phenomenal growth. Sparse Matrix-Matrix Multiplication (SpMM) is a critical operator in GNNs. However, existing SpMM designs for GNNs struggle to adapt…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-18 Lixing Zhang, Guanhua Ye, Hongzheng Li, Shigang Li, Yingxia Shao

In this paper, we use graphics processing units (GPU) to accelerate sparse and arbitrary structured neural networks. Sparse networks have nodes in the network that are not fully connected with nodes in preceding and following layers, and…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-05-12 Aavaas Gajurel, Sushil J. Louis, Frederick C Harris