Jeff Pool — Scifaro

MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models

Large Language Models (LLMs) are distinguished by their massive parameter counts, which typically result in significant redundancy. This work introduces MaskLLM, a learnable pruning method that establishes Semi-structured (or "N:M")…

Artificial Intelligence · Computer Science · 2024-12-10 · Gongfan Fang, Hongxu Yin, Saurav Muralidharan, Greg Heinrich, Jeff Pool, Jan Kautz, Pavlo Molchanov, Xinchao Wang

Self-Supervised GAN Compression

Deep learning's success has led to larger and larger models to handle more and more complex tasks; trained models can contain millions of parameters. These large models are compute- and memory-intensive, which makes it a challenge to deploy…

Machine Learning · Computer Science · 2023-05-19 · Chong Yu, Jeff Pool

Accelerating Sparse Deep Neural Networks

As neural network model sizes have dramatically increased, so has the interest in various techniques to reduce their parameter counts and accelerate their execution. An active area of research in this field is sparsity - encouraging zero…

Machine Learning · Computer Science · 2021-04-20 · Asit Mishra, Jorge Albericio Latorre, Jeff Pool, Darko Stosic, Dusan Stosic, Ganesh Venkatesh, Chong Yu, Paulius Micikevicius

Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs

GPUs offer orders-of-magnitude higher memory bandwidth than traditional CPU-only systems. However, GPU device memory tends to be relatively small, and the memory capacity cannot be increased by the user. This paper describes Buddy…

Hardware Architecture · Computer Science · 2019-04-17 · Esha Choukse, Michael Sullivan, Mike O'Connor, Mattan Erez, Jeff Pool, David Nellans, Steve Keckler

Structurally Sparsified Backward Propagation for Faster Long Short-Term Memory Training

Exploiting sparsity enables hardware systems to run neural networks faster and more energy-efficiently. However, most prior sparsity-centric optimization techniques only accelerate the forward pass of neural networks and usually require an…

Machine Learning · Computer Science · 2018-06-05 · Maohua Zhu, Jason Clemons, Jeff Pool, Minsoo Rhu, Stephen W. Keckler, Yuan Xie

Sparse Persistent RNNs: Squeezing Large Recurrent Networks On-Chip

Recurrent Neural Networks (RNNs) are powerful tools for solving sequence-based problems, but their efficacy and execution time are dependent on the size of the network. Following recent work in simplifying these networks with model pruning…

Neural and Evolutionary Computing · Computer Science · 2018-04-30 · Feiwen Zhu, Jeff Pool, Michael Andersch, Jeremy Appleyard, Fung Xie

Efficient Sparse-Winograd Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are computationally intensive, which limits their application on mobile devices. Their energy is dominated by the number of multiplies needed to perform the convolutions. Winograd's minimal filtering…

Computer Vision and Pattern Recognition · Computer Science · 2018-02-20 · Xingyu Liu, Jeff Pool, Song Han, William J. Dally

Exploring the Regularity of Sparse Structure in Convolutional Neural Networks

Sparsity helps reduce the computational complexity of deep neural networks by skipping zeros. Taking advantage of sparsity is listed as a high priority in next-generation DNN accelerators such as TPU. The structure of sparsity, i.e., the…

Machine Learning · Computer Science · 2017-06-06 · Huizi Mao, Song Han, Jeff Pool, Wenshuo Li, Xingyu Liu, Yu Wang, William J. Dally

Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks

Popular deep learning frameworks require users to fine-tune their memory usage so that the training data of a deep neural network (DNN) fits within the GPU physical memory. Prior work tries to address this restriction by virtualizing the…

Machine Learning · Computer Science · 2017-05-05 · Minsoo Rhu, Mike O'Connor, Niladrish Chatterjee, Jeff Pool, Stephen W. Keckler

DSD: Dense-Sparse-Dense Training for Deep Neural Networks

Modern deep neural networks have a large number of parameters, making them very hard to train. We propose DSD, a dense-sparse-dense training flow, for regularizing deep neural networks and achieving better optimization performance. In the…

Computer Vision and Pattern Recognition · Computer Science · 2017-02-23 · Song Han, Jeff Pool, Sharan Narang, Huizi Mao, Enhao Gong, Shijian Tang, Erich Elsen, Peter Vajda, Manohar Paluri, John Tran, Bryan Catanzaro, William J. Dally

Learning both Weights and Connections for Efficient Neural Networks

Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on embedded systems. Also, conventional networks fix the architecture before training starts; as a result, training cannot improve the…

Neural and Evolutionary Computing · Computer Science · 2015-11-03 · Song Han, Jeff Pool, John Tran, William J. Dally