Related papers: Efficient Dynamic Structured Sparse Training with …

200 papers

Dynamic Sparse Training (DST) methods achieve state-of-the-art results in sparse neural network training, matching the generalization of dense models while enabling sparse training and inference. Although the resulting models are highly…

Machine Learning · Computer Science · 2024-02-23 · Mike Lasby, Anna Golubeva, Utku Evci, Mihai Nica, Yani Ioannou

Recent advances in Dynamic Sparse Training (DST) have pushed the frontier of sparse neural network training in structured and unstructured contexts, matching dense-model performance while drastically reducing parameter counts to facilitate…

Machine Learning · Computer Science · 2025-06-16 · Abhishek Tyagi, Arjun Iyer, William H Renninger, Christopher Kanan, Yuhao Zhu

Sparse training has received an upsurging interest in machine learning due to its tantalizing saving potential for the entire training process as well as inference. Dynamic sparse training (DST), as a leading sparse training approach, can…

Machine Learning · Computer Science · 2023-11-13 · Lu Yin, Gen Li, Meng Fang, Li Shen, Tianjin Huang, Zhangyang Wang, Vlado Menkovski, Xiaolong Ma, Mykola Pechenizkiy, Shiwei Liu

In recent years, Dynamic Sparse Training (DST) has emerged as an alternative to post-training pruning for generating efficient models. In principle, DST allows for a more memory efficient training process, as it maintains sparsity…

Machine Learning · Computer Science · 2025-02-11 · Nasib Ullah, Erik Schultheis, Mike Lasby, Yani Ioannou, Rohit Babbar

Exploiting sparsity enables hardware systems to run neural networks faster and more energy-efficiently. However, most prior sparsity-centric optimization techniques only accelerate the forward pass of neural networks and usually require an…

Machine Learning · Computer Science · 2018-06-05 · Maohua Zhu, Jason Clemons, Jeff Pool, Minsoo Rhu, Stephen W. Keckler, Yuan Xie

Structured sparsity has emerged as a popular model pruning technique, widely adopted in various architectures, including CNNs, Transformer models, and especially large language models (LLMs) in recent years. A promising direction to further…

Machine Learning · Computer Science · 2026-02-02 · Zekai Li, Ji Liu, Guanchen Li, Yixing Xu, Ziqiong Liu, Xuanwu Yin, Dong Li, Emad Barsoum

The demand for efficient processing of deep neural networks (DNNs) on embedded devices is a significant challenge limiting their deployment. Exploiting sparsity in the network's feature maps is one of the ways to reduce its inference…

Computer Vision and Pattern Recognition · Computer Science · 2023-09-28 · Matteo Grimaldi, Darshan C. Ganji, Ivan Lazarevich, Sudhakar Sah

Unstructured pruning reduces the memory footprint in deep neural networks (DNNs). Recently, researchers proposed different types of structural pruning intending to reduce also the computation complexity. In this work, we first suggest a new…

Artificial Intelligence · Computer Science · 2021-10-22 · Itay Hubara, Brian Chmiel, Moshe Island, Ron Banner, Seffi Naor, Daniel Soudry

Sparsity in Deep Neural Networks (DNNs) has been widely studied to compress and accelerate the models on resource-constrained environments. It can be generally categorized into unstructured fine-grained sparsity that zeroes out multiple…

Computer Vision and Pattern Recognition · Computer Science · 2021-04-20 · Aojun Zhou, Yukun Ma, Junnan Zhu, Jianbo Liu, Zhijie Zhang, Kun Yuan, Wenxiu Sun, Hongsheng Li

The Transformer has been an indispensable staple in deep learning. However, for real-life applications, it is very challenging to deploy efficient Transformers due to immense parameters and operations of models. To relieve this burden,…

Hardware Architecture · Computer Science · 2022-11-01 · Chao Fang, Aojun Zhou, Zhongfeng Wang

We demonstrate the possibility of what we call sparse learning: accelerated training of deep neural networks that maintain sparse weights throughout training while achieving dense performance levels. We accomplish this by developing sparse…

Machine Learning · Computer Science · 2019-08-27 · Tim Dettmers, Luke Zettlemoyer

In recent years, there has been a flurry of research in deep neural network pruning and compression. Early approaches prune weights individually. However, it is difficult to take advantage of the resulting unstructured sparsity patterns on…

Machine Learning · Computer Science · 2020-08-28 · Ziheng Wang

Existing deep neural networks (DNNs) that achieve state-of-the-art (SOTA) performance on both clean and adversarially-perturbed images rely on either activation or weight conditioned convolution operations. However, such conditional…

Computer Vision and Pattern Recognition · Computer Science · 2023-02-08 · Souvik Kundu, Sairam Sundaresan, Sharath Nittur Sridhar, Shunlin Lu, Han Tang, Peter A. Beerel

We study the benefits of different sparse architectures for deep reinforcement learning. In particular, we focus on image-based domains where spatially-biased and fully-connected architectures are common. Using these and several other…

Machine Learning · Computer Science · 2025-02-04 · Fatima Davelouis, John D. Martin, Michael Bowling

Recent research has focused on weight sparsity in deep neural network training to reduce FLOPs, aiming for improved efficiency (test accuracy w.r.t training FLOPs). However, sparse weight training often compromises accuracy, requiring…

Machine Learning · Computer Science · 2024-07-19 · Vithursan Thangarasa, Shreyas Saxena, Abhay Gupta, Sean Lie

Dynamic sparsity, where the sparsity patterns are unknown until runtime, poses a significant challenge to deep learning. The state-of-the-art sparsity-aware deep learning solutions are restricted to pre-defined, static sparsity patterns due…

Sparsity-aware training is an effective approach for transforming large language models (LLMs) into hardware-friendly sparse patterns, thereby reducing latency and memory consumption during inference. In this paper, we propose Continuous…

Machine Learning · Computer Science · 2025-10-01 · Weiyu Huang, Yuezhou Hu, Jun Zhu, Jianfei Chen

Large language models (LLMs) have made significant strides in complex tasks, yet their widespread adoption is impeded by substantial computational demands. With hundreds of billion parameters, transformer-based LLMs necessitate months of…

Machine Learning · Computer Science · 2024-08-22 · Pihe Hu, Shaolong Li, Longbo Huang

Overparameterized neural networks generalize well but are expensive to train. Ideally, one would like to reduce their computational cost while retaining their generalization benefits. Sparse model training is a simple and promising approach…

Machine Learning · Computer Science · 2022-05-12 · Tri Dao, Beidi Chen, Kaizhao Liang, Jiaming Yang, Zhao Song, Atri Rudra, Christopher Ré

Training deep neural networks (DNNs) is costly. Fortunately, Nvidia Ampere and Hopper GPUs can accelerate matrix multiplications twice as fast as a dense equivalent by implementing 2:4 sparsity. However, previous STE-based 2:4 pre-training…

Machine Learning · Computer Science · 2024-12-30 · Yuezhou Hu, Jun Zhu, Jianfei Chen