Related papers: Dynamic Sparse Training with Structured Sparsity

Dynamic Sparse Training of Diagonally Sparse Networks

Recent advances in Dynamic Sparse Training (DST) have pushed the frontier of sparse neural network training in structured and unstructured contexts, matching dense-model performance while drastically reducing parameter counts to facilitate…

Machine Learning · Computer Science 2025-06-16 Abhishek Tyagi, Arjun Iyer, William H Renninger, Christopher Kanan, Yuhao Zhu

Efficient Dynamic Structured Sparse Training with Learned Shuffles

Structured sparsity accelerates training and inference on modern GPUs, yet it still trails unstructured dynamic sparse training (DST) in accuracy. The shortfall stems from a loss of expressivity: whereas a dense layer can realize every…

Machine Learning · Computer Science 2025-10-17 Abhishek Tyagi, Arjun Iyer, Liam Young, William H Renninger, Christopher Kanan, Yuhao Zhu

Dynamic Sparsity Is Channel-Level Sparsity Learner

Sparse training has received an upsurging interest in machine learning due to its tantalizing saving potential for the entire training process as well as inference. Dynamic sparse training (DST), as a leading sparse training approach, can…

Machine Learning · Computer Science 2023-11-13 Lu Yin, Gen Li, Meng Fang, Li Shen, Tianjin Huang, Zhangyang Wang, Vlado Menkovski, Xiaolong Ma, Mykola Pechenizkiy, Shiwei Liu

Selfish Sparse RNN Training

Sparse neural networks have been widely applied to reduce the computational demands of training and deploying over-parameterized deep neural networks. For inference acceleration, methods that discover a sparse network from a pre-trained…

Machine Learning · Computer Science 2021-06-16 Shiwei Liu, Decebal Constantin Mocanu, Yulong Pei, Mykola Pechenizkiy

RNM-TD3: N:M Semi-structured Sparse Reinforcement Learning From Scratch

Sparsity is a well-studied technique for compressing deep neural networks (DNNs) without compromising performance. In deep reinforcement learning (DRL), neural networks with up to 5% of their original weights can still be trained with…

Machine Learning · Computer Science 2026-02-17 Isam Vrce, Andreas Kassler, Gökçe Aydos

Dynamic Sparse Training for Deep Reinforcement Learning

Deep reinforcement learning (DRL) agents are trained through trial-and-error interactions with the environment. This leads to a long training time for dense neural networks to achieve good performance. Hence, prohibitive computation and…

Machine Learning · Computer Science 2022-05-09 Ghada Sokar, Elena Mocanu, Decebal Constantin Mocanu, Mykola Pechenizkiy, Peter Stone

SparseOpt: Addressing Normalization-induced Gradient Skew in Sparse Training

Dynamic Sparse Training (DST) methods train neural networks by maintaining sparsity while dynamically adapting the network topology. Despite the promise of reduced computation, DST methods converge significantly slower than dense training,…

Machine Learning · Computer Science 2026-05-28 Mohammed Adnan, Rohan Jain, Tom Jacobs, Ekansh Sharma, Rahul G. Krishnan, Rebekka Burkholz, Yani Ioannou

Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch

Sparsity in Deep Neural Networks (DNNs) has been widely studied to compress and accelerate the models on resource-constrained environments. It can be generally categorized into unstructured fine-grained sparsity that zeroes out multiple…

Computer Vision and Pattern Recognition · Computer Science 2021-04-20 Aojun Zhou, Yukun Ma, Junnan Zhu, Jianbo Liu, Zhijie Zhang, Kun Yuan, Wenxiu Sun, Hongsheng Li

SparseTrain: Leveraging Dynamic Sparsity in Training DNNs on General-Purpose SIMD Processors

Our community has greatly improved the efficiency of deep learning applications, including by exploiting sparsity in inputs. Most of that work, though, is for inference, where weight sparsity is known statically, and/or for specialized…

Machine Learning · Computer Science 2020-12-04 Zhangxiaowen Gong, Houxiang Ji, Christopher Fletcher, Christopher Hughes, Josep Torrellas

Navigating Extremes: Dynamic Sparsity in Large Output Spaces

In recent years, Dynamic Sparse Training (DST) has emerged as an alternative to post-training pruning for generating efficient models. In principle, DST allows for a more memory efficient training process, as it maintains sparsity…

Machine Learning · Computer Science 2025-02-11 Nasib Ullah, Erik Schultheis, Mike Lasby, Yani Ioannou, Rohit Babbar

Sparse Training of Neural Networks based on Multilevel Mirror Descent

We introduce a dynamic sparse training algorithm based on linearized Bregman iterations / mirror descent that exploits the naturally incurred sparsity by alternating between periods of static and dynamic sparsity pattern updates. The key…

Machine Learning · Computer Science 2026-05-19 Yannick Lunk, Sebastian J. Scott, Leon Bungert

Structurally Sparsified Backward Propagation for Faster Long Short-Term Memory Training

Exploiting sparsity enables hardware systems to run neural networks faster and more energy-efficiently. However, most prior sparsity-centric optimization techniques only accelerate the forward pass of neural networks and usually require an…

Machine Learning · Computer Science 2018-06-05 Maohua Zhu, Jason Clemons, Jeff Pool, Minsoo Rhu, Stephen W. Keckler, Yuan Xie

Rigging the Lottery: Making All Tickets Winners

Many applications require sparse neural networks due to space or inference time restrictions. There is a large body of work on training dense networks to yield sparse networks for inference, but this limits the size of the largest trainable…

Machine Learning · Computer Science 2021-07-26 Utku Evci, Trevor Gale, Jacob Menick, Pablo Samuel Castro, Erich Elsen

RLx2: Training a Sparse Deep Reinforcement Learning Model from Scratch

Training deep reinforcement learning (DRL) models usually requires high computation costs. Therefore, compressing DRL models possesses immense potential for training acceleration and model deployment. However, existing methods that generate…

Machine Learning · Computer Science 2023-03-09 Yiqin Tan, Pihe Hu, Ling Pan, Jiatai Huang, Longbo Huang

Dynamic Sparse Training: Find Efficient Sparse Network From Scratch With Trainable Masked Layers

We present a novel network pruning algorithm called Dynamic Sparse Training that can jointly find the optimal network parameters and sparse network structure in a unified optimization process with trainable pruning thresholds. These…

Machine Learning · Computer Science 2020-05-15 Junjie Liu, Zhe Xu, Runbin Shi, Ray C. C. Cheung, Hayden K. H. So

Balance is Essence: Accelerating Sparse Training via Adaptive Gradient Correction

Despite impressive performance, deep neural networks require significant memory and computation costs, prohibiting their application in resource-constrained scenarios. Sparse training is one of the most common techniques to reduce these…

Machine Learning · Computer Science 2023-12-06 Bowen Lei, Dongkuan Xu, Ruqi Zhang, Shuren He, Bani K. Mallick

Rethinking the Role of Dynamic Sparse Training for Scalable Deep Reinforcement Learning

Scaling neural networks has driven breakthrough advances in machine learning, yet this paradigm fails in deep reinforcement learning (DRL), where larger models often degrade performance due to unique optimization pathologies such as…

Machine Learning · Computer Science 2025-10-15 Guozheng Ma, Lu Li, Zilin Wang, Haoyu Wang, Shengchao Hu, Leszek Rutkowski, Dacheng Tao

Pushing the Limits of Sparsity: A Bag of Tricks for Extreme Pruning

Pruning of deep neural networks has been an effective technique for reducing model size while preserving most of the performance of dense networks, crucial for deploying models on memory and power-constrained devices. While recent sparse…

Computer Vision and Pattern Recognition · Computer Science 2025-12-02 Andy Li, Aiden Durrant, Milan Markovic, Tianjin Huang, Souvik Kundu, Tianlong Chen, Lu Yin, Georgios Leontidis

Continual Learning with Dynamic Sparse Training: Exploring Algorithms for Effective Model Updates

Continual learning (CL) refers to the ability of an intelligent system to sequentially acquire and retain knowledge from a stream of data with as little computational overhead as possible. To this end; regularization, replay, architecture,…

Machine Learning · Computer Science 2023-12-05 Murat Onur Yildirim, Elif Ceren Gok Yildirim, Ghada Sokar, Decebal Constantin Mocanu, Joaquin Vanschoren

Sparse Networks from Scratch: Faster Training without Losing Performance

We demonstrate the possibility of what we call sparse learning: accelerated training of deep neural networks that maintain sparse weights throughout training while achieving dense performance levels. We accomplish this by developing sparse…

Machine Learning · Computer Science 2019-08-27 Tim Dettmers, Luke Zettlemoyer