Related papers: Predefined Sparseness in Recurrent Sequence Models

Compressing Neural Language Models by Sparse Word Representations

Neural networks are among the state-of-the-art techniques for language modeling. Existing neural language models typically map discrete words to distributed, dense vector representations. After information processing of the preceding…

Computation and Language · Computer Science 2016-10-14 Yunchuan Chen , Lili Mou , Yan Xu , Ge Li , Zhi Jin

Unsupervised Pretraining Encourages Moderate-Sparseness

It is well known that direct training of deep neural networks will generally lead to poor results. A major progress in recent years is the invention of various pretraining methods to initialize network parameters and it was shown that such…

Machine Learning · Computer Science 2014-06-10 Jun Li , Wei Luo , Jian Yang , Xiaotong Yuan

Exploring Sparsity in Recurrent Neural Networks

Recurrent Neural Networks (RNN) are widely used to solve a variety of problems and as the quantity of data and the amount of available compute have increased, so have model sizes. The number of parameters in recent state-of-the-art networks…

Machine Learning · Computer Science 2017-11-08 Sharan Narang , Erich Elsen , Gregory Diamos , Shubho Sengupta

Experiments on Properties of Hidden Structures of Sparse Neural Networks

Sparsity in the structure of Neural Networks can lead to less energy consumption, less memory usage, faster computation times on convenient hardware, and automated machine learning. If sparsity gives rise to certain kinds of structure, it…

Machine Learning · Computer Science 2021-07-28 Julian Stier , Harshil Darji , Michael Granitzer

Layer Sparsity in Neural Networks

Sparsity has become popular in machine learning, because it can save computational resources, facilitate interpretations, and prevent overfitting. In this paper, we discuss sparsity in the framework of neural networks. In particular, we…

Machine Learning · Computer Science 2020-06-30 Mohamed Hebiri , Johannes Lederer

Accurate Neural Network Pruning Requires Rethinking Sparse Optimization

Obtaining versions of deep neural networks that are both highly-accurate and highly-sparse is one of the main challenges in the area of model compression, and several high-performance pruning techniques have been investigated by the…

Machine Learning · Computer Science 2023-09-11 Denis Kuznedelev , Eldar Kurtic , Eugenia Iofinova , Elias Frantar , Alexandra Peste , Dan Alistarh

Sparse Deep Learning Models with the $\ell_1$ Regularization

Sparse neural networks are highly desirable in deep learning in reducing its complexity. The goal of this paper is to study how choices of regularization parameters influence the sparsity level of learned neural networks. We first derive…

Machine Learning · Computer Science 2024-08-07 Lixin Shen , Rui Wang , Yuesheng Xu , Mingsong Yan

Sparsity Induction for Accurate Post-Training Pruning of Large Language Models

Large language models have demonstrated capabilities in text generation, while their increasing parameter scales present challenges in computational and memory efficiency. Post-training sparsity (PTS), which reduces model cost by removing…

Computation and Language · Computer Science 2026-02-26 Minhao Jiang , Zhikai Li , Xuewen Liu , Jing Zhang , Mengjuan Chen , Qingyi Gu

Learning Sparse Latent Representations for Generator Model

Sparsity is a desirable attribute. It can lead to more efficient and more effective representations compared to the dense model. Meanwhile, learning sparse latent representations has been a challenging problem in the field of computer…

Computer Vision and Pattern Recognition · Computer Science 2022-09-22 Hanao Li , Tian Han

Sparsity Emerges Naturally in Neural Language Models

Concerns about interpretability, computational resources, and principled inductive priors have motivated efforts to engineer sparse neural models for NLP tasks. If sparsity is important for NLP, might well-trained neural models naturally…

Computation and Language · Computer Science 2019-08-07 Naomi Saphra , Adam Lopez

NeuroPrune: A Neuro-inspired Topological Sparse Training Algorithm for Large Language Models

Transformer-based Language Models have become ubiquitous in Natural Language Processing (NLP) due to their impressive performance on various tasks. However, expensive training as well as inference remains a significant impediment to their…

Machine Learning · Computer Science 2024-06-06 Amit Dhurandhar , Tejaswini Pedapati , Ronny Luss , Soham Dan , Aurelie Lozano , Payel Das , Georgios Kollias

Sparse Learning for Variable Selection with Structures and Nonlinearities

In this thesis we discuss machine learning methods performing automated variable selection for learning sparse predictive models. There are multiple reasons for promoting sparsity in the predictive models. By relying on a limited set of…

Machine Learning · Computer Science 2019-03-27 Magda Gregorova

Block-Sparse Recurrent Neural Networks

Recurrent Neural Networks (RNNs) are used in state-of-the-art models in domains such as speech recognition, machine translation, and language modelling. Sparsity is a technique to reduce compute and memory requirements of deep learning…

Machine Learning · Computer Science 2017-11-09 Sharan Narang , Eric Undersander , Gregory Diamos

Meta-Learning Sparse Implicit Neural Representations

Implicit neural representations are a promising new avenue of representing general signals by learning a continuous function that, parameterized as a neural network, maps the domain of a signal to its codomain; the mapping from spatial…

Machine Learning · Computer Science 2021-11-09 Jaeho Lee , Jihoon Tack , Namhoon Lee , Jinwoo Shin

Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory

The effectiveness of recurrent neural networks can be largely influenced by their ability to store into their dynamical memory information extracted from input sequences at different frequencies and timescales. Such a feature can be…

Machine Learning · Computer Science 2020-07-01 Antonio Carta , Alessandro Sperduti , Davide Bacciu

Training Neural Networks with Fixed Sparse Masks

During typical gradient-based training of deep neural networks, all of the model's parameters are updated at each iteration. Recent work has shown that it is possible to update only a small subset of the model's parameters during training,…

Machine Learning · Computer Science 2021-11-19 Yi-Lin Sung , Varun Nair , Colin Raffel

Top-KAST: Top-K Always Sparse Training

Sparse neural networks are becoming increasingly important as the field seeks to improve the performance of existing models by scaling them up, while simultaneously trying to reduce power consumption and computational footprint.…

Machine Learning · Computer Science 2021-06-08 Siddhant M. Jayakumar , Razvan Pascanu , Jack W. Rae , Simon Osindero , Erich Elsen

On Implicit Filter Level Sparsity in Convolutional Neural Networks

We investigate filter level sparsity that emerges in convolutional neural networks (CNNs) which employ Batch Normalization and ReLU activation, and are trained with adaptive gradient descent techniques and L2 regularization or weight decay.…

Machine Learning · Computer Science 2019-04-08 Dushyant Mehta , Kwang In Kim , Christian Theobalt

Selfish Sparse RNN Training

Sparse neural networks have been widely applied to reduce the computational demands of training and deploying over-parameterized deep neural networks. For inference acceleration, methods that discover a sparse network from a pre-trained…

Machine Learning · Computer Science 2021-06-16 Shiwei Liu , Decebal Constantin Mocanu , Yulong Pei , Mykola Pechenizkiy

EduBERT: Pretrained Deep Language Models for Learning Analytics

The use of large pretrained neural networks to create contextualized word embeddings has drastically improved performance on several natural language processing (NLP) tasks. These computationally expensive models have begun to be applied to…

Computers and Society · Computer Science 2019-12-03 Benjamin Clavié , Kobi Gal