Related papers: MicroNet for Efficient Language Modeling

200 papers

We propose an approximate strategy to efficiently train neural network based language models over very large vocabularies. Our approach, called adaptive softmax, circumvents the linear dependency on the vocabulary size by exploiting the…

Computation and Language · Computer Science · 2017-06-20 · Edouard Grave, Armand Joulin, Moustapha Cissé, David Grangier, Hervé Jégou

We introduce adaptive input representations for neural language modeling which extend the adaptive softmax of Grave et al. (2017) to input representations of variable capacity. There are several choices on how to factorize the input and…

Computation and Language · Computer Science · 2019-02-26 · Alexei Baevski, Michael Auli

We formulate language modeling as a matrix factorization problem, and show that the expressiveness of Softmax-based models (including the majority of neural language models) is limited by a Softmax bottleneck. Given that natural language is…

Computation and Language · Computer Science · 2018-03-06 · Zhilin Yang, Zihang Dai, Ruslan Salakhutdinov, William W. Cohen

Transformer-based language models have recently been at the forefront of active research in text generation. However, these models' advances come at the price of prohibitive training costs, with parameter counts in the billions and compute…

Computation and Language · Computer Science · 2025-02-04 · Gabriel Lindenmaier, Sean Papay, Sebastian Padó

For reasons such as privacy, there are use cases for language models at the edge. This has given rise to small language models targeted for deployment in resource-constrained devices where energy efficiency is critical. Spiking neural…

Neural and Evolutionary Computing · Computer Science · 2026-01-05 · Kaiwen Tang, Zhanglu Yan, Weng-Fai Wong

FullSubNet is our recently proposed real-time single-channel speech enhancement network that achieves outstanding performance on the Deep Noise Suppression (DNS) Challenge dataset. A number of variants of FullSubNet have been proposed, but…

Audio and Speech Processing · Electrical Eng. & Systems · 2023-03-08 · Xiang Hao, Xiaofei Li

The use of large pretrained neural networks to create contextualized word embeddings has drastically improved performance on several natural language processing (NLP) tasks. These computationally expensive models have begun to be applied to…

Computers and Society · Computer Science · 2019-12-03 · Benjamin Clavié, Kobi Gal

With the success of deep learning in various fields and the advent of numerous Internet of Things (IoT) devices, it is essential to lighten models suitable for low-power devices. In keeping with this trend, MicroNet Challenge, which is the…

Machine Learning · Computer Science · 2021-03-04 · Gihun Lee, Sangmin Bae, Jaehoon Oh, Se-Young Yun

Large language models typically employ vocabularies of over 100k tokens, which creates a major computational bottleneck at the final linear projection layer when performing speculative decoding. Current methods for vocabulary pruning depend…

Computation and Language · Computer Science · 2026-05-27 · Zhiyang Chen, Daliang Xu, Yinyuan Zhang, Chenghua Wang, Mengwei Xu, Yun Ma

Neural language models have been widely used in various NLP tasks, including machine translation, next word prediction and conversational agents. However, it is challenging to deploy these models on mobile devices due to their slow…

Machine Learning · Computer Science · 2018-10-31 · Patrick H. Chen, Si Si, Sanjiv Kumar, Yang Li, Cho-Jui Hsieh

Neural networks are among the state-of-the-art techniques for language modeling. Existing neural language models typically map discrete words to distributed, dense vector representations. After information processing of the preceding…

Computation and Language · Computer Science · 2016-10-14 · Yunchuan Chen, Lili Mou, Yan Xu, Ge Li, Zhi Jin

Traditional neural word embeddings are usually dependent on a richer diversity of vocabulary. However, the language models recline to cover major vocabularies via the word embedding parameters, in particular, for multilingual language…

Computation and Language · Computer Science · 2023-08-21 · Amit Kumar Jaiswal, Haiming Liu

The workflow of pretraining and fine-tuning has emerged as a popular paradigm for solving various NLP and V&L (Vision-and-Language) downstream tasks. With the capacity of pretrained models growing rapidly, how to perform parameter-efficient…

Computation and Language · Computer Science · 2022-03-09 · Zhengkun Zhang, Wenya Guo, Xiaojun Meng, Yasheng Wang, Yadao Wang, Xin Jiang, Qun Liu, Zhenglu Yang

Mixture of Softmaxes (MoS) has been shown to be effective at addressing the expressiveness limitation of Softmax-based models. Despite the known advantage, MoS is practically sealed by its large consumption of memory and computational time…

Computation and Language · Computer Science · 2019-06-27 · Xiang Kong, Qizhe Xie, Zihang Dai, Eduard Hovy

We present ComplexityNet, a streamlined language model designed for assessing task complexity. This model predicts the likelihood of accurate output by various language models, each with different capabilities. Our initial application of…

Computation and Language · Computer Science · 2024-10-16 · Henry Bae, Aghyad Deeb, Alex Fleury, Kehang Zhu

Model compression is essential for serving large deep neural nets on devices with limited resources or applications that require real-time responses. As a case study, a state-of-the-art neural language model usually consists of one or more…

Computation and Language · Computer Science · 2018-06-20 · Patrick H. Chen, Si Si, Yang Li, Ciprian Chelba, Cho-jui Hsieh

Language models are increasingly used not only as standalone predictors but also as components in larger inference systems, from test-time reasoning to multi-model collaboration. We study language model networks, where pre-trained language…

Artificial Intelligence · Computer Science · 2026-05-14 · Shiguang Wu, Yaqing Wang, Quanming Yao

Transformer-based Language Models have become ubiquitous in Natural Language Processing (NLP) due to their impressive performance on various tasks. However, expensive training as well as inference remains a significant impediment to their…

Machine Learning · Computer Science · 2024-06-06 · Amit Dhurandhar, Tejaswini Pedapati, Ronny Luss, Soham Dan, Aurelie Lozano, Payel Das, Georgios Kollias

Pre-trained large-scale language models have increasingly demonstrated high accuracy on many natural language processing (NLP) tasks. However, the limited weight storage and computational speed on hardware platforms have impeded the…

Computation and Language · Computer Science · 2020-10-23 · Wei Niu, Zhenglun Kong, Geng Yuan, Weiwen Jiang, Jiexiong Guan, Caiwen Ding, Pu Zhao, Sijia Liu, Bin Ren, Yanzhi Wang

Transformers have transformed the field of natural language processing. This performance is largely attributed to the use of stacked self-attention layers, each of which consists of matrix multiplies as well as softmax operations. As a…

Hardware Architecture · Computer Science · 2021-03-18 · Jacob R. Stevens, Rangharajan Venkatesan, Steve Dai, Brucek Khailany, Anand Raghunathan