English
Related papers

Related papers: Tensorized Embedding Layers for Efficient Model Co…

200 papers

The transformer architecture has revolutionized Natural Language Processing (NLP) and other machine-learning tasks, due to its unprecedented accuracy. However, their extensive memory and parameter requirements often hinder their practical…

Computation and Language · Computer Science 2023-11-01 Subhadra Vadlamannati , Ryan Solgi

High-dimensional token embeddings underpin Large Language Models (LLMs), as they can capture subtle semantic information and significantly enhance the modelling of complex language patterns. However, this high dimensionality also introduces…

Computation and Language · Computer Science 2024-10-07 Mingxue Xu , Yao Lei Xu , Danilo P. Mandic

Deep neural networks currently demonstrate state-of-the-art performance in several domains. At the same time, models of this class are very demanding in terms of computational resources. In particular, a large amount of memory is required…

Machine Learning · Computer Science 2015-12-22 Alexander Novikov , Dmitry Podoprikhin , Anton Osokin , Dmitry Vetrov

Deep learning models have become state of the art for natural language processing (NLP) tasks, however deploying these models in production system poses significant memory constraints. Existing compression methods are either lossy or…

Machine Learning · Computer Science 2018-11-05 Anish Acharya , Rahul Goel , Angeliki Metallinou , Inderjit Dhillon

Embedding layers in transformer-based NLP models typically account for the largest share of model parameters, scaling with vocabulary size but not yielding performance gains proportional to scale. We propose an alternative approach in which…

Computation and Language · Computer Science 2025-05-06 Henry Ndubuaku , Mouad Talhi

Small Language Models (SLMs, or on-device LMs) have significantly fewer parameters than Large Language Models (LLMs). They are typically deployed on low-end devices, like mobile phones and single-board computers. Unlike LLMs, which rely on…

Computation and Language · Computer Science 2025-06-17 Mingxue Xu , Yao Lei Xu , Danilo P. Mandic

Embedding matrices are key components in neural natural language processing (NLP) models that are responsible to provide numerical representations of input tokens.\footnote{In this paper words and subwords are referred to as \textit{tokens}…

Computation and Language · Computer Science 2022-04-19 Krtin Kumar , Peyman Passban , Mehdi Rezagholizadeh , Yiu Sing Lau , Qun Liu

In this paper we review basic and emerging models and associated algorithms for large-scale tensor networks, especially Tensor Train (TT) decompositions using novel mathematical and graphical representations. We discus the concept of…

Numerical Analysis · Computer Science 2014-08-25 Andrzej Cichocki

The Recurrent Neural Networks and their variants have shown promising performances in sequence modeling tasks such as Natural Language Processing. These models, however, turn out to be impractical and difficult to train when exposed to very…

Computer Vision and Pattern Recognition · Computer Science 2017-07-07 Yinchong Yang , Denis Krompass , Volker Tresp

We study the topmost weight matrix of neural network language models. We show that this matrix constitutes a valid word embedding. When training language models, we recommend tying the input embedding and this output embedding. We analyze…

Computation and Language · Computer Science 2017-02-22 Ofir Press , Lior Wolf

Word-embeddings are vital components of Natural Language Processing (NLP) models and have been extensively explored. However, they consume a lot of memory which poses a challenge for edge deployment. Embedding matrices, typically, contain…

Computation and Language · Computer Science 2020-11-12 Vasileios Lioutas , Ahmad Rashid , Krtin Kumar , Md Akmal Haidar , Mehdi Rezagholizadeh

The adoption of Transformer-based models in natural language processing (NLP) has led to great success using a massive number of parameters. However, due to deployment constraints in edge devices, there has been a rising interest in the…

Computation and Language · Computer Science 2021-08-04 Klaudia Bałazy , Mohammadreza Banaei , Rémi Lebret , Jacek Tabor , Karl Aberer

Neural networks are among the state-of-the-art techniques for language modeling. Existing neural language models typically map discrete words to distributed, dense vector representations. After information processing of the preceding…

Computation and Language · Computer Science 2016-10-14 Yunchuan Chen , Lili Mou , Yan Xu , Ge Li , Zhi Jin

The memory capacity of embedding tables in deep learning recommendation models (DLRMs) is increasing dramatically from tens of GBs to TBs across the industry. Given the fast growth in DLRMs, novel solutions are urgently needed, in order to…

Machine Learning · Computer Science 2021-01-29 Chunxing Yin , Bilge Acun , Xing Liu , Carole-Jean Wu

Natural language processing (NLP) models often require a massive number of parameters for word embeddings, resulting in a large storage or memory footprint. Deploying neural NLP models to mobile devices requires compressing the word…

Computation and Language · Computer Science 2017-11-20 Raphael Shu , Hideki Nakayama

In this paper, we consider several compression techniques for the language modeling problem based on recurrent neural networks (RNNs). It is known that conventional RNNs, e.g, LSTM-based networks in language modeling, are characterized with…

Machine Learning · Statistics 2019-04-09 Artem M. Grachev , Dmitry I. Ignatov , Andrey V. Savchenko

We first present our work in machine translation, during which we used aligned sentences to train a neural network to embed n-grams of different languages into an $d$-dimensional space, such that n-grams that are the translation of each…

Machine Learning · Computer Science 2011-05-17 Etter Vincent

Modern neural networks have revolutionized the fields of computer vision (CV) and Natural Language Processing (NLP). They are widely used for solving complex CV tasks and NLP tasks such as image classification, image generation, and machine…

Machine Learning · Computer Science 2023-07-25 Xingyi Liu , Keshab K. Parhi

Recurrent neural networks have proved to be an effective method for statistical language modeling. However, in practice their memory and run-time complexity are usually too large to be implemented in real-time offline mobile applications.…

Computation and Language · Computer Science 2019-04-09 Artem M. Grachev , Dmitry I. Ignatov , Andrey V. Savchenko

Tensorizing a neural network involves reshaping some or all of its dense weight matrices into higher-order tensors and approximating them using low-rank tensor network decompositions. This technique has shown promise as a model compression…

Machine Learning · Computer Science 2025-05-27 Safa Hamreras , Sukhbinder Singh , Román Orús
‹ Prev 1 2 3 10 Next ›