Related papers: Backpack Language Models

200 papers

The Backpack is a Transformer alternative shown to improve interpretability in English language modeling by decomposing predictions into a weighted sum of token sense components. However, Backpacks' reliance on token-defined meaning raises…

Computation and Language · Computer Science · 2023-10-20 · Hao Sun, John Hewitt

Understanding how Transformer-based Language Models (LMs) learn and recall information is a key goal of the deep learning community. Recent interpretability methods project weights and hidden states obtained from the forward pass to the…

Computation and Language · Computer Science · 2024-02-21 · Shahar Katz, Yonatan Belinkov, Mor Geva, Lior Wolf

Language models based on the Transformer architecture achieve excellent results in many language-related tasks, such as text classification or sentiment analysis. However, despite the architecture of these models being well-defined, little…

Computation and Language · Computer Science · 2025-04-14 · Miguel López-Otal, Jorge Gracia, Jordi Bernad, Carlos Bobed, Lucía Pitarch-Ballesteros, Emma Anglés-Herrero

The presence of social biases in large language models (LLMs) has become a significant concern in AI research. These biases, often embedded in training data, can perpetuate harmful stereotypes and distort decision-making processes. When…

Information Retrieval · Computer Science · 2025-11-04 · Amirabbas Afzali, Amirreza Velae, Iman Ahmadi, Mohammad Aliannejadi

Word embeddings are widely used in Natural Language Processing, mainly due to their success in capturing semantic information from massive corpora. However, their creation process does not allow the different meanings of a word to be…

Computation and Language · Computer Science · 2017-06-22 · Massimiliano Mancini, Jose Camacho-Collados, Ignacio Iacobacci, Roberto Navigli

Distributional semantics based on neural approaches is a cornerstone of Natural Language Processing, with surprising connections to human meaning representation as well. Recent Transformer-based Language Models have proven capable of…

Computation and Language · Computer Science · 2022-04-04 · Daniel Loureiro, Alípio Mário Jorge, Jose Camacho-Collados

Although it is known that transformer language models (LMs) pass features from early layers to later layers, it is not well understood how this information is represented and routed by the model. We analyze a mechanism used in two LMs to…

Computation and Language · Computer Science · 2025-05-12 · Jack Merullo, Carsten Eickhoff, Ellie Pavlick

We introduce a neural machine translation model that views the input and output sentences as sequences of characters rather than words. Since word-level information provides a crucial source of bias, our input model composes representations…

Computation and Language · Computer Science · 2015-11-17 · Wang Ling, Isabel Trancoso, Chris Dyer, Alan W Black

We approach the problem of generalizing pre-trained word embeddings beyond fixed-size vocabularies without using additional contextual information. We propose a subword-level word vector generation model that views words as bags of…

Computation and Language · Computer Science · 2018-09-13 · Jinman Zhao, Sidharth Mudgal, Yingyu Liang

We approach multilinguality as sense adaptation: aligning latent meaning representations across languages rather than relying solely on shared parameters and scale. In this paper, we introduce SENse-based Symmetric Interlingual Alignment…

Computation and Language · Computer Science · 2026-01-16 · Jan Christian Blaise Cruz, David Ifeoluwa Adelani, Alham Fikri Aji

Transformer has demonstrated its great power to learn contextual word representations for multiple languages in a single model. To process multilingual sentences in the model, a learnable vector is usually assigned to each language, which…

Computation and Language · Computer Science · 2021-02-17 · Shengjie Luo, Kaiyuan Gao, Shuxin Zheng, Guolin Ke, Di He, Liwei Wang, Tie-Yan Liu

Neural word representations have proven useful in Natural Language Processing (NLP) tasks due to their ability to efficiently model complex semantic and syntactic word relationships. However, most techniques model only one representation…

Computation and Language · Computer Science · 2015-11-23 · Andrew Trask, Phil Michalak, John Liu

Vector representation of sentences is important for many text processing tasks that involve clustering, classifying, or ranking sentences. Recently, distributed representation of sentences learned by neural models from unlabeled data has…

Computation and Language · Computer Science · 2016-10-27 · Tanay Kumar Saha, Shafiq Joty, Naeemul Hassan, Mohammad Al Hasan

Neural language models are widely used; however, their model parameters often need to be adapted to the specific domains and tasks of an application, which is time- and resource-consuming. Thus, adapters have recently been introduced as a…

Artificial Intelligence · Computer Science · 2022-08-18 · Rita Sevastjanova, Eren Cakmak, Shauli Ravfogel, Ryan Cotterell, Mennatallah El-Assady

Semantic sentence embedding models encode natural language sentences into vectors, such that closeness in embedding space indicates closeness in the semantics between the sentences. Bilingual data offers a useful signal for learning such…

Computation and Language · Computer Science · 2020-11-20 · John Wieting, Graham Neubig, Taylor Berg-Kirkpatrick

The neural architectures of language models are becoming increasingly complex, especially that of Transformers, based on the attention mechanism. Although their application to numerous natural language processing tasks has proven to be very…

Computation and Language · Computer Science · 2023-12-04 · Pablo Gamallo

Language models typically need to be trained or finetuned in order to acquire new knowledge, which involves updating their weights. We instead envision language models that can simply read and memorize new data at inference time, thus…

Machine Learning · Computer Science · 2022-03-18 · Yuhuai Wu, Markus N. Rabe, DeLesley Hutchins, Christian Szegedy

Word embeddings predict a word from its neighbours by learning small, dense embedding vectors. In practice, this prediction corresponds to a semantic score given to the predicted word (or term weight). We present a novel model that, given a…

Information Retrieval · Computer Science · 2019-06-04 · Casper Hansen, Christian Hansen, Stephen Alstrup, Jakob Grue Simonsen, Christina Lioma

Natural languages are believed to be (mildly) context-sensitive. Despite underpinning remarkably capable large language models, transformers are unable to model many context-free language tasks. In an attempt to address this limitation in…

Computation and Language · Computer Science · 2024-05-15 · Jiaoda Li, Jennifer C. White, Mrinmaya Sachan, Ryan Cotterell

Learning vectors that capture the meaning of concepts remains a fundamental challenge. Somewhat surprisingly, perhaps, pre-trained language models have thus far only enabled modest improvements to the quality of such concept embeddings.…

Computation and Language · Computer Science · 2023-05-18 · Na Li, Hanane Kteich, Zied Bouraoui, Steven Schockaert