Related papers: Deep contextualized word representations

Dynamic Contextualized Word Embeddings

Static word embeddings that represent words by a single vector cannot capture the variability of word meaning in different linguistic and extralinguistic contexts. Building on prior work on contextualized and dynamic word embeddings, we…

Computation and Language · Computer Science 2021-06-09 Valentin Hofmann, Janet B. Pierrehumbert, Hinrich Schütze

Dissecting Contextual Word Embeddings: Architecture and Representation

Contextual word representations derived from pre-trained bidirectional language models (biLMs) have recently been shown to provide significant improvements to the state of the art for a wide range of NLP tasks. However, many questions…

Computation and Language · Computer Science 2018-10-01 Matthew E. Peters, Mark Neumann, Luke Zettlemoyer, Wen-tau Yih

Deriving Word Vectors from Contextualized Language Models using Topic-Aware Mention Selection

One of the long-standing challenges in lexical semantics consists in learning representations of words which reflect their semantic properties. The remarkable success of word embeddings for this purpose suggests that high-quality…

Computation and Language · Computer Science 2021-06-16 Yixiao Wang, Zied Bouraoui, Luis Espinosa Anke, Steven Schockaert

Learning to Represent Words in Context with Multilingual Supervision

We present a neural network architecture based on bidirectional LSTMs to compute representations of words in the sentential contexts. These context-sensitive word representations are suitable for, e.g., distinguishing different word senses…

Computation and Language · Computer Science 2015-11-23 Kazuya Kawakami, Chris Dyer

Accurate Word Representations with Universal Visual Guidance

Word representation is a fundamental component in neural language understanding models. Recently, pre-trained language models (PrLMs) offer a new performant method of contextualized word representations by leveraging the sequence-level…

Computation and Language · Computer Science 2021-01-01 Zhuosheng Zhang, Haojie Yu, Hai Zhao, Rui Wang, Masao Utiyama

Cross-topic distributional semantic representations via unsupervised mappings

In traditional Distributional Semantic Models (DSMs) the multiple senses of a polysemous word are conflated into a single vector space representation. In this work, we propose a DSM that learns multiple distributional representations of a…

Computation and Language · Computer Science 2019-04-12 Eleftheria Briakou, Nikos Athanasiou, Alexandros Potamianos

A Survey of Knowledge Enhanced Pre-trained Models

Pre-trained language models learn informative word representations on a large-scale text corpus through self-supervised learning, which has achieved promising performance in fields of natural language processing (NLP) after fine-tuning.…

Computation and Language · Computer Science 2023-10-31 Jian Yang, Xinyu Hu, Gang Xiao, Yulong Shen

Probing Context Localization of Polysemous Words in Pre-trained Language Model Sub-Layers

In the era of high performing Large Language Models, researchers have widely acknowledged that contextual word representations are one of the key drivers in achieving top performances in downstream tasks. In this work, we investigate the…

Computation and Language · Computer Science 2024-09-24 Soniya Vijayakumar, Josef van Genabith, Simon Ostermann

Learned in Translation: Contextualized Word Vectors

Computer vision has benefited from initializing multiple deep layers with weights pretrained on large supervised training sets like ImageNet. Natural language processing (NLP) typically sees initialization of only the lowest layer of deep…

Computation and Language · Computer Science 2018-06-21 Bryan McCann, James Bradbury, Caiming Xiong, Richard Socher

PolyLM: Learning about Polysemy through Language Modeling

To avoid the "meaning conflation deficiency" of word embeddings, a number of models have aimed to embed individual word senses. These methods at one time performed well on tasks such as word sense induction (WSI), but they have since been…

Computation and Language · Computer Science 2021-01-27 Alan Ansell, Felipe Bravo-Marquez, Bernhard Pfahringer

LMMS Reloaded: Transformer-based Sense Embeddings for Disambiguation and Beyond

Distributional semantics based on neural approaches is a cornerstone of Natural Language Processing, with surprising connections to human meaning representation as well. Recent Transformer-based Language Models have proven capable of…

Computation and Language · Computer Science 2022-04-04 Daniel Loureiro, Alípio Mário Jorge, Jose Camacho-Collados

Improving Topic Models with Latent Feature Word Representations

Probabilistic topic models are widely used to discover latent topics in document collections, while latent feature vector representations of words have been used to obtain high performance in many NLP tasks. In this paper, we extend two…

Computation and Language · Computer Science 2018-10-16 Dat Quoc Nguyen, Richard Billingsley, Lan Du, Mark Johnson

Universal Sentence Representation Learning with Conditional Masked Language Model

This paper presents a novel training method, Conditional Masked Language Modeling (CMLM), to effectively learn sentence representations on large scale unlabeled corpora. CMLM integrates sentence representation learning into MLM training by…

Computation and Language · Computer Science 2021-09-13 Ziyi Yang, Yinfei Yang, Daniel Cer, Jax Law, Eric Darve

Language Modelling Makes Sense: Propagating Representations through WordNet for Full-Coverage Word Sense Disambiguation

Contextual embeddings represent a new generation of semantic representations learned from Neural Language Modelling (NLM) that addresses the issue of meaning conflation hampering traditional word embeddings. In this work, we show that…

Computation and Language · Computer Science 2019-06-25 Daniel Loureiro, Alipio Jorge

Lost in Context? On the Sense-wise Variance of Contextualized Word Embeddings

Contextualized word embeddings in language models have given much advance to NLP. Intuitively, sentential information is integrated into the representation of words, which can help model polysemy. However, context sensitivity also leads to…

Computation and Language · Computer Science 2022-08-23 Yile Wang, Yue Zhang

Where exactly does contextualization in a PLM happen?

Pre-trained Language Models (PLMs) have shown to be consistently successful in a plethora of NLP tasks due to their ability to learn contextualized representations of words (Ethayarajh, 2019). BERT (Devlin et al., 2018), ELMo (Peters et…

Computation and Language · Computer Science 2023-12-12 Soniya Vijayakumar, Tanja Bäumel, Simon Ostermann, Josef van Genabith

Harnessing the Intrinsic Knowledge of Pretrained Language Models for Challenging Text Classification Settings

Text classification is crucial for applications such as sentiment analysis and toxic text filtering, but it still faces challenges due to the complexity and ambiguity of natural language. Recent advancements in deep learning, particularly…

Computation and Language · Computer Science 2024-08-29 Lingyu Gao

Densely Connected Bidirectional LSTM with Applications to Sentence Classification

Deep neural networks have recently been shown to achieve highly competitive performance in many computer vision tasks due to their abilities of exploring in a much larger hypothesis space. However, since most deep architectures like stacked…

Computation and Language · Computer Science 2018-02-06 Zixiang Ding, Rui Xia, Jianfei Yu, Xiang Li, Jian Yang

Multiple Word Embeddings for Increased Diversity of Representation

Most state-of-the-art models in natural language processing (NLP) are neural models built on top of large, pre-trained, contextual language models that generate representations of words in context and are fine-tuned for the task at hand.…

Computation and Language · Computer Science 2020-10-13 Brian Lester, Daniel Pressel, Amy Hemmeter, Sagnik Ray Choudhury, Srinivas Bangalore

Multilingual Distributed Representations without Word Alignment

Distributed representations of meaning are a natural way to encode covariance relationships between words and phrases in NLP. By overcoming data sparsity problems, as well as providing information about semantic relatedness which is not…

Computation and Language · Computer Science 2014-03-21 Karl Moritz Hermann, Phil Blunsom