Related papers: Contextually Propagated Term Weights for Document …

Deriving Word Vectors from Contextualized Language Models using Topic-Aware Mention Selection

One of the long-standing challenges in lexical semantics consists in learning representations of words which reflect their semantic properties. The remarkable success of word embeddings for this purpose suggests that high-quality…

Computation and Language · Computer Science · 2021-06-16 · Yixiao Wang, Zied Bouraoui, Luis Espinosa Anke, Steven Schockaert

Rehabilitation of Count-based Models for Word Vector Representations

Recent works on word representations mostly rely on predictive models. Distributed word representations (aka word embeddings) are trained to optimally predict the contexts in which the corresponding words tend to appear. Such models have…

Computation and Language · Computer Science · 2015-04-10 · Rémi Lebret, Ronan Collobert

Category Enhanced Word Embedding

Distributed word representations have been demonstrated to be effective in capturing semantic and syntactic regularities. Unsupervised representation learning from large unlabeled corpora can learn similar representations for those words…

Computation and Language · Computer Science · 2015-12-01 · Chunting Zhou, Chonglin Sun, Zhiyuan Liu, Francis C. M. Lau

Balancing the composition of word embeddings across heterogenous data sets

Word embeddings capture semantic relationships based on contextual information and are the basis for a wide variety of natural language processing applications. Notably these relationships are solely learned from the data and subsequently…

Computation and Language · Computer Science · 2020-01-15 · Stephanie Brandl, David Lassner, Maximilian Alber

Definition Modeling: Learning to define word embeddings in natural language

Distributed representations of words have been shown to capture lexical semantics, as demonstrated by their effectiveness in word similarity and analogical relation tasks. But, these tasks only evaluate lexical semantics indirectly. In this…

Computation and Language · Computer Science · 2016-12-02 · Thanapon Noraset, Chen Liang, Larry Birnbaum, Doug Downey

Learning to Embed Words in Context for Syntactic Tasks

We present models for embedding words in the context of surrounding words. Such models, which we refer to as token embeddings, represent the characteristics of a word that are specific to a given context, such as word sense, syntactic…

Computation and Language · Computer Science · 2017-06-13 · Lifu Tu, Kevin Gimpel, Karen Livescu

Utility of General and Specific Word Embeddings for Classifying Translational Stages of Research

Conventional text classification models make a bag-of-words assumption reducing text into word occurrence counts per document. Recent algorithms such as word2vec are capable of learning semantic meaning and similarity between words in an…

Computation and Language · Computer Science · 2018-07-11 · Vincent Major, Alisa Surkis, Yindalon Aphinyanaphongs

Can Network Embedding of Distributional Thesaurus be Combined with Word Vectors for Better Representation?

Distributed representations of words learned from text have proved to be successful in various natural language processing tasks in recent times. While some methods represent words as vectors computed from text using predictive model…

Computation and Language · Computer Science · 2018-02-20 · Abhik Jana, Pawan Goyal

Context Vectors are Reflections of Word Vectors in Half the Dimensions

This paper takes a step towards theoretical analysis of the relationship between word embeddings and context embeddings in models such as word2vec. We start from basic probabilistic assumptions on the nature of word vectors, context…

Machine Learning · Statistics · 2019-02-27 · Zhenisbek Assylbekov, Rustem Takhanov

A Mixture Model for Learning Multi-Sense Word Embeddings

Word embeddings are now a standard technique for inducing meaning representations for words. For getting good representations, it is important to take into account different senses of a word. In this paper, we propose a mixture model for…

Computation and Language · Computer Science · 2017-08-14 · Dai Quoc Nguyen, Dat Quoc Nguyen, Ashutosh Modi, Stefan Thater, Manfred Pinkal

Novel Ranking-Based Lexical Similarity Measure for Word Embedding

Distributional semantics models derive word space from linguistic items in context. Meaning is obtained by defining a distance measure between vectors corresponding to lexical entities. Such vectors present several problems. In this paper…

Computation and Language · Computer Science · 2017-12-25 · Jakub Dutkiewicz, Czesław Jędrzejek

Towards a Theoretical Understanding of Word and Relation Representation

Representing words by vectors, or embeddings, enables computational reasoning and is foundational to automating natural language tasks. For example, if word embeddings of similar words contain similar values, word similarity can be readily…

Computation and Language · Computer Science · 2022-02-02 · Carl Allen

Novel Word Embedding and Translation-based Language Modeling for Extractive Speech Summarization

Word embedding methods revolve around learning continuous distributed vector representations of words with neural networks, which can capture semantic and/or syntactic cues, and in turn be used to induce similarity measures among words,…

Computation and Language · Computer Science · 2016-07-25 · Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang, Hsin-Hsi Chen

Embedding Words and Senses Together via Joint Knowledge-Enhanced Training

Word embeddings are widely used in Natural Language Processing, mainly due to their success in capturing semantic information from massive corpora. However, their creation process does not allow the different meanings of a word to be…

Computation and Language · Computer Science · 2017-06-22 · Massimiliano Mancini, Jose Camacho-Collados, Ignacio Iacobacci, Roberto Navigli

Representing Mixtures of Word Embeddings with Mixtures of Topic Embeddings

A topic model is often formulated as a generative model that explains how each word of a document is generated given a set of topics and document-specific topic proportions. It is focused on capturing the word co-occurrences in a document…

Machine Learning · Computer Science · 2022-03-16 · Dongsheng Wang, Dandan Guo, He Zhao, Huangjie Zheng, Korawat Tanwisuth, Bo Chen, Mingyuan Zhou

Machine Translation with Cross-lingual Word Embeddings

Learning word embeddings using distributional information is a task that has been studied by many researchers, and a lot of studies are reported in the literature. On the contrary, less studies were done for the case of multiple languages.…

Computation and Language · Computer Science · 2020-04-15 · Marco Berlot, Evan Kaplan

From Image to Text Classification: A Novel Approach based on Clustering Word Embeddings

In this paper, we propose a novel approach for text classification based on clustering word embeddings, inspired by the bag of visual words model, which is widely used in computer vision. After each word in a collection of documents is…

Computation and Language · Computer Science · 2017-07-26 · Andrei M. Butnaru, Radu Tudor Ionescu

Imparting Interpretability to Word Embeddings while Preserving Semantic Structure

As an ubiquitous method in natural language processing, word embeddings are extensively employed to map semantic properties of words into a dense vector representation. They capture semantic and syntactic relations among words but the…

Computation and Language · Computer Science · 2020-07-03 · Lutfi Kerem Senel, Ihsan Utlu, Furkan Şahinuç, Haldun M. Ozaktas, Aykut Koç

Learning Word Embeddings from Intrinsic and Extrinsic Views

While word embeddings are currently predominant for natural language processing, most of existing models learn them solely from their contexts. However, these context-based word embeddings are limited since not all words' meaning can be…

Computation and Language · Computer Science · 2016-08-23 · Jifan Chen, Kan Chen, Xipeng Qiu, Qi Zhang, Xuanjing Huang, Zheng Zhang

Generative Topic Embedding: a Continuous Representation of Documents (Extended Version with Proofs)

Word embedding maps words into a low-dimensional continuous embedding space by exploiting the local word collocation patterns in a small context window. On the other hand, topic modeling maps documents onto a low-dimensional topic space, by…

Computation and Language · Computer Science · 2016-08-09 · Shaohua Li, Tat-Seng Chua, Jun Zhu, Chunyan Miao