Related papers: Learning Topic-Sensitive Word Representations

Improving Topic Models with Latent Feature Word Representations

Probabilistic topic models are widely used to discover latent topics in document collections, while latent feature vector representations of words have been used to obtain high performance in many NLP tasks. In this paper, we extend two…

Computation and Language · Computer Science 2018-10-16 Dat Quoc Nguyen, Richard Billingsley, Lan Du, Mark Johnson

Dirichlet belief networks for topic structure learning

Recently, considerable research effort has been devoted to developing deep architectures for topic models to learn topic structures. Although several deep models have been proposed to learn better topic proportions of documents, how to…

Information Retrieval · Computer Science 2018-11-05 He Zhao, Lan Du, Wray Buntine, Mingyuan Zhou

Cross-topic distributional semantic representations via unsupervised mappings

In traditional Distributional Semantic Models (DSMs) the multiple senses of a polysemous word are conflated into a single vector space representation. In this work, we propose a DSM that learns multiple distributional representations of a…

Computation and Language · Computer Science 2019-04-12 Eleftheria Briakou, Nikos Athanasiou, Alexandros Potamianos

Multilingual Distributed Representations without Word Alignment

Distributed representations of meaning are a natural way to encode covariance relationships between words and phrases in NLP. By overcoming data sparsity problems, as well as providing information about semantic relatedness which is not…

Computation and Language · Computer Science 2014-03-21 Karl Moritz Hermann, Phil Blunsom

Category Enhanced Word Embedding

Distributed word representations have been demonstrated to be effective in capturing semantic and syntactic regularities. Unsupervised representation learning from large unlabeled corpora can learn similar representations for those words…

Computation and Language · Computer Science 2015-12-01 Chunting Zhou, Chonglin Sun, Zhiyuan Liu, Francis C. M. Lau

Distributed Representations for Compositional Semantics

The mathematical representation of semantics is a key issue for Natural Language Processing (NLP). A lot of research has been devoted to finding ways of representing the semantics of individual words in vector spaces. Distributional…

Computation and Language · Computer Science 2014-11-13 Karl Moritz Hermann

Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec

Distributed dense word vectors have been shown to be effective at capturing token-level semantic and syntactic regularities in language, while topic models can form interpretable representations over documents. In this work, we describe…

Computation and Language · Computer Science 2016-05-09 Christopher E Moody

A Probabilistic Framework for Learning Domain Specific Hierarchical Word Embeddings

The meaning of a word often varies depending on its usage in different domains. The standard word embedding models struggle to represent this variation, as they learn a single global representation for a word. We propose a method to learn…

Computation and Language · Computer Science 2019-10-22 Lahari Poddar, Gyorgy Szarvas, Lea Frermann

Multi Sense Embeddings from Topic Models

Distributed word embeddings have yielded state-of-the-art performance in many NLP tasks, mainly due to their success in capturing useful semantic information. These representations assign only a single vector to each word whereas a large…

Machine Learning · Computer Science 2020-02-04 Shobhit Jain, Sravan Babu Bodapati, Ramesh Nallapati, Anima Anandkumar

Dynamic Topic Modeling with a Higher-Order Hypergraphical Representation

Dynamic topic modeling is widely used to analyze evolving trends in scientific literature, medical records, and social media. Traditional topic models represent each topic through a single probability vector on the multinomial simplex and…

Machine Learning · Computer Science 2026-05-28 Hanjia Gao, Hanwen Ye, Qing Nie, Annie Qu

Semantic Representations of Word Senses and Concepts

Representing the semantics of linguistic items in a machine-interpretable form has been a major goal of Natural Language Processing since its earliest days. Among the range of different linguistic items, words have attracted the most…

Computation and Language · Computer Science 2016-08-04 José Camacho-Collados, Ignacio Iacobacci, Roberto Navigli, Mohammad Taher Pilehvar

Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning

A lot of the recent success in natural language processing (NLP) has been driven by distributed vector representations of words trained on large amounts of text in an unsupervised manner. These representations are typically used as general…

Computation and Language · Computer Science 2018-04-03 Sandeep Subramanian, Adam Trischler, Yoshua Bengio, Christopher J Pal

Hierarchical Interpretation of Neural Text Classification

Recent years have witnessed increasing interests in developing interpretable models in Natural Language Processing (NLP). Most existing models aim at identifying input features such as words or phrases important for model predictions.…

Computation and Language · Computer Science 2022-08-10 Hanqi Yan, Lin Gui, Yulan He

Definition Modeling: Learning to define word embeddings in natural language

Distributed representations of words have been shown to capture lexical semantics, as demonstrated by their effectiveness in word similarity and analogical relation tasks. But, these tasks only evaluate lexical semantics indirectly. In this…

Computation and Language · Computer Science 2016-12-02 Thanapon Noraset, Chen Liang, Larry Birnbaum, Doug Downey

Searching for Discriminative Words in Multidimensional Continuous Feature Space

Word feature vectors have been proven to improve many NLP tasks. With recent advances in unsupervised learning of these feature vectors, it became possible to train it with much more data, which also resulted in better quality of learned…

Computation and Language · Computer Science 2022-11-29 Marius Sajgalik, Michal Barla, Maria Bielikova

Hierarchical Neural Language Models for Joint Representation of Streaming Documents and their Content

We consider the problem of learning distributed representations for documents in data streams. The documents are represented as low-dimensional vectors and are jointly learned with distributed vector representations of word tokens using a…

Computation and Language · Computer Science 2016-06-29 Nemanja Djuric, Hao Wu, Vladan Radosavljevic, Mihajlo Grbovic, Narayan Bhamidipati

Learning Distributed Representations of Sentences from Unlabelled Data

Unsupervised methods for learning distributed representations of words are ubiquitous in today's NLP research, but far less is known about the best ways to learn distributed phrase or sentence representations from unlabelled data. This…

Computation and Language · Computer Science 2016-02-11 Felix Hill, Kyunghyun Cho, Anna Korhonen

Top2Vec: Distributed Representations of Topics

Topic modeling is used for discovering latent semantic structure, usually referred to as topics, in a large collection of documents. The most widely used methods are Latent Dirichlet Allocation and Probabilistic Latent Semantic Analysis.…

Computation and Language · Computer Science 2020-08-24 Dimo Angelov

Non-distributional Word Vector Representations

Data-driven representation learning for words is a technique of central importance in NLP. While indisputably useful as a source of features in downstream tasks, such vectors tend to consist of uninterpretable components whose relationship…

Computation and Language · Computer Science 2015-06-18 Manaal Faruqui, Chris Dyer

Syntax-Aware Multi-Sense Word Embeddings for Deep Compositional Models of Meaning

Deep compositional models of meaning acting on distributional representations of words in order to produce vectors of larger text constituents are evolving to a popular area of NLP research. We detail a compositional distributional…

Computation and Language · Computer Science 2015-08-14 Jianpeng Cheng, Dimitri Kartsaklis