Related papers: Document Informed Neural Autoregressive Topic Mode…

Document Informed Neural Autoregressive Topic Models with Distributional Prior

We address two challenges in topic models: (1) Context information around words helps in determining their actual meaning, e.g., "networks" used in the contexts "artificial neural networks" vs. "biological neuron networks". Generative topic…

Computation and Language · Computer Science 2019-01-16 Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze

Neural Dynamic Focused Topic Model

Topic models and all their variants analyse text by learning meaningful representations through word co-occurrences. As pointed out by Williamson et al. (2010), such models implicitly assume that the probability of a topic to be active and…

Computation and Language · Computer Science 2023-01-27 Kostadin Cvejoski, Ramsés J. Sánchez, César Ojeda

Pre-training is a Hot Topic: Contextualized Document Embeddings Improve Topic Coherence

Topic models extract groups of words from documents, whose interpretation as a topic hopefully allows for a better understanding of the data. However, the resulting word groups are often not coherent, making them harder to interpret.…

Computation and Language · Computer Science 2021-06-18 Federico Bianchi, Silvia Terragni, Dirk Hovy

Explainable and Discourse Topic-aware Neural Language Understanding

Marrying topic models and language models exposes language understanding to a broader source of document-level context beyond sentences via topics. While introducing topical semantics in language models, existing approaches incorporate…

Computation and Language · Computer Science 2023-06-28 Yatin Chaudhary, Hinrich Schütze, Pankaj Gupta

Topic Modeling based on Keywords and Context

Current topic models often suffer from discovering topics not matching human intuition, unnatural switching of topics within documents and high computational demands. We address these concerns by proposing a topic model and an inference…

Computation and Language · Computer Science 2018-02-06 Johannes Schneider

Keyword-based Topic Modeling and Keyword Selection

Certain types of documents, such as tweets, are collected by specifying a set of keywords. As topics of interest change with time, it is beneficial to adjust keywords dynamically. The challenge is that these need to be specified ahead of…

Machine Learning · Statistics 2020-01-23 Xingyu Wang, Lida Zhang, Diego Klabjan

Topically Driven Neural Language Model

Language models are typically applied at the sentence level, without access to the broader document context. We present a neural language model that incorporates document context in the form of a topic model-like architecture, thus…

Computation and Language · Computer Science 2017-10-16 Jey Han Lau, Timothy Baldwin, Trevor Cohn

Generalized Topic Modeling

Recently there has been significant activity in developing algorithms with provable guarantees for topic modeling. In standard topic models, a topic (such as sports, business, or politics) is viewed as a probability distribution $\vec a_i$…

Machine Learning · Computer Science 2016-11-07 Avrim Blum, Nika Haghtalab

Latent Topic Models for Hypertext

Latent topic models have been successfully applied as an unsupervised topic discovery technique in large document collections. With the proliferation of hypertext document collections such as the Internet, there has also been great interest…

Information Retrieval · Computer Science 2012-06-18 Amit Gruber, Michal Rosen-Zvi, Yair Weiss

A Neural Generative Model for Joint Learning Topics and Topic-Specific Word Embeddings

We propose a novel generative model to explore both local and global context for jointly learning topics and topic-specific word embeddings. In particular, we assume that global latent topics are shared across documents, a word is generated…

Computation and Language · Computer Science 2020-08-12 Lixing Zhu, Yulan He, Deyu Zhou

Distraction-Based Neural Networks for Document Summarization

Distributed representation learned with neural networks has recently been shown to be effective in modeling natural languages at fine granularities such as words, phrases, and even sentences. Whether and how such an approach can be extended to…

Computation and Language · Computer Science 2016-10-27 Qian Chen, Xiaodan Zhu, Zhenhua Ling, Si Wei, Hui Jiang

Topic Modelling Meets Deep Neural Networks: A Survey

Topic modelling has been a successful technique for text analysis for almost twenty years. When topic modelling met deep neural networks, there emerged a new and increasingly popular research area, neural topic models, with over a hundred…

Machine Learning · Computer Science 2021-03-02 He Zhao, Dinh Phung, Viet Huynh, Yuan Jin, Lan Du, Wray Buntine

Deeper Text Understanding for IR with Contextual Neural Language Modeling

Neural networks provide new possibilities to automatically learn complex language patterns and query-document relations. Neural IR models have achieved promising results in learning query-document relevance patterns, but few explorations…

Information Retrieval · Computer Science 2019-05-23 Zhuyun Dai, Jamie Callan

Neural Models for Documents with Metadata

Most real-world document collections involve various types of metadata, such as author, source, and date, and yet the most commonly-used approaches to modeling text corpora ignore this information. While specialized models have been…

Machine Learning · Statistics 2018-10-25 Dallas Card, Chenhao Tan, Noah A. Smith

Improving Topic Models with Latent Feature Word Representations

Probabilistic topic models are widely used to discover latent topics in document collections, while latent feature vector representations of words have been used to obtain high performance in many NLP tasks. In this paper, we extend two…

Computation and Language · Computer Science 2018-10-16 Dat Quoc Nguyen, Richard Billingsley, Lan Du, Mark Johnson

textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior

We address two challenges of probabilistic topic modelling in order to better estimate the probability of a word in a given context, i.e., P(word|context): (1) No Language Structure in Context: Probabilistic topic models ignore word order…

Computation and Language · Computer Science 2019-02-26 Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze

Recurrent Hierarchical Topic-Guided RNN for Language Generation

To simultaneously capture syntax and global semantics from a text corpus, we propose a new larger-context recurrent neural network (RNN) based language model, which extracts recurrent hierarchical semantic structure via a dynamic deep topic…

Computation and Language · Computer Science 2020-06-30 Dandan Guo, Bo Chen, Ruiying Lu, Mingyuan Zhou

A Generative Model of Words and Relationships from Multiple Sources

Neural language models are a powerful tool to embed words into semantic vector spaces. However, learning such models generally relies on the availability of abundant and diverse training examples. In highly specialised domains this…

Computation and Language · Computer Science 2015-12-04 Stephanie L. Hyland, Theofanis Karaletsos, Gunnar Rätsch

Discovering Discrete Latent Topics with Neural Variational Inference

Topic models have been widely explored as probabilistic generative models of documents. Traditional inference methods have sought closed-form derivations for updating the models; however, as the expressiveness of these models grows, so does…

Computation and Language · Computer Science 2018-05-23 Yishu Miao, Edward Grefenstette, Phil Blunsom

Topic Modeling in Embedding Spaces

Topic modeling analyzes documents to learn meaningful patterns of words. However, existing topic models fail to learn interpretable topics when working with large and heavy-tailed vocabularies. To this end, we develop the Embedded Topic…

Information Retrieval · Computer Science 2019-07-12 Adji B. Dieng, Francisco J. R. Ruiz, David M. Blei